
feat: issue-1028, fetch models when user enters api key #3251

Open · wants to merge 9 commits into main
Conversation

@normunds-wipo normunds-wipo commented Jul 3, 2024

Summary

This solution is intended only for custom configurations. Each time the user enters an API key from the frontend, the models are reloaded, including from the endpoint that uses the new key.

Relevant changes:

  • in the frontend API layer, updating the user key invalidates the cached models, causing them to be reloaded when next used (see the sketch at the end of this summary)
  • in ModelsController, the get method no longer serves from the cache; each time it is called, it retrieves the models afresh
  • in loadConfigModels, if api_key is defined as user_provided, check whether the user has a valid key and use it to fetch models

It might also work for default endpoints, since the models now get refreshed each time a key is entered, but this is not tested.
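For illustration, here is a minimal sketch of the frontend invalidation described in the first bullet, assuming a react-query data layer; the hook name, the /api/keys path, and the ['models'] query key are placeholders rather than the PR's actual identifiers.

```ts
import { useMutation, useQueryClient } from '@tanstack/react-query';

// Hypothetical mutation hook: saving a new user key invalidates the cached
// models query, so the model list is refetched with the new key on next use.
function useUpdateUserKey() {
  const queryClient = useQueryClient();
  return useMutation({
    mutationFn: ({ endpoint, apiKey }: { endpoint: string; apiKey: string }) =>
      fetch('/api/keys', {
        method: 'PUT',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({ endpoint, apiKey }),
      }),
    onSuccess: () => {
      // Invalidation marks the query stale; components using it refetch.
      queryClient.invalidateQueries({ queryKey: ['models'] });
    },
  });
}
```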

Change Type

Testing

Tested just by running the code; we are only using a custom configuration.

Checklist


  • My code adheres to this project's style guidelines
  • I have performed a self-review of my own code
  • I have commented in any complex areas of my code
  • Local unit tests pass with my changes (though tests related to tokenSplit fail with or without my changes)
@normunds-wipo normunds-wipo marked this pull request as ready for review July 3, 2024 11:44
```diff
- const customModelsConfig = await loadConfigModels(req);
- const modelConfig = { ...defaultModelsConfig, ...customModelsConfig };
+ const modelConfig = { ...(await loadDefaultModels(req)), ...(await loadConfigModels(req)) };
```
Owner

With this change, the cache is never used for models; this will lead to slow page loads every time.

Contributor Author

Without this it will not work. Now what?

The cache is never used when we get an /api/models request from the UI. Right, and that happens in only two cases:

  • first load of the UI (this is your worry: we do not need to re-query endpoints if the key is fixed)
  • after a user key update (which is exactly the behavior we want)

The cache is still used if some other part of the code requests models via getModelsConfig().

We can say that all models retrieved with a non-user-provided key stay the same between different users. In that case we could split the endpoints into those using fixed keys and those using user-defined keys.

Then maybe we could cache the fixed set and the "user-provided" set separately. I guess the split could not simply be between default models and custom models, as both can use both fixed and user-provided keys, right?

At first glance, the model queries are not easy to split up this way. Probably doable, but it hardly feels worth it.

Alternatively, we could cache models per endpoint+key. Then we would need to make only one request: for the endpoint whose key the user just changed. That sounds best to me, but it involves changes across all endpoints/providers. A sketch of the idea follows.
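A minimal sketch of that per-endpoint+key cache, assuming an in-memory Map for brevity; the names (modelCache, cacheId, fetchModels) are illustrative, not LibreChat's actual internals.

```ts
import { createHash } from 'node:crypto';

// Hypothetical in-memory store; a real deployment would use the app's
// existing cache backend, keyed per endpoint+key instead of one global block.
const modelCache = new Map<string, string[]>();

function cacheId(endpoint: string, apiKey: string): string {
  // Hash the API key so the raw secret never appears in the cache index.
  const digest = createHash('sha256').update(apiKey).digest('hex');
  return `${endpoint}:${digest}`;
}

async function getModels(
  endpoint: string,
  apiKey: string,
  fetchModels: (endpoint: string, apiKey: string) => Promise<string[]>,
): Promise<string[]> {
  const id = cacheId(endpoint, apiKey);
  const cached = modelCache.get(id);
  if (cached) return cached;

  // Only the endpoint whose key just changed misses the cache, so a key
  // update costs exactly one upstream request.
  const models = await fetchModels(endpoint, apiKey);
  modelCache.set(id, models);
  return models;
}
```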

@normunds-wipo normunds-wipo (Contributor Author) Jul 8, 2024

Looking a bit deeper, it looks like we cannot cache ALL models as one block, the way ModelsController currently does, if we need to allow different users to have different sets of models.
A user key entered from the UI implies exactly that. Say we have a custom configuration with a user-provided key:

  • you get a key with 5 models, you log in, enter the key, and we cache N+5 models
  • I get a key with 2 models, I log in, enter the key, and we cache N+2 models
  • you perform some operation that requires model validation; if the validation uses the cache, your operation will be validated against my 2 models :-(

The solution would be to cache models per URL+key and, whenever we need a list of models, rebuild the whole model set (rather than using the fixed one in CONFIG_STORE). It looks like hardly any models get fetched from a remote endpoint except the ones in the custom config (and the ones loaded via getOpenAIModels, which already caches models per endpoint).
So all the changes will go into loadConfigModels (and I will remove any use of the cache in ModelsController); I will shortly update the PR with these changes. A sketch of the intended flow follows.
There is also some draft code in overrideController() that intends to update the cache; this will need to be aligned if/whenever it is implemented.
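A hedged sketch of that loadConfigModels behavior, assuming helper signatures like getUserKey and fetchModels; these names and the CustomEndpoint shape are illustrative, not the repository's actual API.

```ts
// Illustrative type; the real custom-config endpoint shape is richer.
type CustomEndpoint = { name: string; baseURL: string; apiKey: string };

async function loadEndpointModels(
  userId: string,
  endpoint: CustomEndpoint,
  getUserKey: (userId: string, endpoint: string) => Promise<string | null>,
  fetchModels: (baseURL: string, apiKey: string) => Promise<string[]>,
): Promise<string[]> {
  // If the config marks the key as user_provided, resolve the key this
  // user entered from the key store; otherwise use the fixed key.
  const apiKey =
    endpoint.apiKey === 'user_provided'
      ? await getUserKey(userId, endpoint.name)
      : endpoint.apiKey;

  // No key yet (the user has not entered one): expose no models.
  if (!apiKey) return [];

  // Fetch with the resolved key, so each user sees exactly the models
  // their own key grants.
  return fetchModels(endpoint.baseURL, apiKey);
}
```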


mkagit commented Sep 12, 2024

Is it working without issues, @normunds-wipo?

@normunds-wipo (Contributor Author)

Yes. I just merged the main branch and had to adjust for the encryption/decryption utility's return value change (from string to Promise), but otherwise we have been using this code for a couple of months without problems.


mkagit commented Sep 12, 2024

> Yes. I just merged the main branch and had to adjust for the encryption/decryption utility's return value change (from string to Promise), but otherwise we have been using this code for a couple of months without problems.

I'll try it. What's your use case for the reloading fetch?
I was looking to fetch models from the LiteLLM virtual keys proxy server through a custom "user_provided" key.

@normunds-wipo (Contributor Author)

The same: querying models from LiteLLM. Different users have different keys and potentially different sets of models, so we cannot cache all models and need to query them per user. Also, once the user enters a key, we need to reload the models corresponding to the new key. It seems to be doing this correctly.


dansavu commented Sep 16, 2024

This is a useful feature; looking forward to seeing the PR approved and merged.

Labels: None yet · 4 participants