feat: issue-1028, fetch models when user enters api key #3251
base: main
Conversation
```diff
- const customModelsConfig = await loadConfigModels(req);
-
- const modelConfig = { ...defaultModelsConfig, ...customModelsConfig };
+ const modelConfig = { ...(await loadDefaultModels(req)), ...(await loadConfigModels(req)) };
```
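For context, a minimal sketch of what the route handler does after this change, assuming the loader names from the diff (the import path and handler shape are illustrative, not the exact LibreChat source):

```js
// Sketch only: import path and handler shape are assumptions.
const { loadDefaultModels, loadConfigModels } = require('~/server/services/Config');

async function modelsHandler(req, res) {
  // With this change, both loaders run on every /api/models request,
  // so no cached model block is consulted (the concern raised below).
  const modelConfig = {
    ...(await loadDefaultModels(req)),
    ...(await loadConfigModels(req)),
  };
  res.json(modelConfig);
}
```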
With this change, the cache is never used for models, which will lead to slow page loads every time.
Without this change it will not work. Now what?
Right, the cache is never used when the UI sends an /api/models request, and that happens in only two cases:
- the first load of the UI (this is your worry: we do not need to re-query endpoints if the key is fixed)
- after a user key update (this is exactly the behavior we want to obtain)
The cache is still used when some other part of the code requests models via getModelsConfig().
We can say that all models retrieved with a non-user-provided key stay the same across users. In that case we could split endpoints into those using fixed keys and those using user-provided keys.
Then maybe we could cache the fixed set and the "user-provided" set separately. I guess it cannot simply be a split between default models and custom models, as both can use fixed and user-provided keys, right?
At first glance the model queries are not easy to split up this way. Probably doable, but it feels hardly worth it.
Alternatively, we could cache models per endpoint+key. Then we would need only one request: for the endpoint whose key the user just changed. That sounds best to me, but it involves changes across all endpoints/providers.
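For illustration, a minimal sketch of the per endpoint+key cache idea; `modelsCache`, `cacheKey`, and `fetchModelsFromEndpoint` are hypothetical names, not existing LibreChat APIs:

```js
const crypto = require('crypto');

const modelsCache = new Map(); // endpoint + hashed key -> model list

function cacheKey(endpoint, apiKey) {
  // Hash the key so raw credentials never sit in the cache index.
  const digest = crypto.createHash('sha256').update(apiKey ?? '').digest('hex');
  return `${endpoint}:${digest}`;
}

async function getModelsForEndpoint(endpoint, apiKey, fetchModelsFromEndpoint) {
  const key = cacheKey(endpoint, apiKey);
  if (!modelsCache.has(key)) {
    // Only the endpoint whose key just changed misses the cache;
    // every other endpoint is served from the map.
    modelsCache.set(key, await fetchModelsFromEndpoint(endpoint, apiKey));
  }
  return modelsCache.get(key);
}
```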
Looking a bit deeper, it seems we cannot cache ALL models as one block, as is currently done in ModelController, if we need to allow different users to have different sets of models.
A user key entered from the UI implies exactly that. Say we have a custom configuration with a user-provided key:
- you get a key with 5 models, you log in, enter the key, we cache N+5 models
- I get a key with 2 models, I log in, enter the key, we cache N+2 models
- you do some operation that requires model validation; if validation uses the cache, your operation will be validated against my 2 models :-(
The solution would be to cache models per url+key and, whenever we need a list of models, rebuild the whole model set (instead of using the fixed one in CONFIG_STORE). It looks like hardly any models get fetched from a remote endpoint except the ones in the custom config (and the ones loaded via getOpenAIModels, which already caches models per endpoint).
So all the changes will go into loadConfigModels (and I will remove any usage of the cache in ModelsController). I will update the PR with these changes shortly.
There is also some draft code in overrideController() that intends to update the cache; this will need to be aligned if/when that is implemented.
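A rough sketch of the "rebuild the model set from per-url+key caches" idea; the endpoint shape, `getUserKey`, and `fetchModels` are assumptions for illustration, and `getModelsForEndpoint` is the per-endpoint+key cache from the sketch above:

```js
async function buildModelConfig(endpoints, userId, getUserKey, fetchModels) {
  const modelConfig = {};
  for (const { name, baseURL, apiKey, userProvided } of endpoints) {
    // Resolve the effective key: the user's stored key for user_provided
    // endpoints, otherwise the fixed key from the YAML config.
    const key = userProvided ? await getUserKey(userId, name) : apiKey;
    modelConfig[name] = await getModelsForEndpoint(baseURL, key, fetchModels);
  }
  return modelConfig;
}
```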
Is it working without issues, @normunds-wipo?
Yes. I just merged the main branch and had to adjust for the encryption/decryption utility's return value change (from string to Promise), but otherwise we have been using this code for a couple of months without problems.
I'll try it. What's your use case for the reloading fetch?
The same: querying models from LiteLLM. Different users have different keys and potentially different sets of models, so we cannot cache all models and need to query them per user. Also, once the user enters a new key we need to reload the models corresponding to it. It seems to be doing this correctly.
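As a hedged example of that flow: LiteLLM exposes an OpenAI-compatible GET /v1/models, and the returned list depends on the key's permissions. The URL handling and error check here are illustrative:

```js
async function fetchLiteLLMModels(baseURL, userApiKey) {
  const res = await fetch(`${baseURL}/v1/models`, {
    headers: { Authorization: `Bearer ${userApiKey}` },
  });
  if (!res.ok) {
    throw new Error(`Model fetch failed: ${res.status}`);
  }
  const { data } = await res.json(); // OpenAI-style: { data: [{ id: ... }, ...] }
  return data.map((model) => model.id);
}
```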
This is a useful feature; looking forward to seeing the PR approved and merged.
Summary
The solution is intended only for custom configurations. Each time the user enters an API key from the frontend, the models are reloaded, including from the endpoint that uses the new key.
Relevant changes:
It might also work for default endpoints, since the models now get refreshed each time a key is entered, but this is not tested.
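As a rough illustration of the frontend side of this flow (the hook name, `saveKeyRequest`, and the 'models' query key are assumptions, not LibreChat's actual hooks), saving a key could invalidate the cached models query so the UI refetches /api/models:

```js
import { useQueryClient } from '@tanstack/react-query';

// Hypothetical hook: saveKeyRequest persists the key server-side.
function useSaveUserKey(saveKeyRequest) {
  const queryClient = useQueryClient();
  return async (endpoint, apiKey) => {
    await saveKeyRequest(endpoint, apiKey);
    // 'models' is an assumed query key for the /api/models response.
    await queryClient.invalidateQueries({ queryKey: ['models'] });
  };
}
```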
Change Type
user_provided API KEY value is used, fetch models for user #1028; fetch user models when user provides API_KEY
Testing
Tested just by running the code. We are only using a custom configuration.