Transformers backend: Auto-downloaded model can not be used #3594

Open
mcd01 opened this issue Sep 18, 2024 · 0 comments
Labels: bug (Something isn't working), unconfirmed

Comments

mcd01 commented Sep 18, 2024

I am facing an issue when using the transformers backend with Hugging Face models and was hoping someone could provide additional insight. I am quite sure it is just a small thing on my end, but I have already tried quite a few combinations and none of them worked.

LocalAI version:
v2.20.1 with localai/localai:latest-gpu-nvidia-cuda-12

Environment, CPU architecture, OS, and Version:
TBD

Describe the bug
When trying to interact with the model downloaded by the backend, I get the following error:

{
  "error": {
    "code": 500,
    "message": "could not load model (no success): Unexpected err=OSError(\"Can't load the model for 'facebook/opt-125m'. If you were trying to load it from 'https://huggingface.co/models', make sure you don't have a local directory with the same name. Otherwise, make sure 'facebook/opt-125m' is the correct path to a directory containing a file named pytorch_model.bin, tf_model.h5, model.ckpt or flax_model.msgpack.\"), type(err)=<class 'OSError'>",
    "type": ""
  }
}

The model apparently downloads successfully into the models directory, and the model path is configured correctly (e.g. manually downloaded .gguf models work with the llama-cpp backend). It seems the docs on GPT with the transformers backend are out of date; at least I cannot reproduce their results.
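
For reference, the OSError in the response comes from the transformers library's model-loading path. Below is a minimal standalone sketch of that call, assuming the backend ultimately resolves the model via AutoModelForCausalLM.from_pretrained (an assumption on my side; arguments are illustrative):

# Minimal sketch of the load call behind the error above, assuming the
# transformers backend ultimately calls AutoModelForCausalLM.from_pretrained.
from transformers import AutoModelForCausalLM

# If from_pretrained is handed a local directory that exists but contains no
# weight files (pytorch_model.bin, model.safetensors, ...), it raises exactly
# the OSError quoted in the response above; pointing it at the hub id works.
model = AutoModelForCausalLM.from_pretrained("facebook/opt-125m")
print(model.config.model_type)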

To Reproduce
Work through the docs on GPT for the transformers backend and try to implement the example.

name: transformers
backend: transformers
parameters:
    model: "facebook/opt-125m"
type: AutoModelForCausalLM
quantization: bnb_4bit
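
The 500 response shown above is returned when querying LocalAI's OpenAI-compatible API with this model, for example (a sketch; host and port are assumptions about the local deployment and may need adjusting):

# Sketch of the request used to interact with the model; "transformers" must
# match the `name` field of the YAML config above.
import requests

resp = requests.post(
    "http://localhost:8080/v1/chat/completions",
    json={
        "model": "transformers",
        "messages": [{"role": "user", "content": "Hello"}],
    },
    timeout=120,
)
print(resp.status_code)   # 500
print(resp.json())        # error payload shown above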

Make sure the backend is configured and correctly loaded:

EXTRA_BACKENDS: backend/python/transformers
Preparing backend: backend/python/transformers
make: Entering directory '/build/backend/python/transformers'
bash install.sh
Initializing libbackend for transformers
virtualenv activated
activated virtualenv has been ensured
starting requirements install for /build/backend/python/transformers/requirements.txt
Audited 4 packages in 73ms
finished requirements install for /build/backend/python/transformers/requirements.txt
make: Leaving directory '/build/backend/python/transformers'

Expected behavior
As per the docs, I would have expected the model to be downloaded (which apparently happens) and then to be correctly detected, loaded, and used by the respective backend.

Logs
None

Additional context
I also carefully read this issue (and posted a detailed comment there) but it didn't solve my problem.
