
An error occurred: The checkpoint you are trying to load has model type qwen2_vl but Transformers does not recognize this architecture. This could be because of an issue with the checkpoint, or because your version of Transformers is out of date. #114

kicks66 opened this issue Sep 23, 2024 · 1 comment



kicks66 commented Sep 23, 2024

I'm getting the following error when using the vLLM template:

An error occurred: The checkpoint you are trying to load has model type qwen2_vl but Transformers does not recognize this architecture. This could be because of an issue with the checkpoint, or because your version of Transformers is out of date.

I believe it's because the latest version of Transformers is required:

pip install git+https://github.com/huggingface/transformers accelerate

Is it possible to install this over the top?
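For example, one way to try it (a sketch only, not verified against this template; the base image name here is just a placeholder for whatever image the template actually uses) would be a small derived image that layers the newer Transformers on top:

# Hypothetical Dockerfile: FROM line is an example, substitute the template's real base image
FROM vllm/vllm-openai:latest
# Install Transformers from source plus accelerate, overriding the pinned version
RUN pip install --upgrade git+https://github.com/huggingface/transformers accelerate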

cris-almodovar commented

Try using the qwenllm/qwenvl:latest container image and a docker command similar to this:

python -m vllm.entrypoints.openai.api_server \
    --served-model-name Qwen2-VL-72B-Instruct-GPTQ-Int4 \
    --model Qwen/Qwen2-VL-72B-Instruct-GPTQ-Int4 \
    --dtype float16 \
    --gpu-memory-utilization 0.8 \
    --tensor-parallel-size 2 \
    --trust-remote-code \
    --max-model-len 8192 \
    --limit-mm-per-prompt image=5,video=1

The above works for me when I create pods (each worker has 2 x A40). Serverless endpoints don't work, sadly.
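For reference, a full container invocation along those lines might look like the following (a sketch only; the GPU flags, port mapping, and cache mount are assumptions, not taken from the comment above):

docker run --gpus all --ipc=host -p 8000:8000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    qwenllm/qwenvl:latest \
    python -m vllm.entrypoints.openai.api_server \
        --served-model-name Qwen2-VL-72B-Instruct-GPTQ-Int4 \
        --model Qwen/Qwen2-VL-72B-Instruct-GPTQ-Int4 \
        --dtype float16 --gpu-memory-utilization 0.8 \
        --tensor-parallel-size 2 --trust-remote-code \
        --max-model-len 8192 --limit-mm-per-prompt image=5,video=1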
