Unable to load *ANY BASE MODEL* in 4bit #78
Comments
According to the BitsAndBytesConfig documentation, all of the linear layers are replaced by quantized 4-bit layers, so the 16-bit projector weights cannot be loaded into them directly. A temporary solution is to initialize an unquantized model, load the projector weights, and save the whole model's weights. The saved weights can then be loaded successfully with 4-bit quantization enabled.
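A minimal sketch of that workaround, assuming a generic transformers checkpoint layout (the paths, the AutoModelForCausalLM class, and the projector key names are placeholders, not the exact VideoLLaMA2 API):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

base_path = "path/to/base_model"              # placeholder: unquantized base checkpoint
projector_path = "path/to/mm_projector.bin"   # placeholder: 16-bit projector weights
merged_path = "path/to/merged_model"          # placeholder: output directory

# 1) Initialize the model unquantized (fp16) so dtypes match the projector weights.
model = AutoModelForCausalLM.from_pretrained(base_path, torch_dtype=torch.float16)

# 2) Load the projector weights into the unquantized model.
mm_projector_weights = torch.load(projector_path, map_location="cpu")
model.load_state_dict(mm_projector_weights, strict=False)

# 3) Save the whole model so the projector is part of a single checkpoint.
model.save_pretrained(merged_path)
AutoTokenizer.from_pretrained(base_path).save_pretrained(merged_path)

# 4) Reload the merged checkpoint with 4-bit quantization.
quant_config = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_compute_dtype=torch.float16)
model_4bit = AutoModelForCausalLM.from_pretrained(
    merged_path, quantization_config=quant_config, device_map="auto"
)
```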
Hello, I'm a PhD student from ZJU. I also use VideoLLaMA2 in my own research. We have created a WeChat group to discuss VideoLLaMA2 issues and help each other; would you like to join us? Please contact me: WeChat number: LiangMeng19357260600, phone number: +86 19357260600, e-mail: [email protected].
Hi VideoLLaMA Team,
I am facing issues while loading all of the base models in 4-bit precision. The following lines try to load the `mm_projector_weights`, which are stored in 16-bit precision, into a model that requires the weights in 4-bit, leading to errors.

Code used for loading the models for inference:
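For context, a minimal 4-bit loading setup with transformers' BitsAndBytesConfig looks roughly like this (the checkpoint path and model class are placeholders, not the exact call made by VideoLLaMA2's loader):

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

quant_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,
    bnb_4bit_quant_type="nf4",
)
model = AutoModelForCausalLM.from_pretrained(
    "path/to/base_model",              # placeholder checkpoint path
    quantization_config=quant_config,
    device_map="auto",
)
model.eval()
```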
Problematic part of the Code:
Lines: https://github.com/DAMO-NLP-SG/VideoLLaMA2/blob/main/videollama2/model/__init__.py#L171-L172
Error:
How can we use the 16-bit stored weights of `mm_projector_weights` in 4-bit models?
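One possible direction, sketched under assumptions: exclude the projector from quantization so its 16-bit weights can be loaded directly. The module/key name `mm_projector`, the checkpoint paths, and the effect of `llm_int8_skip_modules` on 4-bit loading are assumptions that depend on the installed transformers/bitsandbytes versions:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

quant_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,
    llm_int8_skip_modules=["mm_projector"],   # keep the projector in higher precision (assumed module name)
)
model = AutoModelForCausalLM.from_pretrained(
    "path/to/base_model",                     # placeholder checkpoint path
    quantization_config=quant_config,
    device_map="auto",
)

# With the projector left unquantized, the stored 16-bit weights can be
# loaded as usual (key names must match the model's state dict).
mm_projector_weights = torch.load("path/to/mm_projector.bin", map_location="cpu")
mm_projector_weights = {k: v.to(torch.float16) for k, v in mm_projector_weights.items()}
model.load_state_dict(mm_projector_weights, strict=False)
```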