FR: Phi-3-vision-128k-instruct implementation #7444
Is it natively supported once someone converts it to GGUF?
Someone has to write the code to run such a model in llama.cpp. Then it would be a model you could convert to GGUF. Until then, no.
I'm patiently waiting for someone to do that... 😭
I've tried to convert the Phi-3-vision-128k-instruct HF model to GGUF. But it looks like the current version of llama.cpp does not support the vision components (model.vision_embed_tokens, etc.) in Phi-3v. After I added "Phi3VForCausalLM" to convert-hf-to-gguf.py by copying the "Phi3ForCausalLM" entry, the result looks like this: ... Tensor names like 'model.vision_embed_tokens.glb_GN' are not listed in the "TensorNameMap" of the tensor_mapping.py file. The additional modules in Phi-3v can be found here: Would it be possible to make llama.cpp support multimodal models like LLaVA and Phi-3v?
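The failure mode described above can be sketched as a small filter that separates the language-model weights (which the converter already handles) from the vision weights it does not know about. This is an illustrative helper, not code from llama.cpp; the tensor names are taken from the comment and the prefix list is an assumption:

```python
# Hypothetical helper: partition Phi-3-vision checkpoint tensor names into
# the Phi-3 language-model part (convertible today) and the CLIP/vision
# part that convert-hf-to-gguf.py's TensorNameMap does not map.
VISION_PREFIXES = ("model.vision_embed_tokens",)  # assumed prefix, per the error above

def split_tensor_names(names):
    """Return (language_model_tensors, vision_tensors)."""
    text, vision = [], []
    for name in names:
        (vision if name.startswith(VISION_PREFIXES) else text).append(name)
    return text, vision

names = [
    "model.embed_tokens.weight",
    "model.layers.0.self_attn.qkv_proj.weight",
    "model.vision_embed_tokens.glb_GN",  # the tensor from the error message
]
text, vision = split_tensor_names(names)
print(text)    # tensors the existing Phi3 mapping covers
print(vision)  # tensors that currently fail the tensor mapping
```

A real fix would instead teach TensorNameMap about these names and add a CLIP-style converter, as was done for LLaVA.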
The model is very good for its size at the OCR task; looking forward to using it in GGUF format.
Hi @ggerganov, Phi-3 vision is similar to LLaVA: it combines the Phi-3 and CLIP-ViT-Large-patch14-336 models. Would it be possible to support converting it from HF to GGUF?
Any update on the convert-hf-to-gguf issue for the Phi3-vision-small-128k model? It seems to give the same error as above:
You can copy the "Phi3ForCausalLM" section and add it as "Phi3VForCausalLM" in that Python file. But Phi-3-vision-128k-instruct includes both a Phi3 and a CLIP model. The Phi3 part can be detected and converted, but the CLIP model can't be converted via the convert-hf-to-gguf.py code; it fails with a tensor-mapping error.
I did exactly that, as mentioned in the messages above in this issue, and got the exact same problem. Are there any workarounds for this, e.g. somehow decoupling the two models?
You can use
#7705 👁️
Would it be possible to use a parameter in the GGUF header to indicate that the file contains two sets of tensor data? I feel that typical users will expect a single GGUF file.
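One way to read this suggestion: GGUF is a key-value metadata header followed by tensor data, so a metadata key could in principle record how many tensors each sub-model contributes to a single file. The toy sketch below only illustrates the idea; the key names are invented, and real GGUF files are written with gguf-py's GGUFWriter per the GGUF spec:

```python
# Toy illustration: a single-file container whose header records two
# tensor groups. Key names like "vision.tensor_count" are invented here,
# not part of the GGUF specification.
def pack(header_kv, groups):
    """Pack named tensor groups into one blob; the header records how
    many tensors each group contributes."""
    blob = {"kv": dict(header_kv), "tensors": []}
    for group_name, tensors in groups.items():
        blob["kv"][f"{group_name}.tensor_count"] = len(tensors)
        blob["tensors"].extend((group_name, n, t) for n, t in tensors.items())
    return blob

blob = pack(
    {"general.architecture": "phi3"},
    {
        "language": {"model.embed_tokens.weight": b"..."},
        "vision": {"model.vision_embed_tokens.glb_GN": b"..."},
    },
)
print(blob["kv"]["vision.tensor_count"])  # 1
```

In practice, multimodal llama.cpp models (like LLaVA) have instead shipped the vision encoder as a separate mmproj GGUF file.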
bad bot
sad but true
New release of Phi-3.5-vision-instruct today: https://huggingface.co/microsoft/Phi-3.5-vision-instruct (as well as a 16x3.8B MoE and an updated version of the basic Phi-3.5-mini)
+1 for support
@coder543 And can it be converted to GGUF and used as a vision model?
@Milor123 Nope… that's why this issue exists.
Abetlen has already converted it and is working on an experimental branch: https://huggingface.co/abetlen/Phi-3.5-vision-instruct-gguf
Is there code to use Phi-3.5-vision-instruct-gguf with an image locally in llama-cpp-python?
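llama-cpp-python has no official Phi-3-vision chat handler at this point, so the sketch below is an assumption-laden adaptation of its LLaVA workflow: it reuses the llava-style handler and OpenAI-style multimodal messages, and the model filenames are placeholders. The message-building helper is plain Python; the model-loading part is shown but not guaranteed to work with Phi-3.5-vision until support lands:

```python
def build_vision_messages(image_url: str, prompt: str) -> list:
    """OpenAI-style multimodal message list, the format llama-cpp-python's
    create_chat_completion expects for llava-style models."""
    return [
        {
            "role": "user",
            "content": [
                {"type": "image_url", "image_url": {"url": image_url}},
                {"type": "text", "text": prompt},
            ],
        }
    ]

def describe_image(image_url: str, prompt: str) -> str:
    """Sketch only: assumes a llava-style handler works for Phi-3.5-vision
    and that separate language/projector GGUF files exist (names invented)."""
    from llama_cpp import Llama
    from llama_cpp.llama_chat_format import Llava15ChatHandler

    handler = Llava15ChatHandler(clip_model_path="phi-3.5-vision-mmproj.gguf")
    llm = Llama(
        model_path="phi-3.5-vision-instruct.gguf",  # placeholder filename
        chat_handler=handler,
        n_ctx=4096,
    )
    out = llm.create_chat_completion(
        messages=build_vision_messages(image_url, prompt),
        max_tokens=256,
    )
    return out["choices"][0]["message"]["content"]
```

Usage would be e.g. `describe_image("file:///tmp/receipt.png", "Transcribe all text in this image.")`, once a working handler and converted files are available.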
This issue was closed because it has been inactive for 14 days since being marked as stale.
Was the issue actually resolved, or did the bot just close it anyway? Regards,
That model is insane for its size ....
https://huggingface.co/microsoft/Phi-3-vision-128k-instruct