Eval bug: Qwen2-VL Hallucinates image content on Vulkan backend #10843
Could you do a quick test and see if it works with an F16 vision projector:
.\build\bin\Release\llama-quantize.exe .\models\mmproj-Qwen2-VL-7B-Instruct-f32.gguf .\models\mmproj-Qwen2-VL-7B-Instruct-f16.gguf f16
.\build\bin\Release\llama-qwen2vl-cli.exe -m .\models\Qwen2-VL-7B-Instruct-IQ4_NL.gguf --mmproj .\models\mmproj-Qwen2-VL-7B-Instruct-f16.gguf -p 'What could be the context of this image.' --image '.\Pictures\Untitled.png' --seed 0 --temp 0 -ngl 99
It's not working :(
stable-diffusion.cpp's CLI does allow me to convert it to f16, but I think it strips off important metadata:
Ah, I think you have to use the surgery script:
python ./examples/llava/qwen2_vl_surgery.py Qwen/Qwen2-VL-2B-Instruct --data_type fp16
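(For reference, a sketch of the regenerate-and-retest sequence this implies, for the 7B model used in this issue. The 7B repo id is assumed by analogy with the 2B command above, and the output filename of the surgery script is an assumption, not verified:)

```
# Regenerate the vision projector directly at fp16 (7B variant; repo id assumed by analogy)
python ./examples/llava/qwen2_vl_surgery.py Qwen/Qwen2-VL-7B-Instruct --data_type fp16

# Retest with the regenerated projector (output filename assumed; adjust to what the script actually writes)
.\build\bin\Release\llama-qwen2vl-cli.exe -m .\models\Qwen2-VL-7B-Instruct-IQ4_NL.gguf --mmproj .\qwen2vl-vision-f16.gguf -p 'What could be the context of this image.' --image '.\Pictures\Untitled.png' --seed 0 --temp 0 -ngl 99
```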
Is it the same mmproj for the 2B and the 7B model?
It seems not.
CPU:
Vulkan (ngl 99):
Still not working.
Can you try enabling GGML_VULKAN_CHECK_RESULTS and see if it identifies the broken op? You might need to manually add the CPU backend source files to ggml-vulkan (I think this broke when the backends were refactored).
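(For anyone following along, a sketch of how that option would be enabled at configure time; GGML_VULKAN_CHECK_RESULTS is the switch named above, the other flags are the usual Vulkan build settings:)

```
# Reconfigure with Vulkan op result-checking enabled, then rebuild (sketch)
cmake -B build -DGGML_VULKAN=ON -DGGML_VULKAN_CHECK_RESULTS=ON
cmake --build build --config Release
```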
To fix those linker issues you need to add the ggml-cpu sources to ggml-vulkan. |
Building with GGML_VULKAN_CHECK_RESULTS:
I can confirm this issue happens even with no layers offloaded. On the CPU backend it works fine. The model is BF16, the projector F16. Same assert as above.
Name and Version
.\build\bin\Release\llama-cli.exe --version
ggml_vulkan: Found 1 Vulkan devices:
ggml_vulkan: 0 = AMD Radeon RX 5700 XT (AMD proprietary driver) | uma: 0 | fp16: 1 | warp size: 64 | matrix cores: none
version: 4329 (89d604f)
built with MSVC 19.41.34120.0 for x64
Operating systems
Windows
GGML backends
Vulkan
Hardware
Ryzen 5900X + RX 5700 XT
Models
Qwen2-VL-7B-Instruct-IQ4_NL + mmproj-Qwen2-VL-7B-Instruct-f32
Problem description & steps to reproduce
When I run it on the Vulkan build, the description given by the model has nothing to do with the image passed as an argument, no matter the -ngl value (even -ngl 0 is broken). The exact same setup works perfectly fine on the CPU backend. I know the Vulkan backend doesn't support Qwen2-VL yet, but according to #10361 (comment), this should only cause slowdowns, not invalid outputs.
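(A condensed repro, assembled from the commands already in this thread; paths and model files are the ones from this setup:)

```
# Vulkan build: the generated description is unrelated to the image,
# regardless of the -ngl value (0 and 99 both reproduce the issue)
.\build\bin\Release\llama-qwen2vl-cli.exe -m .\models\Qwen2-VL-7B-Instruct-IQ4_NL.gguf --mmproj .\models\mmproj-Qwen2-VL-7B-Instruct-f32.gguf -p 'What could be the context of this image.' --image '.\Pictures\Untitled.png' --seed 0 --temp 0 -ngl 0

# A CPU-only build with identical arguments describes the image correctly
```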
Relevant log output
Image input:
-ngl 0
-ngl 99
CPU backend for comparison