Fix Vulkan Quantized Matrix Vector Multiplication on AMD GPUs when ncols < 64 #8855

Merged
2 commits merged into master from 0cc4m/vulkan-fix-mmv-tests
Aug 5, 2024

Conversation

0cc4m
Collaborator

@0cc4m 0cc4m commented Aug 4, 2024

I fixed a Vulkan quantized matrix-vector multiplication test failure on AMD GPUs (warp size 64) that occurs when there are not enough blocks to fill the warp. The failure was caught by the tests added in #8800, but I noticed that for k-quants those tests run the same case twice, so I added a check for whether the new test is actually required. Let me know if that's okay.
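To illustrate the test-side check described above, here is a minimal sketch, not the code from this PR. The helper name `mmv_test_k_values` and the base column count of 256 are assumptions made for illustration. The idea is that a dedicated "k = block size" case is only added when that size is not already in the list of tested column counts, which is the situation for k-quants if their block size of 256 matches the base case.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper (not part of llama.cpp): returns the column counts (k)
// to test for one quantization type. The block-size case is appended only if
// it is not already covered by the base cases.
std::vector<int64_t> mmv_test_k_values(int64_t block_size,
                                       const std::vector<int64_t> & base_k) {
    assert(block_size > 0);
    std::vector<int64_t> ks = base_k;
    // Skip the extra case when an identical k is already scheduled.
    if (std::find(ks.begin(), ks.end(), block_size) == ks.end()) {
        ks.push_back(block_size);
    }
    return ks;
}
```

With an assumed base case of k = 256, a type with block size 32 gets the extra k = 32 case (fewer columns than a 64-wide warp), while a type with block size 256 keeps only the single base case.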

@github-actions github-actions bot added the testing Everything test related label Aug 4, 2024
@0cc4m 0cc4m changed the title from "0cc4m/vulkan fix mmv tests" to "Fix Quantized Matrix Vector Multiplication on AMD GPUs when ncols < 64" Aug 4, 2024
@JohannesGaessler JohannesGaessler added the Vulkan Issues specific to the Vulkan backend label Aug 4, 2024
@0cc4m 0cc4m changed the title from "Fix Quantized Matrix Vector Multiplication on AMD GPUs when ncols < 64" to "Fix Vulkan Quantized Matrix Vector Multiplication on AMD GPUs when ncols < 64" Aug 5, 2024
@ggerganov ggerganov merged commit 064cdc2 into master Aug 5, 2024
54 checks passed
@0cc4m 0cc4m deleted the 0cc4m/vulkan-fix-mmv-tests branch August 5, 2024 06:03
arthw pushed a commit to arthw/llama.cpp that referenced this pull request Aug 7, 2024
…ov#8855)

* Fix Vulkan mul mat vec invalid results when ncols < warp size

* Only run backend ops mul mat vec block size test if block size not already covered
Labels
testing Everything test related Vulkan Issues specific to the Vulkan backend
3 participants