
vulkan: bugfixes for small subgroup size systems + llvmpipe test #10809

Merged: 5 commits merged into ggerganov:master on Dec 17, 2024

Conversation

netrunnereve (Collaborator)

So I tried running our Vulkan implementation with llvmpipe for fun and discovered that some shaders don't work properly with a smaller-than-usual subgroup size. llvmpipe has a subgroup size of 8 on AVX systems, as it treats each 256-bit AVX unit (computing eight 32-bit floats at once) as a GPU core.
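If you want to check what subgroup size your own driver reports, here's a minimal sketch (not part of this PR, just the standard Vulkan 1.1 properties query) that reads subgroupSize from VkPhysicalDeviceSubgroupProperties; this appears to be the value the backend prints as "warp size":

```cpp
// Minimal standalone query of the reported subgroup size (Vulkan 1.1).
// Build (assuming the Vulkan SDK/loader is installed): g++ query_subgroup.cpp -lvulkan
#include <vulkan/vulkan.h>
#include <cstdio>

int main() {
    VkApplicationInfo app = {};
    app.sType = VK_STRUCTURE_TYPE_APPLICATION_INFO;
    app.apiVersion = VK_API_VERSION_1_1; // subgroup properties require Vulkan 1.1

    VkInstanceCreateInfo ici = {};
    ici.sType = VK_STRUCTURE_TYPE_INSTANCE_CREATE_INFO;
    ici.pApplicationInfo = &app;

    VkInstance instance;
    if (vkCreateInstance(&ici, nullptr, &instance) != VK_SUCCESS) {
        fprintf(stderr, "failed to create Vulkan instance\n");
        return 1;
    }

    // Grab the first physical device (llvmpipe, if it's the only ICD present).
    uint32_t count = 1;
    VkPhysicalDevice phys = VK_NULL_HANDLE;
    vkEnumeratePhysicalDevices(instance, &count, &phys);
    if (count == 0) {
        fprintf(stderr, "no Vulkan devices found\n");
        return 1;
    }

    // Chain VkPhysicalDeviceSubgroupProperties into the properties2 query.
    VkPhysicalDeviceSubgroupProperties subgroup = {};
    subgroup.sType = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_SUBGROUP_PROPERTIES;

    VkPhysicalDeviceProperties2 props = {};
    props.sType = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_PROPERTIES_2;
    props.pNext = &subgroup;
    vkGetPhysicalDeviceProperties2(phys, &props);

    // llvmpipe reports 8 on a 256-bit AVX host, 16 with AVX-512.
    printf("%s: subgroupSize = %u\n",
           props.properties.deviceName, subgroup.subgroupSize);

    vkDestroyInstance(instance, nullptr);
    return 0;
}
```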

Having llvmpipe support also means that we can run the Vulkan tests on the regular GitHub CI machines. It's super slow, considering it's simulating a GPU on the CPU, but at least it's faster than the CUDA and HIP builds!

netrunnereve requested a review from 0cc4m on December 12, 2024 at 21:29
github-actions bot added the Vulkan (Issues specific to the Vulkan backend), devops (improvements to build systems and github actions), and ggml (changes relating to the ggml tensor library for machine learning) labels on Dec 12, 2024
jeffbolznv (Collaborator) left a comment

I can't review the workflow change, but the Vulkan change LGTM.

slaren (Collaborator) left a comment

The workflow change looks good to me; it doesn't add an unreasonable amount of time to the CI.

0cc4m (Collaborator) commented on Dec 16, 2024

I tried running it on llvmpipe, but it got stuck on MUL_MAT(type_a=q4_1,type_b=f32,m=16,n=2,k=256,bs=[1,1],nr=[1,1],per=[0,1,2,3]) and didn't continue. Not sure what's going on. I'll try again tomorrow.
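(In case the test name is cryptic: in ggml's test-backend-ops naming, as far as I understand it, type_a/type_b are the operand types, m/n/k the matrix dimensions, bs the batch sizes, nr the repeat counts along the batch dimensions, and per the axis permutation.)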

netrunnereve (Collaborator, Author)

Not sure if this helps, but here's the Vulkan printout from my computer, which passes all tests. The flags are the same on the GitHub machines, but they use LLVM 15.0.7.

ggml_vulkan: 0 = llvmpipe (LLVM 17.0.6, 256 bits) (llvmpipe) | uma: 0 | fp16: 1 | warp size: 8 | matrix cores: none

I think if you have a CPU with AVX-512 you'll get a 512-bit llvmpipe with a warp size of 16.
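For reference, that warp size seems to fall straight out of the SIMD width: a 256-bit AVX register holds 256 / 32 = 8 single-precision floats, hence a subgroup size of 8, and a 512-bit AVX-512 register would hold 512 / 32 = 16.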

0cc4m (Collaborator) commented on Dec 17, 2024

I don't have an AVX-512 CPU. I get exactly the same printout as you do. With Mesa 24.3.1 from the kisak-mesa PPA (LLVM 17.0.6) it gets stuck on q4_1; with Mesa 24.0.9-0ubuntu0.3 (LLVM 17.0.6) it doesn't. Not sure what's going on, but it shouldn't affect us.

0cc4m merged commit 7b1ec53 into ggerganov:master on Dec 17, 2024
48 checks passed