ggml : remove K_QUANTS_PER_ITERATION macro #9034

Closed
wants to merge 2 commits into from

ggerganov
Owner

Always use a value of 2

@github-actions github-actions bot added Nvidia GPU Issues specific to Nvidia GPUs Vulkan Issues specific to the Vulkan backend ggml changes relating to the ggml tensor library for machine learning SYCL https://en.wikipedia.org/wiki/SYCL - GPU programming language labels Aug 15, 2024
@jeroen-mostert
Contributor

The docs (which, by the way, are not updated in this PR) claim that "setting this value to 1 can improve performance for slow GPUs". Is this no longer true? (It doesn't help that the docs never say which GPUs these are supposed to be, as in class or generation.)

@ggerganov ggerganov force-pushed the gg/remove-k-quants-per-iter branch from 943f851 to ccb4518 Compare August 26, 2024 06:52
@github-actions github-actions bot added the documentation Improvements or additions to documentation label Aug 26, 2024
@ggerganov
Owner Author

Not sure. I just haven't seen this compile option being used, so from a maintenance point of view I don't think the extra code paths are worth keeping.
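For readers unfamiliar with the pattern, here is a minimal sketch of what "extra code paths" behind a compile-time option can look like. It is not the actual ggml kernel code: `dot_configurable` and `dot_fixed` are hypothetical functions, and only the macro name and its allowed values (1 or 2) come from this discussion.

```cpp
#include <cstddef>
#include <vector>

// Illustrative stand-in for the build-time option discussed in this PR.
// Defaulting to 2 mirrors the "always use a value of 2" proposal.
#ifndef K_QUANTS_PER_ITERATION
#define K_QUANTS_PER_ITERATION 2
#endif
static_assert(K_QUANTS_PER_ITERATION == 1 || K_QUANTS_PER_ITERATION == 2,
              "K_QUANTS_PER_ITERATION must be 1 or 2");

// Hypothetical dot product with two preprocessor-selected code paths.
// Both branches have to be kept correct and tested for as long as the macro exists.
static float dot_configurable(const std::vector<float>& a, const std::vector<float>& b) {
    float sum = 0.0f;
    std::size_t i = 0;
#if K_QUANTS_PER_ITERATION == 1
    for (; i < a.size(); ++i) {
        sum += a[i] * b[i];                        // one element per iteration
    }
#else
    for (; i + 1 < a.size(); i += 2) {
        sum += a[i] * b[i] + a[i + 1] * b[i + 1];  // two elements per iteration
    }
    if (i < a.size()) {
        sum += a[i] * b[i];                        // odd-length tail
    }
#endif
    return sum;
}

// The same hypothetical routine with the macro removed: a single path, fixed at 2.
static float dot_fixed(const std::vector<float>& a, const std::vector<float>& b) {
    float sum = 0.0f;
    std::size_t i = 0;
    for (; i + 1 < a.size(); i += 2) {
        sum += a[i] * b[i] + a[i + 1] * b[i + 1];
    }
    if (i < a.size()) {
        sum += a[i] * b[i];
    }
    return sum;
}
```

With the macro gone, only the second form has to be maintained. Restoring the option later would mean reintroducing the `#if` branch.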

@jeroen-mostert
Contributor

Eh, I suppose it's easy enough to restore if someone who does see performance gains from it complains. It would also be interesting to know who (or rather what) benefits from tweaking GGML_CUDA_DMMV_X and GGML_CUDA_MMV_Y and how much. On my RX 6800 XT varying these has almost no effect at all, but of course that's only one RDNA2 device.

@ggerganov ggerganov closed this Nov 17, 2024
Labels
documentation Improvements or additions to documentation ggml changes relating to the ggml tensor library for machine learning Nvidia GPU Issues specific to Nvidia GPUs SYCL https://en.wikipedia.org/wiki/SYCL - GPU programming language Vulkan Issues specific to the Vulkan backend
2 participants