sync : llama.cpp #1047

ggerganov · 2024-12-12T16:18:30Z

No description provided.

There are some bugs in the 1.3.296 SDK, so disable this. It isn't strictly necessary anyway. Add missing dependency on vulkan-shaders-gen, so shaders get recompiled when it changes. Fix coopmat support reporting when glslc doesn't support NV_coopmat2.

* Renames NVIDIA GPU-architecture flags to avoid name clashes with WinAPI (e.g. `CC_PASCAL`: GPU architecture or WinAPI pascal compiler flag?).
* Reverts erroneous rename in SYCL code.
* Renames `GGML_CUDA_MIN_CC_DP4A` to `GGML_CUDA_CC_DP4A`.
* Renames the rest of the compute capability macros for consistency.
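The clash that motivates the rename can be sketched in a few lines. This is a minimal illustration, not the actual ggml headers: the `calling_convention` enum is a hypothetical stand-in for a system header, `GGML_CUDA_CC_PASCAL` is an assumed name following the prefix pattern of `GGML_CUDA_CC_DP4A`, and the numeric values are assumptions.

```cpp
// Stand-in for a system header that already owns the short name. This enum
// is hypothetical and only mimics the kind of clash described above.
enum calling_convention { CC_CDECL = 1, CC_PASCAL = 2 };

// Prefixed compute-capability macros. GGML_CUDA_CC_DP4A is named in the
// commit message; GGML_CUDA_CC_PASCAL and both numeric values are
// assumptions (encoded here as 100 * major + 10 * minor).
#define GGML_CUDA_CC_PASCAL 600
#define GGML_CUDA_CC_DP4A   610

// An unprefixed `#define CC_PASCAL 600` placed before the enum would turn
// its enumerator into `600 = 2` and break the build. With the prefix, the
// system identifier is left untouched.
static_assert(CC_PASCAL == 2, "system identifier keeps its own value");

// True when compute capability `cc` is at least the dp4a threshold.
inline bool supports_dp4a(int cc) {
    return cc >= GGML_CUDA_CC_DP4A;
}
```

Dropping `MIN` from the name costs nothing in readability here, since the macro is only ever used as a lower bound in comparisons like the one in `supports_dp4a`.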

* q5_k q4_k q3_k q2_k q6_k multi row example
* revert as multi row isn't faster for k quants

Vulkan doesn't mandate a specific rounding mode, but the shader_float_controls feature allows rounding mode to be requested if the implementation supports it.
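To see why the requested rounding mode matters, the following CPU-side sketch converts fp32 to fp16 bit patterns under two rounding modes. It is an illustration only, not the shader change itself: the function name and the `rounding` enum are hypothetical, and the conversion assumes finite inputs whose magnitude lies in the fp16 normal range (2^-14 to 65504).

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

enum class rounding { to_nearest_even, toward_zero };

// Convert a float to IEEE 754 binary16 bits (hypothetical helper).
// Assumes a finite input in the fp16 normal range; subnormals, infinities
// and NaNs are deliberately out of scope for this sketch.
inline uint16_t fp32_to_fp16_bits(float f, rounding mode) {
    uint32_t x;
    std::memcpy(&x, &f, sizeof x);

    const uint32_t sign = (x >> 16) & 0x8000u;
    const int32_t  exp  = int32_t((x >> 23) & 0xFFu) - 127 + 15; // rebias
    const uint32_t man  = x & 0x7FFFFFu;
    assert(exp >= 1 && exp <= 30 && "input outside the fp16 normal range");

    // Dropping the low 13 mantissa bits is round-toward-zero.
    uint32_t h = (uint32_t(exp) << 10) | (man >> 13);

    if (mode == rounding::to_nearest_even) {
        const uint32_t rem  = man & 0x1FFFu; // the 13 discarded bits
        const uint32_t half = 0x1000u;       // exactly half an fp16 ULP
        // Round up above the halfway point; on a tie, round to the even
        // neighbour. A mantissa carry correctly bumps the exponent.
        if (rem > half || (rem == half && (h & 1u))) {
            h += 1;
        }
    }
    return uint16_t(sign | h);
}
```

For example, `1.0f + 3.0f / 2048.0f` sits exactly halfway between two fp16 values: truncation yields `0x3C01`, while round-to-even yields `0x3C02`. If different implementations pick different defaults, results can differ in the last bit, which is why a shader may want to request a mode explicitly where the implementation supports it.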

* feat: load all backends from a user-provided search path
* fix: Windows search path
* refactor: rename `ggml_backend_load_all_in_search_path` to `ggml_backend_load_all_from_path`
* refactor: rename `search_path` to `dir_path`
* fix: change `NULL` to `nullptr`

Co-authored-by: Diego Devesa

ggml-ci

jeffbolznv and others added 6 commits December 12, 2024 18:16

vulkan: dynamic subgroup size for the remaining k quants (llama/10745)

2653d31

* q5_k q4_k q3_k q2_k q6_k multi row example
* revert as multi row isn't faster for k quants

vulkan: request round-to-even for fp16 in im2col/rope_head (llama/10767)

e0d06b7

Vulkan doesn't mandate a specific rounding mode, but the shader_float_controls feature allows rounding mode to be requested if the implementation supports it.

sync : llama.cpp

b374c32

ggml-ci
