Releases · ggerganov/llama.cpp
b4351
b4350
Use model->gguf_kv for loading the template instead of using the C AP…
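This change reads the chat template from the model's in-memory GGUF metadata (model->gguf_kv) instead of re-querying it through the C API. The same metadata key can be inspected from Python with the gguf package that ships in this repo; a minimal sketch, assuming a local file at model.gguf and that the model embeds a tokenizer.chat_template key:

```python
from gguf import GGUFReader  # gguf-py package from this repo

# Open the model file and look up the embedded chat template key.
reader = GGUFReader("model.gguf")  # path is an assumption
field = reader.fields.get("tokenizer.chat_template")
if field is not None:
    # For string fields, field.data indexes the UTF-8 bytes in field.parts.
    template = bytes(field.parts[field.data[0]]).decode("utf-8")
    print(template)
else:
    print("model has no embedded chat template")
```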
b4349
tests: add tests for GGUF (#10830)
b4348
sync : ggml
b4343
ggml : update ggml_backend_cpu_device_supports_op (#10867)
* ggml : fix cpy op for IQ-quants to use reference impl
* ggml : disable tests involving i-matrix quantization
* ggml : update ggml_backend_cpu_device_supports_op
b4342
server : fill usage info in embeddings and rerank responses (#10852)
* server : fill usage info in embeddings response
* server : fill usage info in reranking response
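With this release, embeddings responses from llama-server report token usage. A minimal client sketch, assuming a locally running llama-server on the default port 8080 (the endpoint shape follows the server's OpenAI-compatible API):

```python
import requests

# Request an embedding from a locally running llama-server (port assumed).
resp = requests.post(
    "http://localhost:8080/v1/embeddings",
    json={"input": "hello world"},
)
resp.raise_for_status()
body = resp.json()

# Usage info is filled in since #10852.
print(body["usage"])  # e.g. {"prompt_tokens": ..., "total_tokens": ...}
```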
b4341
llama : add Falcon3 support (#10864)
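With Falcon3 architecture support in place, converted Falcon3 GGUFs load like any other model. A quick smoke test via the third-party llama-cpp-python bindings; the file name below is a placeholder for whatever Falcon3 conversion you have:

```python
from llama_cpp import Llama  # third-party llama-cpp-python bindings

# Placeholder file name: use any Falcon3 model converted to GGUF.
llm = Llama(model_path="Falcon3-7B-Instruct-Q4_K_M.gguf", n_ctx=2048)

out = llm("Briefly, why is the sky blue?", max_tokens=64)
print(out["choices"][0]["text"])
```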
b4338
vulkan: bugfixes for small subgroup size systems + llvmpipe test (#10…
b4337
rwkv6: add wkv6 support for Vulkan backend (#10829)
* rwkv_wkv6 vulkan shader
* RWKV_WKV6 Vulkan op tests passed
* Apply code format changes
* add [[unroll]] and remove unnecessary conditions
* add uma support
* fix errors in EditorConfig Checker

Signed-off-by: Molly Sophia
Co-authored-by: Molly Sophia
b4333
llama : add Deepseek MoE v1 & GigaChat models (#10827)
* Add deepseek v1 arch & gigachat template
* improve template code
* add readme
* delete comments
* remove comment
* fix format
* lint llama.cpp
* fix order of deepseek and deepseek2, move gigachat template to the end of func
* fix order of deepseek and deepseek2 in constants; mark shared experts as needed by the deepseek arch
* remove comments
* move deepseek above deepseek2
* change placement of gigachat chat template
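The new GigaChat template is stored in the model's GGUF metadata and applied server-side, so a standard chat-completions request picks it up automatically. A minimal sketch, again assuming a local llama-server on port 8080:

```python
import requests

# The server renders the messages with the model's built-in chat template
# (e.g. the new GigaChat template) before inference; the port is assumed.
resp = requests.post(
    "http://localhost:8080/v1/chat/completions",
    json={
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": "Hi!"},
        ],
    },
)
print(resp.json()["choices"][0]["message"]["content"])
```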