Releases · ggerganov/llama.cpp
b4351
b4350
Use model->gguf_kv for loading the template instead of using the C AP…
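This change reads the chat template from the model's in-memory GGUF metadata (model->gguf_kv) instead of re-querying it through the C API. The same metadata key can be inspected from Python with the gguf package that ships in this repo; a minimal sketch, assuming a local file at model.gguf and that the model embeds a tokenizer.chat_template key:

```python
from gguf import GGUFReader  # gguf-py package from this repo

# Open the model file and look up the embedded chat template key.
reader = GGUFReader("model.gguf")  # path is an assumption
field = reader.fields.get("tokenizer.chat_template")
if field is not None:
    # For string fields, field.data indexes the UTF-8 bytes in field.parts.
    template = bytes(field.parts[field.data[0]]).decode("utf-8")
    print(template)
else:
    print("model has no embedded chat template")
```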
b4349
tests: add tests for GGUF (#10830)
b4348
sync : ggml
b4343
ggml : update ggml_backend_cpu_device_supports_op (#10867)
* ggml : fix cpy op for IQ-quants to use reference impl
* ggml : disable tests involving i-matrix quantization
* ggml : update ggml_backend_cpu_device_supports_op
b4342
server : fill usage info in embeddings and rerank responses (#10852)
* server : fill usage info in embeddings response
* server : fill usage info in reranking response
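With this release, embeddings responses from llama-server report token usage. A minimal client sketch, assuming a locally running llama-server on the default port 8080 (the endpoint shape follows the server's OpenAI-compatible API):

```python
import requests

# Request an embedding from a locally running llama-server (port assumed).
resp = requests.post(
    "http://localhost:8080/v1/embeddings",
    json={"input": "hello world"},
)
resp.raise_for_status()
body = resp.json()

# Usage info is filled in since #10852.
print(body["usage"])  # e.g. {"prompt_tokens": ..., "total_tokens": ...}
```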
b4341
llama : add Falcon3 support (#10864)
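With Falcon3 architecture support in place, converted Falcon3 GGUFs load like any other model. A quick smoke test via the third-party llama-cpp-python bindings; the file name below is a placeholder for whatever Falcon3 conversion you have:

```python
from llama_cpp import Llama  # third-party llama-cpp-python bindings

# Placeholder file name: use any Falcon3 model converted to GGUF.
llm = Llama(model_path="Falcon3-7B-Instruct-Q4_K_M.gguf", n_ctx=2048)

out = llm("Briefly, why is the sky blue?", max_tokens=64)
print(out["choices"][0]["text"])
```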
b4338
vulkan: bugfixes for small subgroup size systems + llvmpipe test (#10…
b4337
rwkv6: add wkv6 support for Vulkan backend (#10829)
* rwkv_wkv6 vulkan shader
* RWKV_WKV6 Vulkan op tests passed
* Apply code format changes
* add [[unroll]] and remove unnecessary conditions
* add uma support
* fix errors in EditorConfig Checker

Signed-off-by: Molly Sophia
Co-authored-by: Molly Sophia
b4333
llama : add Deepseek MoE v1 & GigaChat models (#10827)
* Add deepseek v1 arch & gigachat template
* improve template code
* add readme
* delete comments
* remove comment
* fix format
* lint llama.cpp
* fix order of deepseek and deepseek2, move gigachat template to the end of func
* fix order of deepseek and deepseek2 in constants; mark shared experts as needed by the deepseek arch
* remove comments
* move deepseek above deepseek2
* change placement of gigachat chat template
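The new GigaChat template is stored in the model's GGUF metadata and applied server-side, so a standard chat-completions request picks it up automatically. A minimal sketch, again assuming a local llama-server on port 8080:

```python
import requests

# The server renders the messages with the model's built-in chat template
# (e.g. the new GigaChat template) before inference; the port is assumed.
resp = requests.post(
    "http://localhost:8080/v1/chat/completions",
    json={
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": "Hi!"},
        ],
    },
)
print(resp.json()["choices"][0]["message"]["content"])
```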