
Releases: cpumaxx/llama.cpp

b2794

06 May 01:08
628b299
Adding support for the --numa argument for llama-bench. (#7080)
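A minimal usage sketch, assuming a local GGUF model at a hypothetical path; the flag presumably accepts the same NUMA policies as the other llama.cpp tools (distribute, isolate, numactl):

```
./llama-bench -m ./models/model.gguf --numa distribute
```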

b2792

05 May 07:17
ca36326
readme : add note that LLaMA 3 is not supported with convert.py (#7065)
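As a sketch of the workaround the note points to (the checkpoint directory and output filename here are hypothetical), a LLaMA 3 checkpoint would instead go through convert-hf-to-gguf.py, the repository's converter for Hugging Face checkpoints:

```
python3 convert-hf-to-gguf.py ./Meta-Llama-3-8B --outfile llama-3-8b-f16.gguf
```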

b2755

29 Apr 00:50
e00b4a8
Fix more int overflow during quant (PPL/CUDA). (#6563)

* Fix more int overflow during quant.
* Fix some more int overflow in softmax.
* Revert back to int64_t.
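
These fixes target a common pattern: element counts computed in 32-bit int overflow once tensor sizes grow large, so the arithmetic has to be widened to int64_t before the multiplication. A minimal C++ sketch of the pattern (the dimension values and variable names are hypothetical, not taken from the actual kernels):

```cpp
#include <cstdint>
#include <cstdio>

int main() {
    // Hypothetical tensor dimensions whose product exceeds INT_MAX
    // (2^31 - 1), as can happen when indexing a flattened buffer in a
    // quantization or softmax kernel.
    int ne0 = 65536; // row length (assumed value)
    int ne1 = 65536; // row count  (assumed value)

    // Buggy form: the multiplication is performed in 32-bit int and
    // overflows before the result is stored, even into a wider variable:
    //   int64_t n = ne0 * ne1; // undefined behavior on overflow

    // Fixed form: widen one operand first so the product is computed
    // in 64-bit arithmetic.
    int64_t n = (int64_t) ne0 * ne1;

    printf("%lld elements\n", (long long) n);
    return 0;
}
```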