
Releases: cpumaxx/llama.cpp

b2794

06 May 01:08
628b299
Adding support for the --numa argument for llama-bench. (#7080)
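A minimal usage sketch, assuming a local GGUF model at a hypothetical path; the flag presumably accepts the same NUMA policies as the other llama.cpp tools (distribute, isolate, numactl):

```
./llama-bench -m ./models/model.gguf --numa distribute
```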

b2792

05 May 07:17
ca36326
readme : add note that LLaMA 3 is not supported with convert.py (#7065)
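As a sketch of the workaround the note points to (the checkpoint directory and output filename here are hypothetical), a LLaMA 3 checkpoint would instead go through convert-hf-to-gguf.py, the repository's converter for Hugging Face checkpoints:

```
python3 convert-hf-to-gguf.py ./Meta-Llama-3-8B --outfile llama-3-8b-f16.gguf
```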

b2755

29 Apr 00:50
e00b4a8
Fix more int overflow during quant (PPL/CUDA). (#6563)

* Fix more int overflow during quant.
* Fix some more int overflow in softmax.
* Revert back to int64_t.
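
These fixes target a common pattern: element counts computed in 32-bit int overflow once tensor sizes grow large, so the arithmetic has to be widened to int64_t before the multiplication. A minimal C++ sketch of the pattern (the dimension values and variable names are hypothetical, not taken from the actual kernels):

```cpp
#include <cstdint>
#include <cstdio>

int main() {
    // Hypothetical tensor dimensions whose product exceeds INT_MAX
    // (2^31 - 1), as can happen when indexing a flattened buffer in a
    // quantization or softmax kernel.
    int ne0 = 65536; // row length (assumed value)
    int ne1 = 65536; // row count  (assumed value)

    // Buggy form: the multiplication is performed in 32-bit int and
    // overflows before the result is stored, even into a wider variable:
    //   int64_t n = ne0 * ne1; // undefined behavior on overflow

    // Fixed form: widen one operand first so the product is computed
    // in 64-bit arithmetic.
    int64_t n = (int64_t) ne0 * ne1;

    printf("%lld elements\n", (long long) n);
    return 0;
}
```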