Releases · cpumaxx/llama.cpp
b2794
b2792
readme : add note that LLaMA 3 is not supported with convert.py (#7065)
b2755
Fix more int overflow during quant (PPL/CUDA). (#6563)
* Fix more int overflow during quant.
* Fix some more int overflow in softmax.
* Revert back to int64_t.
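As an illustration of the class of bug addressed in b2755 (not the actual llama.cpp code), index arithmetic done in 32-bit `int` can overflow once a tensor exceeds roughly 2^31 elements; widening the computation to `int64_t` before the multiply avoids it. The function names and sizes below are hypothetical:

```cpp
#include <cstdint>
#include <cstdio>

// Hypothetical flattened offset into a large tensor.
// int * int is evaluated in 32-bit int; past INT_MAX the result is
// undefined behavior and in practice wraps to a negative value.
int64_t flat_index_bad(int row, int n_cols) {
    return row * n_cols;
}

// Promote one operand to int64_t first so the multiply happens in 64-bit.
int64_t flat_index_good(int row, int n_cols) {
    return (int64_t) row * n_cols;
}

int main() {
    int row = 70000, n_cols = 70000;  // ~4.9e9 elements, beyond INT_MAX
    printf("bad:  %lld\n", (long long) flat_index_bad(row, n_cols));
    printf("good: %lld\n", (long long) flat_index_good(row, n_cols));
    return 0;
}
```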