Releases: ggerganov/llama.cpp

b2749

26 Apr 19:39
928e0b7
Reset schedule earlier to allow overlap with ggml graph computation o…

b2748

26 Apr 19:35
0c4d489
quantize: add imatrix and dataset metadata in GGUF (#6658)

* imatrix: save the dataset file used in the output file

* llama: support kv overrides type string string

* common: factorize KV Overrides parsing between common and server

* quantize: add imatrix n entries and dataset KV metadata
quantize: factorize KV Overrides parsing between common
#6656

* llama: remove kv override str_value initialization as it does not compile on some toolchains

* quantize: add imatrix m_last_call as `quantize.imatrix.chunks_count`

* quantize: add imatrix filename in KV

* llama: add llama_model_kv_override_free

* common: add llama_model_kv_override_free
common: free kv override if used after model loading

* llama: finally move the string KV override value to the stack

* llama : minor

* no need to add a NUL to the std::vector, std::string can be initialized from a pair of iterators.

Co-authored-by: slaren <[email protected]>

* kv override: ensure string termination

---------

Co-authored-by: Georgi Gerganov <[email protected]>
Co-authored-by: slaren <[email protected]>
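
One of the notes above points out that `std::string` can be initialized from a pair of iterators, so no NUL has to be appended to the `std::vector` first. A minimal sketch of that idiom follows; the helper name `buffer_to_string` is hypothetical and is not taken from the llama.cpp sources.

```cpp
#include <string>
#include <vector>

// Hypothetical helper (not part of llama.cpp): turn a raw character buffer
// into a std::string without pushing a trailing NUL onto the vector.
std::string buffer_to_string(const std::vector<char> & buf) {
    // The iterator-range constructor copies exactly [begin, end),
    // so the resulting length equals buf.size().
    return std::string(buf.begin(), buf.end());
}
```

Because the length comes from the iterator range, the string does not depend on the buffer being NUL-terminated.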

b2747

26 Apr 18:08
017e699
add basic tensor data validation function (#6884)

* add basic tensor data validation function

* add --check-tensors command line argument

tensor validation is disabled by default and can be enabled by adding
`--check-tensors` to the command line arguments.

quantize always validates tensors.

b2746

26 Apr 16:32
e2764cd
gguf : fix mismatch between alloc and free functions (#6929)

b2740

26 Apr 09:54
d4a9afc
ci: server: fix python installation (#6918)

b2737

25 Apr 23:12
46e12c4
llava : add support for moondream vision language model (#6899)

* add support for moondream vision language model

This required making the following changes to the CLIP model:

1. Support for patch embedding bias.
2. Make class embedding and pre-layernorm optional.
3. Add support for post-layernorm.

* Update examples/llava/clip.cpp

---------

Co-authored-by: Georgi Gerganov <[email protected]>

b2736

25 Apr 22:21
dba497e
cmake : restore LLAMA_LLAMAFILE_DEFAULT

b2735

25 Apr 22:18
fa0b4ad
cmake : remove obsolete ANDROID check

b2734

25 Apr 21:03
d6e1d44
llama : synchronize before get/set session data (#6911)

b2731

25 Apr 17:48
0ead1f1
llama : check that all the tensor data is in the model file (#6885)

* llama : check that all the tensor data is in the model file

* also check for unsigned overflow
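
The two notes above describe a bounds check that must also survive unsigned wraparound. A minimal sketch of such a check, assuming each tensor is described by a byte offset and a byte size, could look like the following; the name `tensor_data_in_file` and its parameters are hypothetical and not taken from llama.cpp.

```cpp
#include <cstdint>

// Hypothetical sketch (not the llama.cpp code): true when the byte range
// [offset, offset + size) lies entirely inside a file of file_size bytes.
bool tensor_data_in_file(std::uint64_t offset, std::uint64_t size, std::uint64_t file_size) {
    // The sum offset + size is never computed, so it cannot wrap around.
    if (offset > file_size) {
        return false;
    }
    return size <= file_size - offset;
}
```

A naive `offset + size <= file_size` test would accept an offset of 8 with a size of `UINT64_MAX`, because the sum wraps to 7. The subtraction form rejects it.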