Releases · ggerganov/llama.cpp

26 Apr 19:39

928e0b7

b2749

Reset schedule earlier to allow overlap with ggml graph computation o…

26 Apr 19:35

github-actions

b2748

0c4d489

quantize: add imatrix and dataset metadata in GGUF (#6658)

* imatrix: save the dataset file used in the output file

* llama: support kv overrides type string string

* common: factorize KV Overrides parsing between common and server

* quantize: add imatrix n entries and dataset KV metadata
quantize: factorize KV Overrides parsing between common
#6656

* llama: remove kv override str_value initialization as it does not compile on some toolchains

* quantize: add imatrix m_last_call as `quantize.imatrix.chunks_count`

* quantize: add imatrix filename in KV

* llama: add llama_model_kv_override_free

* common: add llama_model_kv_override_free
common: free kv override if used after model loading

* llama: finally move the string KV override value to the stack

* llama : minor

* no need to add a NUL to the std::vector, std::string can be initialized from a pair of iterators.

Co-authored-by: slaren <[email protected]>

* kv override: ensure string termination

---------

Co-authored-by: Georgi Gerganov <[email protected]>
Co-authored-by: slaren <[email protected]>
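Two of the notes above (building a `std::string` from a pair of iterators, and ensuring string termination for a KV override value) can be illustrated with a minimal sketch. The struct `kv_override_sketch`, its 16-byte buffer, and both helper names are hypothetical and chosen for illustration only; they are not the actual llama.cpp definitions.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical fixed-size holder for a string override value
// (an assumption for this sketch, not the llama.cpp struct).
struct kv_override_sketch {
    char val_str[16];
};

// Build a std::string from raw bytes that carry no trailing NUL.
// The iterator-pair constructor copies exactly [begin, end),
// so there is no need to append a NUL to the vector first.
inline std::string bytes_to_string(const std::vector<char> & bytes) {
    return std::string(bytes.begin(), bytes.end());
}

// Copy a value into the fixed-size buffer, truncating if needed
// and always writing a terminating NUL.
inline void set_str_value(kv_override_sketch & ov, const std::string & value) {
    const std::size_t n = std::min(value.size(), sizeof(ov.val_str) - 1);
    std::memcpy(ov.val_str, value.data(), n);
    ov.val_str[n] = '\0';
}
```

With this sketch, a value longer than the buffer is silently truncated to 15 characters plus the terminator; a real implementation might prefer to reject oversized values instead.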

26 Apr 18:08

github-actions

b2747

017e699

add basic tensor data validation function (#6884)

* add basic tensor data validation function

* add --check-tensors command line argument

Tensor validation is disabled by default; enable it by adding
`--check-tensors` to the command-line arguments.

quantize always validates tensors.
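
The release note does not say what the validation checks. One plausible basic check for 32-bit float data is rejecting NaN and infinite values, sketched below. The function name `validate_f32_data` is hypothetical and is not the function added by this commit.

```cpp
#include <cmath>
#include <cstddef>

// Hypothetical helper (an assumption for this sketch): returns true
// when every value in the buffer is finite, false if any value is
// NaN or +/-infinity. An empty buffer is considered valid.
inline bool validate_f32_data(const float * data, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) {
        if (std::isnan(data[i]) || std::isinf(data[i])) {
            return false;
        }
    }
    return true;
}
```

A scan like this touches every element of every tensor, which is one possible reason to leave such a check off by default at load time.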

26 Apr 16:32

github-actions

b2746

e2764cd

gguf : fix mismatch between alloc and free functions (#6929)

26 Apr 09:54

github-actions

b2740

d4a9afc

ci: server: fix python installation (#6918)

25 Apr 23:12

github-actions

b2737

46e12c4

llava : add support for moondream vision language model (#6899)

* add support for moondream vision language model

This required making the following changes to the CLIP model:

1. Support for patch embedding bias.
2. Make class embedding and pre-layernorm optional.
3. Add support for post-layernorm.

* Update examples/llava/clip.cpp

---------

Co-authored-by: Georgi Gerganov <[email protected]>

25 Apr 22:21

github-actions

b2736

dba497e

cmake : restore LLAMA_LLAMAFILE_DEFAULT

25 Apr 22:18

github-actions

b2735

fa0b4ad

cmake : remove obsolete ANDROID check

25 Apr 21:03

github-actions

b2734

d6e1d44

llama : synchronize before get/set session data (#6911)

25 Apr 17:48

github-actions

b2731

0ead1f1

llama : check that all the tensor data is in the model file (#6885)

* llama : check that all the tensor data is in the model file

* also check for unsigned overflow
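
The two checks above (tensor data lies within the file, and the range computation does not overflow) can be sketched as a single bounds test. The function name and parameters are hypothetical; this shows the general technique, not the code from the commit.

```cpp
#include <cstdint>

// Hypothetical helper (an assumption for this sketch): returns true
// when the byte range [offset, offset + size) lies inside a file of
// file_size bytes.
inline bool tensor_data_in_file(std::uint64_t offset,
                                std::uint64_t size,
                                std::uint64_t file_size) {
    // Avoid computing offset + size directly: with unsigned
    // arithmetic the sum can wrap around and appear to be in range.
    if (offset > file_size) {
        return false;
    }
    // file_size - offset cannot underflow after the check above.
    return size <= file_size - offset;
}
```

Comparing `size` against the remaining bytes, rather than comparing `offset + size` against `file_size`, keeps the test correct even for a corrupted header that declares a huge tensor size.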
