Releases: ggerganov/llama.cpp
b2749
Reset schedule earlier to allow overlap with ggml graph computation o…
b2748
quantize: add imatrix and dataset metadata in GGUF (#6658)
* imatrix: save the dataset file used in the output file
* llama: support kv overrides type string string
* common: factorize KV Overrides parsing between common and server
* quantize: add imatrix n entries and dataset KV metadata
  quantize: factorize KV Overrides parsing between common #6656
* llama: remove kv override str_value initialization as it does not compile on some toolchain
* quantize: add imatrix m_last_call as `quantize.imatrix.chunks_count`
* quantize: add imatrix filename in KV
* llama: add llama_model_kv_override_free
* common: add llama_model_kv_override_free
  common: free kv override if used after model loading
* llama: finally move the string KV override value to the stack
* llama : minor
* no need to add a NUL to the std::vector, std::string can be initialized from a pair of iterators.
  Co-authored-by: slaren
* kv override: ensure string termination
Co-authored-by: Georgi Gerganov
Co-authored-by: slaren
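The string override and the "ensure string termination" item can be illustrated with a small sketch. Everything here is an assumption made for illustration: the struct `kv_override_str`, the function `parse_str_override`, the 128-byte buffers, and the `KEY=str:VALUE` argument shape are hypothetical and are not taken from the llama.cpp sources. The only point is that a string copied into a fixed-size buffer should always end in a NUL byte, and that an over-long value is better rejected than silently truncated.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical stand-in for a string KV override stored in fixed-size buffers.
struct kv_override_str {
    char key[128];
    char value[128];
};

// Parses an argument of the assumed form "KEY=str:VALUE".
// Returns false if the marker is missing, the key is empty,
// or either part would not fit together with its terminating NUL.
bool parse_str_override(const std::string & arg, kv_override_str & out) {
    const std::string marker = "=str:";
    const std::size_t pos = arg.find(marker);
    if (pos == std::string::npos || pos == 0 || pos >= sizeof(out.key)) {
        return false;
    }
    const std::string val = arg.substr(pos + marker.size());
    if (val.size() >= sizeof(out.value)) {
        return false; // reject rather than truncate
    }
    std::memcpy(out.key, arg.data(), pos);
    out.key[pos] = '\0';              // explicit termination of the key
    std::memcpy(out.value, val.data(), val.size());
    out.value[val.size()] = '\0';     // explicit termination of the value
    return true;
}
```

With these assumptions, `parse_str_override("general.name=str:demo", o)` fills `o.key` with `general.name` and `o.value` with `demo`, while an argument without the `=str:` marker is rejected.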
b2747
add basic tensor data validation function (#6884)
* add basic tensor data validation function
* add --check-tensors command line argument
  Tensor validation is disabled by default and can be enabled by adding `--check-tensors` to the command line arguments. quantize always validates tensors.
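The release note does not say what the validation checks. A minimal sketch, assuming that "basic validation" means rejecting non-finite values (NaN or infinity) in 32-bit float data, could look like the following. The name `validate_f32_data` is hypothetical and is not the function added in the PR.

```cpp
#include <cmath>
#include <cstddef>

// Hypothetical helper: returns true when every element of the buffer is finite.
// An empty buffer is considered valid.
bool validate_f32_data(const float * data, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) {
        if (!std::isfinite(data[i])) {
            return false; // NaN or +/-infinity found
        }
    }
    return true;
}
```

A scan like this touches every element once, which is one plausible reason to keep it behind an opt-in flag such as `--check-tensors` when loading large models.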
b2746
gguf : fix mismatch between alloc and free functions (#6929)
b2740
ci: server: fix python installation (#6918)
b2737
llava : add support for moondream vision language model (#6899)
* add support for moondream vision language model
  This required making the following changes to the CLIP model:
  1. Support for patch embedding bias.
  2. Make class embedding and pre-layernorm optional.
  3. Add support for post-layernorm.
* Update examples/llava/clip.cpp
Co-authored-by: Georgi Gerganov
b2736
cmake : restore LLAMA_LLAMAFILE_DEFAULT
b2735
cmake : remove obsolete ANDROID check
b2734
llama : synchronize before get/set session data (#6911)
b2731
llama : check that all the tensor data is in the model file (#6885)
* llama : check that all the tensor data is in the model file
* also check for unsigned overflow
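The two items above describe a bounds check that must itself be safe against wraparound. As a rough sketch under assumed names (`tensor_in_file`, `offset`, `size`, and `file_size` are hypothetical and not taken from the llama.cpp sources), the check can be written without ever computing `offset + size`, which could overflow for unsigned 64-bit values:

```cpp
#include <cstdint>

// Hypothetical check: does the byte range [offset, offset + size)
// lie entirely inside a file of file_size bytes?
bool tensor_in_file(std::uint64_t offset, std::uint64_t size, std::uint64_t file_size) {
    if (offset > file_size) {
        return false; // start is already past the end of the file
    }
    // file_size - offset cannot underflow here, so the comparison is overflow-free.
    return size <= file_size - offset;
}
```

The naive form `offset + size <= file_size` would wrongly accept a corrupted header whose offset and size wrap around to a small sum, which is the case the overflow check is meant to catch.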