Releases: kherud/java-llama.cpp
Version 3.4.1
Version 3.4.0
Credit goes to @shuttie for adding CUDA support on Linux x86_64 in this version.
Version 3.3.0
Upgrade to latest llama.cpp version b3534
Version 3.2.1
- Include GGML backend in text log
- Update to llama.cpp b3008
Version 3.2.0
Logging Re-Implementation (see #66)
- Re-adds logging callbacks via LlamaModel#setLogger(LogFormat, BiConsumer<LogLevel, String>)
- Removes the non-functional ModelParameters#setLogDirectory(String), ModelParameters#setDisableLog(boolean), and ModelParameters#setLogFormat(LogFormat)
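For illustration, here is a minimal sketch of the restored logging callback. It assumes setLogger is a static method, that LogFormat and LogLevel live in the de.kherud.llama package, and that LogFormat has a TEXT constant; check the actual API for your version.

```java
import java.util.function.BiConsumer;

import de.kherud.llama.LlamaModel;
import de.kherud.llama.LogFormat;
import de.kherud.llama.LogLevel;

public class LoggingExample {
    public static void main(String[] args) {
        // Forward native llama.cpp/GGML log messages into Java instead of stdout.
        BiConsumer<LogLevel, String> callback =
                (level, message) -> System.err.printf("[%s] %s%n", level, message);
        // Assumed static method and TEXT constant, per the release note above.
        LlamaModel.setLogger(LogFormat.TEXT, callback);
    }
}
```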
Version 3.1.1
Version 3.1.0
Changes:
- Updates to llama.cpp b2885
- Fixes #62 (generation can now be canceled; see the sketch after the API changes below)
- Fixes macos x64 shared libraries
API changes:
- LlamaModel.Output is now LlamaOutput
- LlamaIterator is now public (previously the private LlamaModel.Iterator)
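A hedged sketch of canceling generation via the now-public LlamaIterator. It assumes generate(...) returns an iterable of LlamaOutput whose iterator is a LlamaIterator, that LlamaIterator gained a cancel() method with the #62 fix, and that the parameter setters shown here exist; names not in the notes above are assumptions.

```java
import de.kherud.llama.InferenceParameters;
import de.kherud.llama.LlamaIterator;
import de.kherud.llama.LlamaModel;
import de.kherud.llama.LlamaOutput;
import de.kherud.llama.ModelParameters;

public class CancelExample {
    public static void main(String[] args) {
        ModelParameters modelParams = new ModelParameters()
                .setModelFilePath("models/model.gguf"); // assumed setter name

        // LlamaModel is AutoCloseable, so try-with-resources frees native memory.
        try (LlamaModel model = new LlamaModel(modelParams)) {
            InferenceParameters inferParams = new InferenceParameters("Write a long story.");
            // Cast may be unnecessary if generate(...) already yields a LlamaIterator.
            LlamaIterator it = (LlamaIterator) model.generate(inferParams).iterator();
            int tokens = 0;
            while (it.hasNext()) {
                LlamaOutput output = it.next();
                System.out.print(output);
                // Stop generation early, e.g. after a token budget (see #62).
                if (++tokens >= 16) {
                    it.cancel(); // assumed cancellation method added with this fix
                    break;
                }
            }
        }
    }
}
```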
Version 3.0.2
Upgrade to llama.cpp b2797
- Adds explicit support for Phi-3
- Adds flash attention
- Fixes #54
Version 3.0.1
- Updated the binding to llama.cpp b2702 to add Llama 3 support
- Fixed #54 by using codellama for testing
Version 3.0.0
Version 3.0 updates to the newest available version of llama.cpp and exposes all of its available features. The Java binding reworks almost all of the C++ code and now relies heavily on the llama.cpp server code, which should lead to much better performance, concurrency, and long-term maintainability.
The biggest change is how model and inference parameters are handled (see the examples for details). Previous versions relied on strongly typed Java classes, whereas the C++ server code mostly uses JSON; the JNI code that transferred parameters from Java to C++ was therefore complex and error-prone. The new version involves almost no API changes to how parameters are handled (apart from the set of available parameters itself), but should be much easier to maintain in the long term.
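As a rough illustration of the parameter API, here is a minimal sketch based on the project's 3.x examples. The builder-style setter names (setModelFilePath, setNGpuLayers, setTemperature), the InferenceParameters prompt constructor, and generate(...) are assumed from those examples and may differ in your version.

```java
import de.kherud.llama.InferenceParameters;
import de.kherud.llama.LlamaModel;
import de.kherud.llama.LlamaOutput;
import de.kherud.llama.ModelParameters;

public class Example {
    public static void main(String[] args) {
        // Model parameters are configured via chained setters (assumed names).
        ModelParameters modelParams = new ModelParameters()
                .setModelFilePath("models/llama-3-8b-instruct.Q4_K_M.gguf")
                .setNGpuLayers(43);

        // Inference parameters wrap the prompt plus sampling settings.
        InferenceParameters inferParams = new InferenceParameters("Tell me a joke.")
                .setTemperature(0.7f);

        // LlamaModel is AutoCloseable, so try-with-resources frees native memory.
        try (LlamaModel model = new LlamaModel(modelParams)) {
            for (LlamaOutput output : model.generate(inferParams)) {
                System.out.print(output);
            }
        }
    }
}
```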