Version 3.0.0
Version 3.0 updates the binding to the newest available version of llama.cpp and supports all of its available features. Almost all of the binding's C++ code has been reworked. It now relies heavily on the llama.cpp server code, which in theory should lead to much better performance, concurrency, and long-term maintainability.
The biggest change is how model and inference parameters are handled internally (see the examples for details). Previous versions relied on properly typed Java classes, whereas the C++ server code mostly uses JSON, so the JNI code that transferred the parameters from Java to C++ was complex and error-prone. The new version brings almost no API changes in how parameters are set (apart from which parameters are available), but should be much easier to maintain in the long term.
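To illustrate the idea, here is a minimal sketch of a parameter class that keeps typed, chainable setters on the Java side but stores everything as JSON, so that only a single string would need to cross the JNI boundary. The class name `SketchParameters`, its setters, and the JSON key names are assumptions made up for this illustration and are not the binding's actual classes; see the examples for the real API.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only; not the binding's actual parameter class.
class SketchParameters {
    // Insertion-ordered, so the serialized output is deterministic.
    // Values are stored already encoded as JSON fragments.
    private final Map<String, String> values = new LinkedHashMap<>();

    // Typed setters keep the Java API unchanged for callers.
    SketchParameters setTemperature(float temperature) {
        values.put("temperature", String.valueOf(temperature));
        return this;
    }

    SketchParameters setNPredict(int nPredict) {
        values.put("n_predict", String.valueOf(nPredict));
        return this;
    }

    SketchParameters setPrompt(String prompt) {
        values.put("prompt", quote(prompt));
        return this;
    }

    // Wraps a string in double quotes and escapes quotes, backslashes,
    // and control characters.
    private static String quote(String s) {
        StringBuilder sb = new StringBuilder("\"");
        for (char c : s.toCharArray()) {
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n"); break;
                case '\r': sb.append("\\r"); break;
                case '\t': sb.append("\\t"); break;
                default:
                    if (c < 0x20) {
                        sb.append(String.format("\\u%04x", (int) c));
                    } else {
                        sb.append(c);
                    }
            }
        }
        return sb.append('"').toString();
    }

    // Serializes all parameters set so far into one JSON object. In a JNI
    // binding, this string is what would be handed to the native side.
    @Override
    public String toString() {
        StringBuilder sb = new StringBuilder("{");
        String separator = "";
        for (Map.Entry<String, String> e : values.entrySet()) {
            sb.append(separator)
              .append('"').append(e.getKey()).append("\":")
              .append(e.getValue());
            separator = ",";
        }
        return sb.append('}').toString();
    }
}
```

With this shape, `new SketchParameters().setTemperature(0.7f).setNPredict(30).toString()` yields `{"temperature":0.7,"n_predict":30}`. Adding a parameter means adding one setter in Java, with no per-field JNI code to keep in sync.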