Skip to content

Navigation Menu

Explore
By company size
By use case
By industry
View all solutions
Topics
- AI
- DevOps
- Security
- Software Development
- View all
Explore
- GitHub Sponsors
  Fund open source developers
- The ReadME Project
  GitHub community articles
Repositories
- Enterprise platform
  AI-powered developer platform
Available add-ons
Pricing

Search code, repositories, users, issues, pull requests...

Search

Clear

Search syntax tips

Provide feedback

We read every piece of feedback, and take your input very seriously.

Include my email address so I can be contacted

Saved searches

Use saved searches to filter your results more quickly

Name

Query

To see all available qualifiers, see our documentation.

You signed in with another tab or window. Reload to refresh your session. You signed out in another tab or window. Reload to refresh your session. You switched accounts on another tab or window. Reload to refresh your session.

Dismiss alert

tangledgroup / llama-cpp-cffi Public

Notifications You must be signed in to change notification settings
Fork 0
Star 2

Code
Issues
Pull requests
Actions
Projects
Security
Insights

Additional navigation options

Code
Issues
Pull requests
Actions
Projects
Security
Insights

Releases: tangledgroup/llama-cpp-cffi

Releases · tangledgroup/llama-cpp-cffi

v0.4.18

30 Jan 18:57

mtasic85

Compare

Choose a tag to compare

Loading

v0.4.18 Latest

Latest

Changed:

llama.cpp revision eb7cf15a808d4d7a71eef89cc6a9b96fe82989dc

Assets 11

Loading

All reactions

v0.4.17

23 Jan 13:55

mtasic85

Compare

Choose a tag to compare

Loading

v0.4.17

Changed:

llama.cpp revision 6152129d05870cb38162c422c6ba80434e021e9f

Fixed:

Fixed build process, json patches.
Reverted server code to previous version due to bug.

Assets 11

Loading

All reactions

v0.3.1

02 Jan 11:41

mtasic85

Compare

Choose a tag to compare

Loading

v0.3.1

Added:

llama-cpp-cffi server - support for dynamic load/unload of model - hot-swap of models on demand
llama-cpp-cffi server - compatible with llama.cpp cli options
llama-cpp-cffi server - limited compatibility for OpenAI API /v1/chat/completions for text and vision models
Support for CompletionsOptions.messages for VLM prompts with a single message containing just a pair of text and image_url in content.

Changed:

llama.cpp revision 0827b2c1da299805288abbd556d869318f2b121e

Assets 20

Loading

All reactions

v0.3.0

01 Jan 10:49

mtasic85

Compare

Choose a tag to compare

Loading

v0.3.0

Added:

Qwen 2 VL 2B / 7B vision models support
WIP llama-cpp-cffi server - compatible with llama.cpp cli options instead of OpenAI

Changed:

llama.cpp revision 5896c65232c7dc87d78426956b16f63fbf58dcf6
Refactored Options class into two separate classes: ModelOptions, CompletionsOptions

Fixed:

Llava (moondream2, nanoLLaVA-1.5, llava-v1.6-mistral-7b) vision models support
MiniCPM-V 2.5 / 2.6 vision models support

Removed:

Removed ambiguous Options class

Assets 20

Loading

All reactions

v0.2.0

11 Dec 10:58

mtasic85

Compare

Choose a tag to compare

Loading

v0.2.0

Added:

New high-level Python API
Low-level C API calls from llama.h, llava.h, clip.h, ggml.h
completions for high-level function for LLMs / VLMs
text_completions for low-level function for LLMs
clip_completions for low-level function for CLIP-based VLMs
WIP: mllama_completions for low-level function for Mllama-based VLMs

Changed:

All examples

Removed:

llama_generate function
llama_cpp_cli
llava_cpp_cli
minicpmv_cpp_cli

Assets 4

Loading

All reactions

v0.1.22

27 Nov 08:27

mtasic85

Compare

Choose a tag to compare

Loading

v0.1.22

Added:

llava high-level API calls
minicpmv high-level API support

Assets 20

Loading

All reactions

v0.1.16

02 Sep 06:38

mtasic85

Compare

Choose a tag to compare

Loading

v0.1.16

Added:
- Updated llama.cpp.

Assets 20

Loading

All reactions

v0.1.15

20 Aug 06:56

mtasic85

Compare

Choose a tag to compare

Loading

v0.1.15

Added:
- SmolLM-1.7B-Instruct-v0.2 examples.
- Updated llama.cpp.

Assets 20

Loading

All reactions

v0.1.14

17 Aug 06:50

mtasic85

Compare

Choose a tag to compare

Loading

v0.1.14

Fixed:
- Vulkan detection.

Assets 20

Loading

All reactions

0.1.13

16 Aug 20:05

mtasic85

Compare

Choose a tag to compare

Loading

0.1.13

Fixed:
- CUDA and Vulkan detection.

Assets 20

Loading

All reactions

Previous 1 2 Next

Previous Next

Footer

© 2025 GitHub, Inc.

Footer navigation

Terms
Privacy
Security
Status
Docs
Contact

You can’t perform that action at this time.