Add qwen2-0.5B-instruct support. #35
Comments
I've explored this model, and I hope to add it at some point. It isn't currently supported in the ctranslate2 backend that we use for inference. If/when it is supported there, it shouldn't be difficult to add here.
Umm, small question. Now that llama-cpp supports flan-t5, would you consider changing from ctranslate2 to it? It would allow broader model and quantization support (making it easier to maintain, as you don't have to convert your own models).
I'm open to that. At the moment I'm not aware of well-maintained Python bindings that support batched inference for llama-cpp. I would prefer not to lose that performance benefit. There is work being done on this in llama-cpp-python.
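For context, batched inference means the backend processes several prompts in one forward pass instead of one call per prompt, which is the performance benefit mentioned above. A minimal sketch of the batching logic, with hypothetical names (`chunked`, `fake_generate` are illustrative stand-ins, not the actual ctranslate2 or llama-cpp-python API):

```python
# Illustrative sketch only: shows how prompts are grouped into batches
# so a backend can run one forward pass per batch. The backend call is
# faked here; real code would use e.g. a generator's batch API.

def chunked(items, batch_size):
    """Split a list of prompts into batches of at most batch_size."""
    for i in range(0, len(items), batch_size):
        yield items[i:i + batch_size]

def fake_generate(batch):
    """Stand-in for a backend call that handles a whole batch at once."""
    return [f"completion for: {p}" for p in batch]

prompts = ["hello", "how are you", "tell me a joke", "bye", "thanks"]
outputs = []
for batch in chunked(prompts, batch_size=2):
    outputs.extend(fake_generate(batch))
# outputs now holds one completion per prompt, produced in 3 backend
# calls instead of 5.
```

With per-prompt bindings the loop body would have to call the backend once per prompt, which is the overhead a batched API avoids.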
@lavilao There's still no Qwen2 (or 2.5) support, but I did recently update the package to support the following instruct models:
Awesome, I wonder if Llama 3.2 1B will run on my potato.
It's a really good model for its size, and it aligns with the goal of this project.