
Ramalama does not work with granite models #338

Closed
vpavlin opened this issue Oct 21, 2024 · 4 comments

Comments

@vpavlin

vpavlin commented Oct 21, 2024

Ollama announced support for IBM Granite https://x.com/ollama/status/1848223852465213703

I tried to run granite3-moe with ramalama

[vpavlin@vpavlin-tuxedo ~/devel/github.com/vpavlin/ramalama(main) ]
$ ramalama run granite3-moe
Pulling dfc8e4074962e215: 100% ▕####################▏ 1.92G/1.92G 3.23MB/s 00:00
[vpavlin@vpavlin-tuxedo ~/devel/github.com/vpavlin/ramalama(main) ]
$ ramalama run granite3-moe

But it exits after the download without printing any error log. The latest Ollama works fine with this model.

OS: Ubuntu 23.10
Python: 3.11.6
Ramalama:


$ ramalama info 
{
    "Engine": "podman",
    "Image": "quay.io/ramalama/ramalama:latest",
    "Runtime": "llama.cpp",
    "Store": "/home/vpavlin/.local/share/ramalama",
    "Version": 0
}
@ericcurtin
Collaborator

Fix:

#340

This is specifically a problem with granitemoe models.

@vpavlin
Author

vpavlin commented Oct 21, 2024

FYI, it's not just the moe models:

$ ramalama run granite3-dense
Pulling 629c1de9fdd794ce: 100% ▕####################▏ 1.49G/1.49G 5.51MB/s 00:00
[vpavlin@vpavlin-tuxedo ~/devel/github.com/vpavlin/ramalama(main) ]
$ ramalama run granite3-dense

@ericcurtin
Collaborator

It's likely the same fix: update llama.cpp. It's amazing how quickly the upstream llama.cpp project moves. If you apply this patch, at least for the granite-moe case you get this:

llama_model_load: error loading model: error loading model architecture: unknown model architecture: 'granitemoe'

because this model architecture was only added to llama.cpp recently:

diff --git a/ramalama/model.py b/ramalama/model.py
index a12cc1e..f9b3fd3 100644
--- a/ramalama/model.py
+++ b/ramalama/model.py
@@ -120,7 +120,7 @@ class Model:
             exec_args.append("-cnv")

         try:
-            exec_cmd(exec_args, False)
+            exec_cmd(exec_args, True)
         except FileNotFoundError as e:
             if in_container():
                 raise NotImplementedError(file_not_found_in_container % (exec_args[0], str(e).strip("'")))

@rhatdan
Copy link
Member

rhatdan commented Oct 21, 2024

This should be fixed with release v0.0.20

@rhatdan rhatdan closed this as completed Oct 21, 2024