
Gemma models quantized using llamacpp not working in lm studio #5706

Closed
rombodawg opened this issue Feb 25, 2024 · 8 comments

Comments

@rombodawg

rombodawg commented Feb 25, 2024

Gemma models that have been quantized using llama.cpp are not working. Please look into this issue.

Error:

"llama.cpp error: 'create_tensor: tensor 'output.weight' not found'"

I will open an issue on the LM Studio GitHub as well addressing this:

lmstudio-ai/configs#21

System:
Ryzen 5600X
RTX 3080 GPU
B550 motherboard
64 GB DDR4 RAM
Windows 10

@Yefori-Go

Perhaps you should try using the latest llama.cpp to convert the Gemma model:

python .\convert-hf-to-gguf.py models\gemma-2b-it\ --outfile gemma-2b-it-f16.gguf

@JohannesGaessler
Collaborator

I can only speak for myself, but I 100% refuse to debug a problem unless it can be reproduced entirely with open-source code.

@rombodawg
Author

rombodawg commented Feb 26, 2024

No, it literally doesn't work.

I just built this version of llama.cpp, and that .py script doesn't work for Gemma.

Plus, that's not even the script you are supposed to use according to the documentation. You are supposed to use convert.py.

E:\Open_source_ai_chatbot\Llamacpp-3\llama.cpp>python E:\Open_source_ai_chatbot\Llamacpp-mixtral\llamacpp-clone-mixtral\convert-hf-to-gguf.py E:\Open_source_ai_chatbot\OOBA_10\text-generation-webui-main\models\Gemma-EveryoneLLM-7b-test --outfile Gemma-EveryoneLLM-7b-test.gguf --outtype f16
Loading model: Gemma-EveryoneLLM-7b-test
Traceback (most recent call last):
  File "E:\Open_source_ai_chatbot\Llamacpp-mixtral\llamacpp-clone-mixtral\convert-hf-to-gguf.py", line 1033, in <module>
    model_instance = model_class(dir_model, ftype_map[args.outtype], fname_out, args.bigendian)
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "E:\Open_source_ai_chatbot\Llamacpp-mixtral\llamacpp-clone-mixtral\convert-hf-to-gguf.py", line 48, in __init__
    self.model_arch = self._get_model_architecture()
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "E:\Open_source_ai_chatbot\Llamacpp-mixtral\llamacpp-clone-mixtral\convert-hf-to-gguf.py", line 225, in _get_model_architecture
    raise NotImplementedError(f'Architecture "{arch}" not supported!')
NotImplementedError: Architecture "GemmaForCausalLM" not supported!
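Worth noting (an observation, not from the thread): the path in the traceback points at an older clone (`Llamacpp-mixtral\llamacpp-clone-mixtral`), and the `NotImplementedError` is exactly what a converter from before Gemma support would raise. A minimal sketch, assuming a dispatch table like the one the script uses (names and mappings here are illustrative, not llama.cpp's actual code):

```python
# Illustrative sketch of how a HF-to-GGUF converter maps the
# architecture string from config.json to a handler, raising for
# architectures it does not know about (as in the traceback above).
SUPPORTED_ARCHITECTURES = {
    "LlamaForCausalLM": "llama",
    "MixtralForCausalLM": "llama",
    # An up-to-date checkout also maps "GemmaForCausalLM" here.
}

def get_model_architecture(arch: str) -> str:
    if arch not in SUPPORTED_ARCHITECTURES:
        raise NotImplementedError(f'Architecture "{arch}" not supported!')
    return SUPPORTED_ARCHITECTURES[arch]
```

This is consistent with Yefori-Go's suggestion: pulling the latest llama.cpp extends the table and the conversion proceeds.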

@rombodawg
Author

@JohannesGaessler I totally understand that; hopefully openai is willing to reach out and work with you to fix this.

@rombodawg
Author

rombodawg commented Feb 26, 2024

I'm uploading the model files for the merges if anyone wants to do some debugging. They should be up in the next 10 hours or so; sorry, slow internet.

Follow the linked threads, and check out my model for debugging.

Thread links:
lmstudio-ai/configs#21
#5706
arcee-ai/mergekit#181
oobabooga/text-generation-webui#5562

https://huggingface.co/rombodawg/Gemme-Merge-Test-7b

@hiepxanh

hiepxanh commented Mar 15, 2024

@rombodawg did you try the latest version?
#6051

It is already supported.

@github-actions github-actions bot added the stale label Apr 15, 2024
Contributor

This issue was closed because it has been inactive for 14 days since being marked as stale.

@DementedWeasel1971

I think that, like me, people are considering other options. I will, however, keep watching the release notes to see when this is fixed.
