convert : refactor rope_freqs generation #9396
Conversation
This should also fix vocab-only conversion for Phi-3.
LGTM. Thanks for the implementation!
Btw, you can use …
Related to issue Support for Phi-3 models #6849
Should we merge this, or wait for the rest of the tests in OP to be confirmed?
I ran the test locally and can confirm that it passes. Let's wait for final confirmation from @compilade to merge this.
Since #9322 was merged, MiniCPM3's conversion also has to be updated before merging this. I'll update it today.
MiniCPM3's tokenizer is treated as a SentencePiece tokenizer to avoid having to run its custom Python code which mixes tokenization in the same file as tool calls.
gguf-py : add long and short RoPE factors to tensor mappings
Empty, but the key names are used to populate the mappings.
@compilade Hey bro, when I try to convert MiniCPM3 to …
Will it be implemented by this PR? Thanks!
Yes, actually, in e83d270 I've changed how MiniCPM3's tokenizer is loaded to exactly avoid that custom code. It uses SentencePiece directly instead. I think it results in the same model files, but I didn't test that yet because I can't really run the custom tokenization code since it depends on … That was a single-line change in …
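(For reference, a minimal, self-contained sketch of what "using SentencePiece directly" amounts to. This is an illustration of reading the vocab from `tokenizer.model` with the sentencepiece library, not the converter's exact code; the file path is hypothetical.)

```python
# Illustration only: read the vocab straight from tokenizer.model with the
# sentencepiece library, instead of running the repo's custom tokenizer code.
from sentencepiece import SentencePieceProcessor

sp = SentencePieceProcessor()
sp.LoadFromFile("MiniCPM3-4B/tokenizer.model")  # hypothetical local path

# piece/score pairs per token id, roughly what a GGUF vocab is built from
tokens = [sp.IdToPiece(i) for i in range(sp.vocab_size())]
scores = [sp.GetScore(i) for i in range(sp.vocab_size())]
```

In convert_hf_to_gguf.py the equivalent is routing the model's vocab setup through the converter's generic SentencePiece path; the specific helper involved is not shown here.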
Got that, as we have been turning to NixOS (especially for production use) these days. Hope everything goes well! ❤️
* convert : refactor rope_freqs generation
  This should also fix vocab-only conversion for Phi-3.
* convert : adapt MiniCPM3 to separate rope_freqs insertion
  MiniCPM3's tokenizer is treated as a SentencePiece tokenizer to avoid having to run its custom Python code which mixes tokenization in the same file as tool calls.
* gguf-py : add long and short RoPE factors to tensor mappings
  Empty, but the key names are used to populate the mappings.
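To illustrate what "empty, but the key names are used to populate the mappings" means in gguf-py's tensor-name mapping. The exact enum members and dict layout below are assumptions modeled on gguf-py's TensorNameMap style, not quoted from the PR:

```python
# Hypothetical excerpt in the style of gguf-py's tensor mapping table:
# the RoPE-factor entries carry no Hugging Face source-name aliases (empty
# tuples), but registering the keys makes the GGUF-side names valid targets
# for tensors that the converter generates itself.
from gguf.constants import MODEL_TENSOR  # assumed import path

mappings_cfg: dict[MODEL_TENSOR, tuple[str, ...]] = {
    MODEL_TENSOR.ROPE_FACTORS_LONG: (),   # generated, e.g. rope_factors_long.weight
    MODEL_TENSOR.ROPE_FACTORS_SHORT: (),  # generated, e.g. rope_factors_short.weight
    # ... regular tensors map from one or more source tensor names ...
}
```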
Follow-up from #9117 (comment).
This isolates handling of generated tensors like `rope_freqs` for Llama3, Phi-3 and Exaone. This should also fix `--vocab-only` conversion for Phi-3-128k and Phi-3.5 (which previously generated invalid GGUF files because they included a non-zero tensor count while not including any tensor data).
Note that this will also be relevant for MiniCPM3 (#9322), which re-uses the misbehaving Phi-3 rope tensors insertion.
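A rough sketch of the idea behind the refactor, under assumptions about the converter's structure (class and method names such as generate_extra_tensors() are illustrative, not necessarily the PR's exact API): generated tensors like `rope_freqs` come from a dedicated per-model hook that only runs when tensor data is actually written, so a `--vocab-only` conversion no longer declares tensors it never emits.

```python
# Illustrative sketch only; names and structure are assumptions, not the PR's code.
from typing import Iterable, Tuple
import numpy as np

class Model:
    def __init__(self, vocab_only: bool):
        self.vocab_only = vocab_only
        self.tensors: list[Tuple[str, np.ndarray]] = []

    def generate_extra_tensors(self) -> Iterable[Tuple[str, np.ndarray]]:
        # Most models generate nothing beyond what the checkpoint provides.
        return ()

    def prepare_tensors(self) -> None:
        # Generated tensors go through the same collection path as regular ones.
        for name, data in self.generate_extra_tensors():
            self.tensors.append((name, data))

    def write(self) -> None:
        if not self.vocab_only:
            # Only a full conversion prepares tensors, so the tensor count in
            # the header always matches the tensor data actually written.
            self.prepare_tensors()
        # ... write metadata, vocab, and the (possibly empty) tensor list ...

class Phi3Model(Model):
    def generate_extra_tensors(self) -> Iterable[Tuple[str, np.ndarray]]:
        # RoPE scaling factors derived from the config, yielded like any tensor.
        yield ("rope_factors_long.weight", np.ones(16, dtype=np.float32))   # placeholder values
        yield ("rope_factors_short.weight", np.ones(16, dtype=np.float32))  # placeholder values
```

The point is that vocab-only output and tensor generation no longer interleave, which is what previously let Phi-3's `--vocab-only` files claim tensors without writing any data.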
TODO
- [ ] Test vocab-only conversion for Phi-3 (with `llama-tokenize` too)
- [ ] Check that a full conversion still contains `rope_freqs.weight`, and has the same checksum as a previous conversion I did a while ago with the same checkout of the upstream model
- [ ] Test MiniCPM3 conversion (with `llama-tokenize`)
- [ ] Run `tests/test-lora-conversion-inference.sh` (none of the tested models use `rope_freqs`, though)
- [ ] Merge `master`