
fix: llama3.1 rope_freqs not respecting custom head_dim #9141

Merged: 2 commits into ggml-org:master on Aug 27, 2024

Conversation

@nyxkrage (Contributor) commented Aug 23, 2024

The Llama 3.1 pruned Minitron models use the llama3 rope_scaling and have a custom head_dim specified in the config.

This changes the conversion code to use the custom head_dim for llama models when it is specified, falling back to the old calculation otherwise.

n_rot seems to be present regardless of the model; both regular Llama 3 and Llama 3.1 still work with this change.
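For reference, here is a minimal sketch of the fallback this PR describes, in the spirit of the Python conversion script (the function names and standalone structure are illustrative, not the exact code in convert_hf_to_gguf.py):

```python
import torch

def rope_dim(hparams: dict) -> int:
    # Prefer an explicit head_dim from the model's config.json when present;
    # otherwise fall back to the old hidden_size // num_attention_heads calculation.
    return hparams.get(
        "head_dim",
        hparams["hidden_size"] // hparams["num_attention_heads"],
    )

def base_rope_freqs(hparams: dict) -> torch.Tensor:
    # Standard RoPE inverse frequencies over the (possibly custom) head dimension.
    # For llama3-style rope_scaling, the long-context smoothing factors are then
    # applied on top of these before the rope_freqs tensor is written out.
    base = hparams.get("rope_theta", 10000.0)
    dim = rope_dim(hparams)
    return 1.0 / (base ** (torch.arange(0, dim, 2, dtype=torch.float32) / dim))
```

For a standard Llama 3 config the quotient is unchanged, while a Minitron-style config with an explicit head_dim (where hidden_size // num_attention_heads no longer equals the actual head size after width pruning) now gets the correct rotation dimension.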

This fixes #9060

The github-actions bot added the python (python script changes) label on Aug 23, 2024.

@kalomaze (Contributor)

Any ETA on merging this? It seems fully functional.

@drummerv

PTAL @ggerganov

@ggerganov merged commit 75e1dbb into ggml-org:master on Aug 27, 2024 (51 of 54 checks passed).
@drummerv

ily @ggerganov

@drummerv

@LostRuins CHOP CHOP

@LostRuins (Collaborator)

Confirmed working fine for me, tested with a fresh quant.

Note that all Llama-4b-minitron GGUF models converted prior to this PR should be reconverted.

dsx1986 pushed a commit to dsx1986/llama.cpp that referenced this pull request on Oct 29, 2024:

fix: llama3.1 rope_freqs not respecting custom head_dim (ggml-org#9141)

* fix: llama3.1 rope_freqs not respecting custom head_dim
* fix: use potential head_dim for Exaone

arthw pushed the same commit to arthw/llama.cpp on Nov 15 and Nov 18, 2024, and Nexesenex pushed it to Nexesenex/croco.cpp on Feb 25, 2025.
Labels: python (python script changes)

Successfully merging this pull request may close these issues:

* Feature Request: support for nvidia/Llama-3.1-Minitron-4B-Width-Base

5 participants