
fix: llama3.1 rope_freqs not respecting custom head_dim #9141

Merged: 2 commits into ggml-org:master on Aug 27, 2024

Conversation

@nyxkrage (Contributor) commented Aug 23, 2024

The Llama 3.1 pruned Minitron models use the llama3 rope_scaling and have a custom head_dim specified in the config.

This changes the conversion code to use the custom head_dim for llama models when it is specified, falling back to the old calculation otherwise.

n_rot seems to be present regardless of the model; both regular Llama 3 and Llama 3.1 still work with this change.
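For reference, here is a minimal sketch of the fallback this PR describes, in the spirit of the Python conversion script (the function names and standalone structure are illustrative, not the exact code in convert_hf_to_gguf.py):

```python
import torch

def rope_dim(hparams: dict) -> int:
    # Prefer an explicit head_dim from the model's config.json when present;
    # otherwise fall back to the old hidden_size // num_attention_heads calculation.
    return hparams.get(
        "head_dim",
        hparams["hidden_size"] // hparams["num_attention_heads"],
    )

def base_rope_freqs(hparams: dict) -> torch.Tensor:
    # Standard RoPE inverse frequencies over the (possibly custom) head dimension.
    # For llama3-style rope_scaling, the long-context smoothing factors are then
    # applied on top of these before the rope_freqs tensor is written out.
    base = hparams.get("rope_theta", 10000.0)
    dim = rope_dim(hparams)
    return 1.0 / (base ** (torch.arange(0, dim, 2, dtype=torch.float32) / dim))
```

For a standard Llama 3 config the quotient is unchanged, while a Minitron-style config with an explicit head_dim (where hidden_size // num_attention_heads no longer equals the actual head size after width pruning) now gets the correct rotation dimension.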

This fixes #9060

The github-actions bot added the python (python script changes) label on Aug 23, 2024.

@kalomaze (Contributor)

Any ETA on merging this? It seems fully functional.

@drummerv

PTAL @ggerganov

@ggerganov merged commit 75e1dbb into ggml-org:master on Aug 27, 2024 (51 of 54 checks passed).
@drummerv

ily @ggerganov

@drummerv

@LostRuins CHOP CHOP

@LostRuins (Collaborator)

Confirmed working fine for me, tested with a fresh quant.

Note that all Llama-4b-minitron GGUF models converted prior to this PR should be reconverted.

dsx1986 pushed a commit to dsx1986/llama.cpp that referenced this pull request on Oct 29, 2024:

fix: llama3.1 rope_freqs not respecting custom head_dim (ggml-org#9141)

* fix: llama3.1 rope_freqs not respecting custom head_dim
* fix: use potential head_dim for Exaone

arthw pushed the same commit to arthw/llama.cpp on Nov 15 and Nov 18, 2024, and Nexesenex pushed it to Nexesenex/croco.cpp on Feb 25, 2025.
Labels: python (python script changes)

Successfully merging this pull request may close these issues:

* Feature Request: support for nvidia/Llama-3.1-Minitron-4B-Width-Base

5 participants