Fixed n_vocab #9511

Merged 4 commits into ggerganov:master from Xarbirus:fixed_n_vocab on Sep 17, 2024
Conversation

Xarbirus (Contributor)

For no_vocab models, the current llm_load_vocab function fills n_vocab incorrectly, which is why a

llama_decode_internal: invalid token[0] = ...

error occurs during inference.
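The failure mode is a token-range check against a wrongly filled n_vocab. A minimal sketch of that kind of check, assuming the validation llama_decode_internal performs (the struct and function names here are illustrative, not the actual llama.cpp code):

```cpp
// Illustrative sketch only: names do not match the actual llama.cpp code.
#include <cstdint>
#include <cstdio>

struct vocab_sketch {
    int32_t n_vocab = 0; // wrongly filled (e.g. left at 0) for no_vocab models
};

// Decoding validates every input token against [0, n_vocab); with a bogus
// n_vocab even the first token fails, yielding the reported error.
bool validate_token(const vocab_sketch & vocab, int32_t tok) {
    if (tok < 0 || tok >= vocab.n_vocab) {
        fprintf(stderr, "invalid token[0] = %d\n", tok); // index hardcoded for the sketch
        return false;
    }
    return true;
}

int main() {
    vocab_sketch vocab;       // n_vocab == 0, as for a mis-loaded no_vocab model
    validate_token(vocab, 1); // prints: invalid token[0] = 1
}
```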

ggerganov (Owner) left a comment:

Btw, we currently have state duplication in the sense that hparams.n_vocab and vocab.n_vocab are the same thing but initialized independently in different ways. The reason to have the latter is because some parts of the implementation are decoupled from the model/hparams.

Just noting this here for awareness - no action is necessary for this PR. Will be resolved in future refactorings.
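For readers outside the codebase, the duplication looks roughly like this (structs heavily simplified; only the two n_vocab field names come from the comment above):

```cpp
// Simplified sketch of the state duplication; only the n_vocab field
// names come from the comment above.
#include <cstdint>

struct hparams_sketch {
    int32_t n_vocab; // initialized from the model hyperparameters at load time
};

struct vocab_sketch {
    int32_t n_vocab; // initialized independently while the vocab is loaded
};

// Parts of the implementation that are decoupled from the model/hparams
// read vocab.n_vocab, so the loader must keep the two values in sync.
```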

ggerganov merged commit 8344ef5 into ggerganov:master on Sep 17, 2024 (50 of 51 checks passed).
Xarbirus deleted the fixed_n_vocab branch on September 23, 2024.
dsx1986 pushed a commit to dsx1986/llama.cpp that referenced this pull request Oct 29, 2024
* llama: fixed n_vocab for `no_vocab` models

* llama: updated error output for `llama_decode_internal` and `llama_encode_internal`

* llama: log warning if there's no vocab_size in metadata

* llama: correct vocab size for logging

Co-authored-by: Georgi Gerganov <[email protected]>

---------

Co-authored-by: Georgi Gerganov <[email protected]>
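As a rough illustration of the fallback these commits describe (the metadata accessor, key name, and fallback value are assumptions, not the actual llama.cpp API):

```cpp
// Illustrative sketch: the accessor, key name, and fallback are assumptions.
#include <cstdint>
#include <cstdio>
#include <optional>

// Stand-in for reading an integer key such as "<arch>.vocab_size" from
// GGUF metadata; returns std::nullopt when the key is absent.
std::optional<int32_t> read_vocab_size_from_metadata() {
    return std::nullopt;
}

int32_t resolve_n_vocab_for_no_vocab_model() {
    if (auto n = read_vocab_size_from_metadata()) {
        return *n; // metadata value wins for models without a token table
    }
    fprintf(stderr, "warning: there's no vocab_size in metadata\n");
    return 0; // fallback chosen for illustration only
}
```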
arthw pushed a commit to arthw/llama.cpp that referenced this pull request Nov 15, 2024
arthw pushed a commit to arthw/llama.cpp that referenced this pull request Nov 18, 2024