WIP for adding support for Tekken tokenizer needed for Mistral NeMo #8578
This attempts to add support for Mistral NeMo (#8577), but I've never added support for a new model before, so it's very much a WIP. I need to take a break for a while, so I'm uploading my notes here in case they're useful for anyone else.
While the model architecture may be a drop-in replacement for Mistral 7B, the tokenizer is not (yet) in our list of supported BPE tokenizers. Attempting to quantize Mistral-NeMo via GGUF-my-repo results in:
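For context on why an unsupported pre-tokenizer fails here: `convert_hf_to_gguf.py` identifies a BPE pre-tokenizer by encoding a fixed test string and hashing the resulting token IDs; an unrecognized hash means the tokenizer hasn't been registered. The sketch below shows just the fingerprinting step, with placeholder token IDs standing in for the real Tekken tokenizer's output:

```python
import hashlib

# Placeholder token IDs; in the real script these come from
# tokenizer.encode() over a fixed multilingual test string.
chktok = [1034, 2287, 9918]

# The converter fingerprints the tokenizer by hashing the string form
# of the encoded test text; an unrecognized hash is what triggers the
# "pre-tokenizer was not recognized" failure during conversion.
chkhsh = hashlib.sha256(str(chktok).encode()).hexdigest()
print(chkhsh)
```

Registering the new tokenizer then amounts to mapping the hash produced by the real Tekken tokenizer to a new pre-tokenizer name in the converter and handling that name on the C++ side.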
I have not yet expanded the tests to include the new tokenizer.
I haven't determined whether any other settings or options need to be set for this tokenizer.
I haven't looked into the regex used by `llm_tokenizer_bpe` to see whether it needs to be changed from the default or not. In short, this is drastically untested, and I would have liked to get it further along before uploading a WIP.