When running convert.py or convert-hf-to-gguf.py to convert Qwen1.5 models, errors related to the BPE tokenizer occur. Even after adding the model to convert-hf-to-gguf-update.py, the same error persists.
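For context on the log that follows: since #6920, the converter fingerprints the BPE pre-tokenizer by encoding a long, fixed test string and hashing the resulting token IDs; the `chktok` and `chkhsh` lines below are that fingerprint. Here is a minimal sketch of the check, assuming the SHA-256-over-token-IDs scheme from that PR (the test string itself is elided):

```python
from hashlib import sha256
from transformers import AutoTokenizer

# the long, fixed test string from convert-hf-to-gguf-update.py (elided here);
# it exercises newlines, emoji, digits, and contractions to fingerprint the pre-tokenizer
chktxt = "..."

tokenizer = AutoTokenizer.from_pretrained("/content/qwenm11")  # the model dir being converted
chktok = tokenizer.encode(chktxt)                              # printed as "chktok:" below
chkhsh = sha256(str(chktok).encode()).hexdigest()              # printed as "chkhsh:" below
print(f"chktok: {chktok}")
print(f"chkhsh: {chkhsh}")
```

If `chkhsh` does not match any hash already recorded in `get_vocab_base_pre()`, the warning below is printed and the conversion aborts.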
Loading model: qwenm11
gguf: This GGUF file is for Little Endian only
Set model parameters
gguf: context length = 32768
gguf: embedding length = 1024
gguf: feed forward length = 2816
gguf: head count = 16
gguf: key-value head count = 16
gguf: rope theta = 1000000.0
gguf: rms norm epsilon = 1e-06
gguf: file type = 1
Set model tokenizer
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
chktok: [198, 4710, 14731, 65497, 7847, 1572, 2303, 78672, 10947, 145836, 320, 8252, 8, 26525, 114, 378, 235, 149921, 30543, 320, 35673, 99066, 97534, 8, 25521, 227, 11162, 99, 247, 149955, 220, 18, 220, 18, 18, 220, 18, 18, 18, 220, 18, 18, 18, 18, 220, 18, 18, 18, 18, 18, 220, 18, 18, 18, 18, 18, 18, 220, 18, 18, 18, 18, 18, 18, 18, 220, 18, 18, 18, 18, 18, 18, 18, 18, 220, 18, 13, 18, 220, 18, 496, 18, 220, 18, 1112, 18, 220, 146394, 97529, 241, 44258, 233, 146568, 44258, 224, 147603, 20879, 115, 146280, 44258, 223, 146280, 147272, 97529, 227, 144534, 937, 104100, 18493, 22377, 99257, 16, 18, 16, 19, 16, 20, 16, 35727, 21216, 55460, 53237, 18658, 14144, 1456, 13073, 63471, 33594, 3038, 133178, 79012, 3355, 4605, 4605, 13874, 13874, 73594, 3014, 3014, 28149, 17085, 2928, 26610, 7646, 358, 3003, 1012, 364, 83, 813, 566, 594, 1052, 11, 364, 787, 498, 2704, 30, 364, 44, 537, 2704, 358, 3278, 1281, 432, 11, 364, 35, 498, 1075, 1045, 15243, 30, 1205, 6, 42612, 264, 63866, 43]
chkhsh: e636dc30a262dcc0d8c323492e32ae2b70728f4df7dfe9737d9f920a282b8aea
**************************************************************************************
** WARNING: The BPE pre-tokenizer was not recognized!
** There are 2 possible reasons for this:
** - the model has not been added to convert-hf-to-gguf-update.py yet
** - the pre-tokenization config has changed upstream
** Check your model files and convert-hf-to-gguf-update.py and update them accordingly.
** ref: https://github.com/ggerganov/llama.cpp/pull/6920
**
** chkhsh: e636dc30a262dcc0d8c323492e32ae2b70728f4df7dfe9737d9f920a282b8aea
**************************************************************************************
Traceback (most recent call last):
  File "/content/llama.cpp/convert-hf-to-gguf.py", line 1886, in set_vocab
    self._set_vocab_sentencepiece()
  File "/content/llama.cpp/convert-hf-to-gguf.py", line 404, in _set_vocab_sentencepiece
    raise FileNotFoundError(f"File not found: {tokenizer_path}")
FileNotFoundError: File not found: /content/qwenm11/tokenizer.model

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/content/llama.cpp/convert-hf-to-gguf.py", line 3001, in <module>
    main()
  File "/content/llama.cpp/convert-hf-to-gguf.py", line 2988, in main
    model_instance.set_vocab()
  File "/content/llama.cpp/convert-hf-to-gguf.py", line 1888, in set_vocab
    self._set_vocab_gpt2()
  File "/content/llama.cpp/convert-hf-to-gguf.py", line 331, in _set_vocab_gpt2
    tokens, toktypes, tokpre = self.get_vocab_base()
  File "/content/llama.cpp/convert-hf-to-gguf.py", line 242, in get_vocab_base
    tokpre = self.get_vocab_base_pre(tokenizer)
  File "/content/llama.cpp/convert-hf-to-gguf.py", line 323, in get_vocab_base_pre
    raise NotImplementedError("BPE pre-tokenizer was not recognized - update get_vocab_base_pre()")
NotImplementedError: BPE pre-tokenizer was not recognized - update get_vocab_base_pre()
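The traceback shows a two-stage fallback: `set_vocab()` first tries the SentencePiece path, and only after `tokenizer.model` turns up missing does it fall back to `_set_vocab_gpt2()`, where the unrecognized hash raises. A sketch of that control flow, assuming the structure of `convert-hf-to-gguf.py` around the quoted line numbers; the mapping of this particular hash to `qwen2` is an assumption based on the model family:

```python
from hashlib import sha256

chktxt = "..."  # the fixed test string from convert-hf-to-gguf-update.py (elided)

class Model:
    def set_vocab(self):
        # Qwen1.5 ships a BPE tokenizer, so the SentencePiece path fails first
        try:
            self._set_vocab_sentencepiece()  # needs tokenizer.model -> FileNotFoundError
        except FileNotFoundError:
            self._set_vocab_gpt2()           # BPE path -> get_vocab_base_pre() hash check

    def get_vocab_base_pre(self, tokenizer) -> str:
        chktok = tokenizer.encode(chktxt)
        chkhsh = sha256(str(chktok).encode()).hexdigest()

        res = None
        # branches like this one are (re)generated by convert-hf-to-gguf-update.py;
        # the hash below is the one from the log above, mapped to "qwen2" (assumed)
        if chkhsh == "e636dc30a262dcc0d8c323492e32ae2b70728f4df7dfe9737d9f920a282b8aea":
            res = "qwen2"

        if res is None:
            raise NotImplementedError("BPE pre-tokenizer was not recognized - update get_vocab_base_pre()")
        return res
```

The practical implication: editing convert-hf-to-gguf-update.py alone changes nothing at conversion time, because the `if chkhsh == ...` branches live in convert-hf-to-gguf.py and are only rewritten when the update script is actually run.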
Another BPE model that is going to need some love after the Llama 3 conversion changes. See #7030 for the same problem with Command-R, and note the pending PRs that will (hopefully) resolve it. I think every BPE model is going to end up needing similar treatment.
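For anyone hitting this in the meantime, the intended workflow per #6920 is to register the tokenizer in convert-hf-to-gguf-update.py and re-run the script so it regenerates the hash branches in convert-hf-to-gguf.py. A sketch of the list entry, assuming the `models` table format used by the update script (the repo URL is illustrative):

```python
# in convert-hf-to-gguf-update.py: register the tokenizer so the script can
# download it and emit a matching chkhsh branch in convert-hf-to-gguf.py
models = [
    # ... existing entries ...
    {"name": "qwen2", "tokt": TOKENIZER_TYPE.BPE, "repo": "https://huggingface.co/Qwen/Qwen1.5-7B"},
]
```

Then re-run it with `python3 convert-hf-to-gguf-update.py <huggingface_token>` (it downloads each registered tokenizer from Hugging Face, hence the token). Only after this regeneration step will convert-hf-to-gguf.py recognize the new chkhsh.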