Bug: WARNING: The BPE pre-tokenizer was not recognized! #9927
Labels: bug-unconfirmed, medium severity, stale
What happened?
WARNING:hf-to-gguf:**************************************************************************************
WARNING:hf-to-gguf:** WARNING: The BPE pre-tokenizer was not recognized!
WARNING:hf-to-gguf:** There are 2 possible reasons for this:
WARNING:hf-to-gguf:** - the model has not been added to convert_hf_to_gguf_update.py yet
WARNING:hf-to-gguf:** - the pre-tokenization config has changed upstream
WARNING:hf-to-gguf:** Check your model files and convert_hf_to_gguf_update.py and update them accordingly.
WARNING:hf-to-gguf:** ref: #6920
WARNING:hf-to-gguf:**
WARNING:hf-to-gguf:** chkhsh: 8e62295832751ca1e8f92f2226f403dea30dc5165e448b5bfa05af5340c64ec7
WARNING:hf-to-gguf:**************************************************************************************
WARNING:hf-to-gguf:
Traceback (most recent call last):
  File "/root/llama.cpp-master/convert_hf_to_gguf.py", line 4430, in <module>
    main()
  File "/root/llama.cpp-master/convert_hf_to_gguf.py", line 4424, in main
    model_instance.write()
  File "/root/llama.cpp-master/convert_hf_to_gguf.py", line 434, in write
    self.prepare_metadata(vocab_only=False)
  File "/root/llama.cpp-master/convert_hf_to_gguf.py", line 427, in prepare_metadata
    self.set_vocab()
  File "/root/llama.cpp-master/convert_hf_to_gguf.py", line 2554, in set_vocab
    tokens, toktypes, tokpre = self.get_vocab_base()
  File "/root/llama.cpp-master/convert_hf_to_gguf.py", line 515, in get_vocab_base
    tokpre = self.get_vocab_base_pre(tokenizer)
  File "/root/llama.cpp-master/convert_hf_to_gguf.py", line 671, in get_vocab_base_pre
    raise NotImplementedError("BPE pre-tokenizer was not recognized - update get_vocab_base_pre()")
NotImplementedError: BPE pre-tokenizer was not recognized - update get_vocab_base_pre()
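For anyone reading the log above, here is a rough sketch of the kind of check that seems to produce this warning. It assumes the converter tokenizes a fixed probe text, hashes the resulting token IDs with SHA-256 (the 64 hex digit chkhsh in the log is consistent with that), and looks the hash up in a table of known pre-tokenizers. The names `check_hash`, `lookup_pre_tokenizer`, and `known` are hypothetical and only for illustration; this is not the actual code in convert_hf_to_gguf.py.

```python
import hashlib


def check_hash(token_ids: list[int]) -> str:
    """Return a SHA-256 hex digest of the token IDs (hypothetical helper).

    Assumption: the hash is taken over the string form of the ID list,
    so two tokenizers that split the probe text differently get
    different hashes.
    """
    return hashlib.sha256(str(token_ids).encode()).hexdigest()


def lookup_pre_tokenizer(token_ids: list[int], known: dict[str, str]) -> str:
    """Map a check hash to a pre-tokenizer name, or fail like the log above.

    `known` is a hypothetical table of {check hash: pre-tokenizer name}.
    """
    chkhsh = check_hash(token_ids)
    name = known.get(chkhsh)
    if name is None:
        # Mirrors the failure mode in the traceback: an unknown hash
        # is treated as an unsupported pre-tokenizer.
        raise NotImplementedError(
            f"pre-tokenizer not recognized (chkhsh: {chkhsh})"
        )
    return name
```

Under these assumptions, the error means the hash computed for this model's tokenizer is simply missing from the table, which matches the two reasons listed in the warning: either the model was never registered, or its tokenizer config changed upstream and now produces different token IDs.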
Name and Version
python convert_hf_to_gguf.py /data/model/BAAI/bge-large-zh-v1.5/ --outfile text2vec-base-chinese.gguf --model-name bert-bge
What operating system are you seeing the problem on?
No response
Relevant log output
No response