Bug: BLOOM pre-tokenizer is missing #8741
Labels
bug-unconfirmed
medium severity
What happened?
I tried to convert a BLOOM-based model (https://huggingface.co/TurkuNLP/gpt3-finnish-large) to GGUF. First, I had to change the model's architecture to `BloomForCausalLM`; with that change, the conversion script failed with a pre-tokenizer error.

I also tried to convert one of the original BLOOM models (560m) and got the same error, but with a different hash. It seems that BLOOM's pre-tokenizer was not added when the pre-tokenizers were dealt with in #6920. Since BLOOM is listed as a supported model in the README, converting it should work.
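For context, a minimal sketch of the hash-based check that produces this kind of error: the conversion script fingerprints the tokenizer (roughly, by hashing the token IDs it produces for a fixed test string) and looks the hash up in a table of known pre-tokenizers; an unregistered hash raises an error. The token IDs, hash table entries, and function names below are illustrative placeholders, not the actual values from `convert_hf_to_gguf.py`:

```python
import hashlib

def chkhsh_of(token_ids: list[int]) -> str:
    # Fingerprint the tokenizer's behaviour by hashing the token IDs it
    # produced for a fixed test string (simplified from the real script).
    return hashlib.sha256(str(token_ids).encode()).hexdigest()

# Known fingerprints map to pre-tokenizer names. The entry below is a
# placeholder for illustration; the real table is built from many models.
KNOWN_PRE_TOKENIZERS = {
    chkhsh_of([1, 2, 3]): "llama-bpe",
}

def resolve_pre_tokenizer(token_ids: list[int]) -> str:
    # An unregistered hash raises, which is what happens for BLOOM here,
    # since its fingerprint was never added to the table.
    h = chkhsh_of(token_ids)
    if h not in KNOWN_PRE_TOKENIZERS:
        raise NotImplementedError(f"pre-tokenizer was not recognized - hash: {h}")
    return KNOWN_PRE_TOKENIZERS[h]
```

Under this reading, different BLOOM variants tokenize the test string differently, which would explain seeing the same failure with a different hash for the 560m model.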
Name and Version
$ ./bin/llama-cli --version
version: 3481 (5e2727f)
built with cc (GCC) 14.1.1 20240720 for x86_64-pc-linux-gnu
What operating system are you seeing the problem on?
Linux
Relevant log output
No response