
Bug: Phi-2 model tokenizer not recognized #7667

Closed
saeid93 opened this issue May 31, 2024 · 5 comments
Labels
bug-unconfirmed, medium severity, stale

Comments


saeid93 commented May 31, 2024

What happened?

Despite phi-2 being listed here as a supported model, converting it with the following command:

python llama.cpp/convert-hf-to-gguf.py saved_model/ --outfile phi-2.gguf --outtype f16

fails with the following error:

INFO:hf-to-gguf:Loading model: saved_model
INFO:gguf.gguf_writer:gguf: This GGUF file is for Little Endian only
INFO:hf-to-gguf:Set model parameters
INFO:hf-to-gguf:Set model tokenizer
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
WARNING:hf-to-gguf:

WARNING:hf-to-gguf:**************************************************************************************
WARNING:hf-to-gguf:** WARNING: The BPE pre-tokenizer was not recognized!
WARNING:hf-to-gguf:**          There are 2 possible reasons for this:
WARNING:hf-to-gguf:**          - the model has not been added to convert-hf-to-gguf-update.py yet
WARNING:hf-to-gguf:**          - the pre-tokenization config has changed upstream
WARNING:hf-to-gguf:**          Check your model files and convert-hf-to-gguf-update.py and update them accordingly.
WARNING:hf-to-gguf:** ref:     https://github.com/ggerganov/llama.cpp/pull/6920
WARNING:hf-to-gguf:**
WARNING:hf-to-gguf:** chkhsh:  fcace8b9cac38ce847670c970cd5892031a753a1ef381abd1d9af00f713da085
WARNING:hf-to-gguf:**************************************************************************************
WARNING:hf-to-gguf:

Traceback (most recent call last):
  File "/home/cc/polymorph/lab6/llama.cpp/convert-hf-to-gguf.py", line 2856, in <module>
    main()
  File "/home/cc/polymorph/lab6/llama.cpp/convert-hf-to-gguf.py", line 2841, in main
    model_instance.set_vocab()
  File "/home/cc/polymorph/lab6/llama.cpp/convert-hf-to-gguf.py", line 116, in set_vocab
    self._set_vocab_gpt2()
  File "/home/cc/polymorph/lab6/llama.cpp/convert-hf-to-gguf.py", line 502, in _set_vocab_gpt2
    tokens, toktypes, tokpre = self.get_vocab_base()
  File "/home/cc/polymorph/lab6/llama.cpp/convert-hf-to-gguf.py", line 381, in get_vocab_base
    tokpre = self.get_vocab_base_pre(tokenizer)
  File "/home/cc/polymorph/lab6/llama.cpp/convert-hf-to-gguf.py", line 493, in get_vocab_base_pre
    raise NotImplementedError("BPE pre-tokenizer was not recognized - update get_vocab_base_pre()")
NotImplementedError: BPE pre-tokenizer was not recognized - update get_vocab_base_pre()

For reference, this is the code used to save the phi-2 model:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

torch.set_default_device("cuda")

# Download phi-2 and its tokenizer from the Hugging Face Hub
model = AutoModelForCausalLM.from_pretrained("microsoft/phi-2", torch_dtype="auto", trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained("microsoft/phi-2", trust_remote_code=True)

# Save both to a local directory so convert-hf-to-gguf.py can read them
model.save_pretrained("./saved_model")
tokenizer.save_pretrained("./saved_model")

Name and Version

./main --version
version: 3024 (2b737ca)
built with cc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0 for x86_64-linux-gnu

What operating system are you seeing the problem on?

Linux

Relevant log output

(Same log and traceback as shown above.)
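
For context on where that chkhsh value comes from: convert-hf-to-gguf.py fingerprints a model's pre-tokenizer by encoding a fixed test string and hashing the resulting token IDs, then looks the hash up in a table of known pre-tokenizers inside get_vocab_base_pre(). A minimal sketch of the fingerprinting step (the test string below is a short stand-in for the script's much longer chktxt):

from hashlib import sha256
from transformers import AutoTokenizer

# Short stand-in for the chktxt stress-test string defined in convert-hf-to-gguf.py.
chktxt = "Hello World! 12345 ..."

tokenizer = AutoTokenizer.from_pretrained("saved_model", trust_remote_code=True)

# Hash the token IDs produced for the test string; get_vocab_base_pre() compares
# this digest against its table of known pre-tokenizer hashes.
chktok = tokenizer.encode(chktxt)
chkhsh = sha256(str(chktok).encode()).hexdigest()
print(chkhsh)

An unrecognized digest means the script cannot tell which pre-tokenizer to record in the GGUF file, so it aborts rather than risk silently wrong tokenization.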
saeid93 added the bug-unconfirmed and medium severity labels on May 31, 2024
nicolasperez19 (Contributor) commented

Can you try saving the phi-2 model without saving the phi-2 tokenizer, like this:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

torch.set_default_device("cuda")

model = AutoModelForCausalLM.from_pretrained("microsoft/phi-2", torch_dtype="auto", trust_remote_code=True)

model.save_pretrained("./saved_model")

Then try converting it as you did before.

saeid93 (Author) commented Jun 3, 2024

@nicolasperez19 Thank you for your response, but how is the model supposed to work without a tokenizer?
I tried the code you suggested, and llama.cpp still expects the tokenizer to be there:

INFO:hf-to-gguf:Loading model: saved_model
INFO:gguf.gguf_writer:gguf: This GGUF file is for Little Endian only
INFO:hf-to-gguf:Set model parameters
INFO:hf-to-gguf:Set model tokenizer
Traceback (most recent call last):
  File "/home/cc/polymorph/edge-setup/llama.cpp/convert-hf-to-gguf.py", line 2856, in <module>
    main()
  File "/home/cc/polymorph/edge-setup/llama.cpp/convert-hf-to-gguf.py", line 2841, in main
    model_instance.set_vocab()
  File "/home/cc/polymorph/edge-setup/llama.cpp/convert-hf-to-gguf.py", line 116, in set_vocab
    self._set_vocab_gpt2()
  File "/home/cc/polymorph/edge-setup/llama.cpp/convert-hf-to-gguf.py", line 502, in _set_vocab_gpt2
    tokens, toktypes, tokpre = self.get_vocab_base()
  File "/home/cc/polymorph/edge-setup/llama.cpp/convert-hf-to-gguf.py", line 377, in get_vocab_base
    tokenizer = AutoTokenizer.from_pretrained(self.dir_model)
  File "/home/cc/miniconda3/envs/peftenv/lib/python3.10/site-packages/transformers/models/auto/tokenization_auto.py", line 855, in from_pretrained
    return tokenizer_class_fast.from_pretrained(pretrained_model_name_or_path, *inputs, **kwargs)
  File "/home/cc/miniconda3/envs/peftenv/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 2070, in from_pretrained
    raise EnvironmentError(
OSError: Can't load tokenizer for 'saved_model'. If you were trying to load it from 'https://huggingface.co/models', make sure you don't have a local directory with the same name. Otherwise, make sure 'saved_model' is the correct path to a directory containing all relevant files for a CodeGenTokenizerFast tokenizer.
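
The traceback above shows that the conversion script loads the tokenizer straight from the model directory (AutoTokenizer.from_pretrained(self.dir_model)), so the tokenizer files must be saved alongside the weights. A quick sanity check, as a minimal sketch (the file list is typical for a GPT-2/CodeGen-style fast BPE tokenizer and is an assumption that may vary by transformers version):

import os

# Files tokenizer.save_pretrained() typically writes for a fast BPE tokenizer;
# the exact set is an assumption and may differ by transformers version.
expected = ["tokenizer_config.json", "tokenizer.json", "vocab.json", "merges.txt", "special_tokens_map.json"]
for name in expected:
    path = os.path.join("saved_model", name)
    print(f"{name}: {'found' if os.path.exists(path) else 'MISSING'}")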

The github-actions bot added the stale label on Jul 4, 2024
Slaghton commented

WARNING:hf-to-gguf:**************************************************************************************
WARNING:hf-to-gguf:** WARNING: The BPE pre-tokenizer was not recognized!
WARNING:hf-to-gguf:** There are 2 possible reasons for this:
WARNING:hf-to-gguf:** - the model has not been added to convert-hf-to-gguf-update.py yet
WARNING:hf-to-gguf:** - the pre-tokenization config has changed upstream
WARNING:hf-to-gguf:** Check your model files and convert-hf-to-gguf-update.py and update them accordingly.
WARNING:hf-to-gguf:** ref: #6920
WARNING:hf-to-gguf:**
WARNING:hf-to-gguf:** chkhsh: fcace8b9cac38ce847670c970cd5892031a753a1ef381abd1d9af00f713da085
WARNING:hf-to-gguf:**************************************************************************************
WARNING:hf-to-gguf:
etc.

I just tried to convert a phi-2 model.safetensors file to GGUF and got the same problem. Did you ever figure this out?

The github-actions bot removed the stale label on Jul 13, 2024
RhinoDevel (Contributor) commented

This PR should solve the problem: #8777
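
For context, fixes of this kind usually add the unrecognized chkhsh to the known-hash table in get_vocab_base_pre(). A hypothetical sketch of such an entry follows; the "phi-2" identifier is an illustrative assumption, not necessarily what the linked PR uses:

# Hypothetical sketch of an entry in convert-hf-to-gguf.py's get_vocab_base_pre();
# the "phi-2" identifier is an illustrative assumption.
if chkhsh == "fcace8b9cac38ce847670c970cd5892031a753a1ef381abd1d9af00f713da085":
    # ref: https://huggingface.co/microsoft/phi-2
    res = "phi-2"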

This issue was closed because it has been inactive for 14 days since being marked as stale.
