
Add support for Microsoft Phi-4 model #10817

Open
wants to merge 4 commits into master

Conversation

fairydreaming (Collaborator) commented Dec 13, 2024

This PR adds support for the Microsoft Phi-4 model. Fixes #10814.

The current solution is to:

  • Use the tokenizer_class value from tokenizer_config.json as the condition for using the GPT-2 vocab during model conversion (see the sketch after this list).
  • Store an explicit 0 value for the sliding_window hparam if it's null. This allows the old Phi-3 n_swa validation logic to work without changes: if n_swa is 0, a regular KQ mask is used instead of a sliding-window KQ mask in build_phi3().
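
A minimal sketch of the first point, assuming the usual conventions of convert_hf_to_gguf.py (self.dir_model, _set_vocab_gpt2() and _set_vocab_sentencepiece() all exist in that script; the exact placement inside set_vocab() is illustrative, not the merged code):

    import json

    def set_vocab(self):
        # Phi-4 ships a GPT-2 style BPE tokenizer, while earlier Phi-3
        # checkpoints use SentencePiece; tokenizer_config.json tells the
        # two apart via its tokenizer_class field.
        with open(self.dir_model / "tokenizer_config.json", encoding="utf-8") as f:
            tokenizer_config = json.load(f)
        if tokenizer_config.get("tokenizer_class") == "GPT2Tokenizer":
            return self._set_vocab_gpt2()
        return self._set_vocab_sentencepiece()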

Originally, the model name from general.name ("Phi 4") was used to trigger the Phi-4-specific behavior:

1. Using the GPT-2 vocab during model conversion
2. Ignoring the sliding_window hparam during model conversion
3. Skipping the sliding window length check (n_swa == 0) in build_phi3()
4. Creating a regular KQ mask instead of a sliding-window KQ mask in build_phi3()

Let me know if there is a better way to differentiate Phi-4 from other models based on the PHI3 architecture.

Commits:

  • …4 model
  • llama : use regular (not a sliding window) attention mask for Phi-4 model
github-actions bot added the python (python script changes) label Dec 13, 2024
src/llama.cpp Outdated
@@ -12839,7 +12839,13 @@ struct llm_build_context {
        struct ggml_tensor * inp_pos = build_inp_pos();

        // KQ_mask (mask for 1 head, it will be broadcasted to all heads)
        struct ggml_tensor * KQ_mask_swa = build_inp_KQ_mask_swa();
        struct ggml_tensor * KQ_mask     = nullptr;
        if (model.name == "Phi 4") {
Collaborator

I think a better solution would be to check if hparams.n_swa != 0.

Collaborator Author

I modified my patch to explicitly store a zero sliding_window in case it's null in config.json, and to use the zero value to distinguish Phi-4 from other PHI3-based models.
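
A sketch of what that might look like in the Phi-3 converter's set_gguf_parameters(), assuming find_hparam() and gguf_writer.add_sliding_window() behave as they do elsewhere in the conversion script (only the fallback to 0 is the new part):

        # Phi-4's config.json has "sliding_window": null, which json.load()
        # turns into None. Store an explicit 0 instead of omitting the key,
        # so the loader can treat n_swa == 0 as "no sliding window".
        sliding_window = self.find_hparam(["sliding_window"], optional=True) or 0
        self.gguf_writer.add_sliding_window(sliding_window)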

convert_hf_to_gguf.py Outdated
Comment on lines 2132 to 2133
if self.metadata.name == "Phi 4":
    return self._set_vocab_gpt2()
Collaborator

Alternatively, self._set_vocab_gpt2() could be called when tokenizer.model is missing here, regardless of the model name.
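
A sketch of this alternative (not what was ultimately merged), keyed off the standard SentencePiece artifact name in Hugging Face checkpoints:

        # No tokenizer.model file means there is no SentencePiece vocab to
        # load, so fall back to the GPT-2 style BPE vocab regardless of
        # what general.name says.
        if not (self.dir_model / "tokenizer.model").is_file():
            return self._set_vocab_gpt2()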

Collaborator Author

I modified the solution to check the value of tokenizer_class from tokenizer_config.json and call self._set_vocab_gpt2() if it's GPT2Tokenizer.
