Add support for Microsoft Phi-4 model #10817
base: master
Conversation
llama : use regular (not a sliding window) attention mask for Phi-4 model
src/llama.cpp
Outdated
@@ -12839,7 +12839,13 @@ struct llm_build_context {
        struct ggml_tensor * inp_pos = build_inp_pos();

        // KQ_mask (mask for 1 head, it will be broadcasted to all heads)
        struct ggml_tensor * KQ_mask_swa = build_inp_KQ_mask_swa();
        struct ggml_tensor * KQ_mask = nullptr;
        if (model.name == "Phi 4") {
I think a better solution would be to check if hparams.n_swa != 0.
I modified my patch to explicitly store a zero sliding_window in case it's null in config.json, and to use the zero value to distinguish Phi-4 from other Phi-3-based models.
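The conversion-side idea can be sketched as follows. This is an illustrative standalone helper, not the actual convert_hf_to_gguf.py code (the function name resolve_sliding_window is hypothetical): a null or missing sliding_window in config.json is stored as 0, which later signals "no sliding window" to the loader.

```python
def resolve_sliding_window(config: dict) -> int:
    """Map a null/missing sliding_window to 0.

    Phi-4 configs carry "sliding_window": null, so they resolve to 0;
    Phi-3 configs carry a positive window length, which is kept as-is.
    The zero value is what the C++ side can later test for (n_swa == 0).
    """
    value = config.get("sliding_window")
    return 0 if value is None else int(value)
```

With this mapping the existing n_swa validation logic needs no special-casing on the model name; the distinguishing fact lives in the stored hparam itself.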
convert_hf_to_gguf.py
Outdated
if self.metadata.name == "Phi 4":
    return self._set_vocab_gpt2()
Alternatively, self._set_vocab_gpt2() could be called when tokenizer.model is missing here, regardless of the model name.
I modified the solution to check the value of tokenizer_class from tokenizer_config.json and to call self._set_vocab_gpt2() if it is GPT2Tokenizer.
This PR adds support for the Microsoft Phi-4 model. Fixes #10814.

Current solution is to:
- Explicitly store a zero sliding_window hparam if it's null in config.json. This allows the old Phi-3 n_swa validation logic to work without any changes.
- If n_swa is 0, use a regular KQ mask instead of a sliding window KQ mask in build_phi3().

Originally, the model name value from general.name ("Phi 4") was used to trigger behavior specific to the Phi-4 model:
1. Using the GPT2 vocab during model conversion
2. Ignoring the sliding_window hparam during model conversion
3. Skipping the sliding window length check (n_swa == 0) in build_phi3()
4. Creating a regular KQ mask instead of a sliding window KQ mask in build_phi3()

Let me know if there is any better way to differentiate Phi-4 from other models based on the PHI3 architecture.
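The mask-selection rule the PR converges on can be summarized in a short sketch. The real logic lives in C++ in build_phi3(); this hypothetical Python function only illustrates the decision, with n_swa == 0 meaning "no sliding window" (Phi-4) and any positive value keeping Phi-3's windowed attention:

```python
def select_kq_mask(n_swa: int) -> str:
    """Pick the attention mask type from the stored sliding-window length.

    n_swa == 0 comes from a null sliding_window in config.json (Phi-4),
    so a regular full-attention KQ mask is used; a positive n_swa keeps
    the sliding window KQ mask used by Phi-3 models.
    """
    if n_swa < 0:
        raise ValueError("n_swa must be non-negative")
    return "regular" if n_swa == 0 else "sliding_window"
```

Because the decision depends only on the stored hparam, no string comparison against general.name is needed at graph-build time.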