RWKV v6: Add time_mix_decay_w1/w2 in quant exclusion list (ggerganov#…
…9387)

Signed-off-by: Molly Sophia <[email protected]>
MollySophia authored and arthw committed Nov 15, 2024
1 parent cd2cb5f commit 4d868ce
Showing 2 changed files with 4 additions and 0 deletions.
2 changes: 2 additions & 0 deletions convert_hf_to_gguf.py
@@ -302,6 +302,8 @@ def prepare_tensors(self):
gguf.MODEL_TENSOR.TIME_MIX_FIRST,
gguf.MODEL_TENSOR.TIME_MIX_W1,
gguf.MODEL_TENSOR.TIME_MIX_W2,
+gguf.MODEL_TENSOR.TIME_MIX_DECAY_W1,
+gguf.MODEL_TENSOR.TIME_MIX_DECAY_W2,
)
)
or not new_name.endswith(".weight")
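
The hunk above adds two entries to a tuple of tensor kinds that the converter leaves out of quantization. Below is a minimal standalone sketch of that kind of check. It is not the code in `convert_hf_to_gguf.py`: it assumes plain string names instead of the `gguf.MODEL_TENSOR` enum, the names `EXCLUDED_KINDS` and `is_excluded_from_quant` are hypothetical, and tensor names of the form `blk.0.<kind>.weight` are assumed for illustration only.

```python
# Hypothetical stand-in for the enum-based tuple shown in the diff.
EXCLUDED_KINDS = (
    "time_mix_first",
    "time_mix_w1",
    "time_mix_w2",
    "time_mix_decay_w1",  # added by this commit
    "time_mix_decay_w2",  # added by this commit
)


def is_excluded_from_quant(new_name: str) -> bool:
    """Return True when a tensor should be left unquantized (sketch only)."""
    # Compare against the dotted components of the name, e.g.
    # "blk.0.time_mix_decay_w1.weight" -> ["blk", "0", "time_mix_decay_w1", "weight"].
    parts = new_name.split(".")
    in_list = any(kind in parts for kind in EXCLUDED_KINDS)
    # Mirrors the `or not new_name.endswith(".weight")` clause in the diff.
    return in_list or not new_name.endswith(".weight")
```

Under these assumptions, `is_excluded_from_quant("blk.0.time_mix_decay_w1.weight")` is true because the kind is in the list, and any name that does not end in `.weight` is also excluded.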
2 changes: 2 additions & 0 deletions src/llama.cpp
@@ -17534,6 +17534,8 @@ static void llama_model_quantize_internal(const std::string & fname_inp, const s
quantize &= name.find("time_mix_first.weight") == std::string::npos;
quantize &= name.find("time_mix_w1.weight") == std::string::npos;
quantize &= name.find("time_mix_w2.weight") == std::string::npos;
+quantize &= name.find("time_mix_decay_w1.weight") == std::string::npos;
+quantize &= name.find("time_mix_decay_w2.weight") == std::string::npos;

// do not quantize relative position bias (T5)
quantize &= name.find("attn_rel_b.weight") == std::string::npos;
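
The chain of `quantize &= name.find(...) == std::string::npos;` statements clears the flag as soon as the tensor name contains any listed substring. A rough Python sketch of the same logic follows. The names `EXCLUDED_SUBSTRINGS` and `should_quantize` are hypothetical, and starting from `quantize=True` is an assumption, since the conditions that run before this point in `llama_model_quantize_internal` are not shown in the diff.

```python
# Substrings taken from the hunk above; the tuple itself is illustrative.
EXCLUDED_SUBSTRINGS = (
    "time_mix_first.weight",
    "time_mix_w1.weight",
    "time_mix_w2.weight",
    "time_mix_decay_w1.weight",  # added by this commit
    "time_mix_decay_w2.weight",  # added by this commit
    "attn_rel_b.weight",         # relative position bias (T5)
)


def should_quantize(name: str, quantize: bool = True) -> bool:
    """Apply the substring exclusions to an initial quantize flag (sketch only)."""
    for needle in EXCLUDED_SUBSTRINGS:
        # Equivalent of: quantize &= name.find(needle) == std::string::npos;
        quantize = quantize and (needle not in name)
    return quantize
```

Because the flag is only ever AND-ed, a tensor that was already marked as not quantizable stays that way regardless of its name.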