What common characteristics should models have to be eligible for task arithmetic merging?
Besides having the same architecture listed in each model's config.json, what other conditions should be met? Can you provide a guide?
The models should have the same architecture and the same parameter sizes. A better way to think of it, though, is that to merge models they all need to have been fine-tuned from some common "ancestor" model. For example, any set of models that can be traced back to Mistral-v0.2-7B (through however long a chain of fine-tuning and merging) should be compatible. On the other hand, a model fine-tuned from a Yi base and one fine-tuned from a Llama base will never be compatible, even if their architectures and parameter sizes are exactly the same.
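To make the "common ancestor" requirement concrete, here is a minimal sketch of task arithmetic merging. It represents each model's weights as a dict of named scalars rather than real tensors (the actual merge, e.g. in mergekit, operates on full tensors, and the function names here are hypothetical), but the arithmetic is the same: each task vector is the difference between a fine-tuned model and the shared base, and the merge adds scaled task vectors back onto that base.

```python
def task_vector(base, finetuned):
    # Models must share the exact same parameter set -- this is why
    # a common ancestor is required, not just a matching architecture.
    assert base.keys() == finetuned.keys(), "incompatible models"
    return {k: finetuned[k] - base[k] for k in base}

def merge(base, finetuned_models, weights):
    # merged = base + sum_i(weight_i * (finetuned_i - base))
    vectors = [task_vector(base, m) for m in finetuned_models]
    return {
        k: base[k] + sum(w * v[k] for w, v in zip(weights, vectors))
        for k in base
    }

# Toy "models" sharing the same ancestor (same parameter names):
base = {"w1": 1.0, "w2": 2.0}
model_a = {"w1": 1.5, "w2": 2.0}  # fine-tune A changed w1
model_b = {"w1": 1.0, "w2": 2.5}  # fine-tune B changed w2

merged = merge(base, [model_a, model_b], weights=[1.0, 1.0])
# merged carries both fine-tunes' changes relative to the base:
# {"w1": 1.5, "w2": 2.5}
```

If the two fine-tunes came from different bases, the subtraction would mix unrelated weight spaces and the result would be meaningless, which is exactly why a Yi-derived and a Llama-derived model cannot be merged even with matching shapes.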