What common characteristics should models have to be eligible for task arithmetic merging?
Besides having the same architecture listed in each model's config.json, what other conditions should be met? Can you provide a guide?
The models should have the same architecture and the same parameter sizes. A better way to think of it, though, is that to merge models they all need to have been fine-tuned from some common "ancestor" model. For example, any set of models that can be traced back to Mistral-v0.2-7B (through however long a chain of fine-tuning and merging) should be compatible. On the other hand, a model fine-tuned from a Yi base and one fine-tuned from a Llama base will never be compatible, even if their architectures and parameter sizes are exactly the same.
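To make the "common ancestor" requirement concrete, here is a minimal sketch of task arithmetic merging. It represents each model's weights as a dict of named scalars rather than real tensors (the actual merge, e.g. in mergekit, operates on full tensors, and the function names here are hypothetical), but the arithmetic is the same: each task vector is the difference between a fine-tuned model and the shared base, and the merge adds scaled task vectors back onto that base.

```python
def task_vector(base, finetuned):
    # Models must share the exact same parameter set -- this is why
    # a common ancestor is required, not just a matching architecture.
    assert base.keys() == finetuned.keys(), "incompatible models"
    return {k: finetuned[k] - base[k] for k in base}

def merge(base, finetuned_models, weights):
    # merged = base + sum_i(weight_i * (finetuned_i - base))
    vectors = [task_vector(base, m) for m in finetuned_models]
    return {
        k: base[k] + sum(w * v[k] for w, v in zip(weights, vectors))
        for k in base
    }

# Toy "models" sharing the same ancestor (same parameter names):
base = {"w1": 1.0, "w2": 2.0}
model_a = {"w1": 1.5, "w2": 2.0}  # fine-tune A changed w1
model_b = {"w1": 1.0, "w2": 2.5}  # fine-tune B changed w2

merged = merge(base, [model_a, model_b], weights=[1.0, 1.0])
# merged carries both fine-tunes' changes relative to the base:
# {"w1": 1.5, "w2": 2.5}
```

If the two fine-tunes came from different bases, the subtraction would mix unrelated weight spaces and the result would be meaningless, which is exactly why a Yi-derived and a Llama-derived model cannot be merged even with matching shapes.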