Requirements for Model Merging #475

Closed
eunbin079 opened this issue Dec 20, 2024 · 2 comments

Comments

@eunbin079

What common characteristics should models have to be eligible for task arithmetic merging?
Besides having the same architecture in the model config.json, what other conditions should be met? Can you provide a guide?

@cg123
Collaborator

cg123 commented Dec 28, 2024

The models should have the same architecture and the same parameter sizes. A better way to think of it, though, is that for models to be mergeable, they all need to have been fine-tuned from some common "ancestor" model. For example, any set of models that can be traced back to Mistral-v0.2-7B (through however long a chain of fine-tuning and merging) should be compatible. On the other hand, a model fine-tuned from a Yi base and one fine-tuned from a Llama base will never be compatible, even if the architecture and parameter sizes are exactly the same.
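
To illustrate why the common ancestor matters, here is a minimal sketch of task arithmetic in plain Python. It is not the project's actual implementation: `StateDict`, `check_compatible`, `task_vector`, and `task_arithmetic_merge` are hypothetical names, and each parameter is modeled as a flat list of floats instead of a real tensor. Each fine-tuned model is reduced to a delta from the shared base, and the deltas are added back onto that base.

```python
from typing import Dict, List

# Hypothetical stand-in for a checkpoint: parameter name -> flat list of floats.
StateDict = Dict[str, List[float]]


def check_compatible(base: StateDict, model: StateDict) -> None:
    """Raise ValueError if parameter names or sizes differ from the base."""
    if base.keys() != model.keys():
        raise ValueError("parameter names differ")
    for name, weights in base.items():
        if len(weights) != len(model[name]):
            raise ValueError(f"size mismatch for {name}")


def task_vector(base: StateDict, finetuned: StateDict) -> StateDict:
    """Delta of a fine-tuned model relative to the base it was trained from."""
    check_compatible(base, finetuned)
    return {
        name: [f - b for f, b in zip(finetuned[name], base[name])]
        for name in base
    }


def task_arithmetic_merge(
    base: StateDict, finetuned_models: List[StateDict], weight: float = 1.0
) -> StateDict:
    """Add the weighted task vector of every fine-tuned model onto the base."""
    merged = {name: list(values) for name, values in base.items()}
    for model in finetuned_models:
        delta = task_vector(base, model)
        for name in merged:
            merged[name] = [
                m + weight * d for m, d in zip(merged[name], delta[name])
            ]
    return merged


# Toy example: two fine-tunes of the same base.
base = {"layer.weight": [1.0, 2.0]}
model_a = {"layer.weight": [1.5, 2.0]}
model_b = {"layer.weight": [1.0, 1.0]}
merged = task_arithmetic_merge(base, [model_a, model_b])
print(merged)  # {'layer.weight': [1.5, 1.0]}
```

Note that `check_compatible` only catches mismatched names and sizes. A model fine-tuned from a different base with identical shapes would pass the check, but its delta from this base would not represent a fine-tuning change, so the merge would be meaningless. Shared ancestry has to be verified from the models' lineage, not from their shapes.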

Hope this helps clear things up a bit.

@eunbin079
Author

Thanks for replying!
