Hi, I am trying to merge different LoRA adapters. For now I want to merge ONLY the adapters, which do not contain the base model. Whenever the adapters are used, they are loaded on top of the "unsloth/Phi-3-mini-4k-instruct" base model. I have been experimenting with different .yml configurations, with and without slices; both are listed at the end of this post.
Both lead to the following error message when I run mergekit-yaml examples/lora_test.yml ./merged --lazy-unpickle --allow-crimes:
Traceback (most recent call last):
  File "/Users/anikailieva/anaconda3/bin/mergekit-yaml", line 8, in <module>
    sys.exit(main())
             ^^^^^^
  File "/Users/anikailieva/anaconda3/lib/python3.11/site-packages/click/core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/anikailieva/anaconda3/lib/python3.11/site-packages/click/core.py", line 1078, in main
    rv = self.invoke(ctx)
         ^^^^^^^^^^^^^^^^
  File "/Users/anikailieva/anaconda3/lib/python3.11/site-packages/click/core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/anikailieva/anaconda3/lib/python3.11/site-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/anikailieva/Documents/mergekit/mergekit/options.py", line 82, in wrapper
    f(*args, **kwargs)
  File "/Users/anikailieva/Documents/mergekit/mergekit/scripts/run_yaml.py", line 47, in main
    run_merge(
  File "/Users/anikailieva/Documents/mergekit/mergekit/merge.py", line 50, in run_merge
    model_arch_info = [
                      ^
  File "/Users/anikailieva/Documents/mergekit/mergekit/merge.py", line 51, in <listcomp>
    get_architecture_info(m.config(trust_remote_code=options.trust_remote_code))
  File "/Users/anikailieva/Documents/mergekit/mergekit/architecture.py", line 359, in get_architecture_info
    if len(config.architectures) != 1:
       ^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: object of type 'NoneType' has no len()
From this I understand that the way I wrote the yaml file makes mergekit assume that "opiniongpt-phi3-middle_east" and "opiniongpt-phi3-men" each include a base model with an architecture. But I am currently only interested in merging the adapters; I would later load the merged adapter on top of the base model. Could you tell me whether this is possible and what the yaml file should look like in my case?
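To check that reading of the traceback, a small script can look at which config files each checkpoint folder actually contains. This is only a sketch under assumptions: it assumes the adapters are local directories saved with PEFT (which writes an adapter_config.json holding a "base_model_name_or_path" field), and the helper name describe_checkpoint is made up for illustration. It says nothing about how mergekit resolves configs internally.

```python
import json
from pathlib import Path


def describe_checkpoint(path):
    """Classify a local checkpoint folder by the config files it contains.

    Hypothetical helper for illustration; not part of mergekit or PEFT.
    """
    root = Path(path)
    adapter_cfg = root / "adapter_config.json"  # written by PEFT for adapters
    model_cfg = root / "config.json"            # written for full models

    if adapter_cfg.is_file():
        # Adapter folders record which base model they were trained on.
        base = json.loads(adapter_cfg.read_text()).get("base_model_name_or_path")
        return f"adapter only (base model: {base})"

    if model_cfg.is_file():
        # Full models normally list exactly one architecture here.
        architectures = json.loads(model_cfg.read_text()).get("architectures")
        if architectures:
            return f"full model: {architectures}"
        return "config.json present, but 'architectures' is missing or empty"

    return "no config.json or adapter_config.json found"
```

If both adapter folders are reported as "adapter only", that would be consistent with config.architectures being None in the traceback above, since an adapter folder carries no architecture entry of its own.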
Thank you so much in advance!
Example yml using slices
slices:
layer_range: [0, -1]
weight: 0.3
layer_range: [0, -1]
weight: 0.25
merge_method: linear
base_model: unsloth/Phi-3-mini-4k-instruct
Example yml without slices
models:
parameters:
weight: 0.3
density: 0.9
parameters:
weight: 0.25
density: 0.9
merge_method: della_linear
base_model: unsloth/Phi-3-mini-4k-instruct
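For the plain linear case, the weighted combination these configs describe can in principle be computed on the adapter matrices alone, without touching the base model. The sketch below is not mergekit's method and does not cover the density-based pruning of della_linear. It assumes each adapter contributes an update B·A to a layer, that any LoRA scaling factor is already folded into B, and it uses plain Python lists in place of tensors; the function names are invented for illustration. Stacking the A matrices along the rank dimension and concatenating the weighted B matrices reproduces the weighted sum of the updates exactly, at the cost of a merged rank equal to the sum of the input ranks.

```python
def matmul(x, y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)] for row in x]


def merge_adapters_cat(adapters, weights):
    """Merge LoRA pairs (A, B) so that B_m @ A_m == sum_k w_k * (B_k @ A_k).

    Each A has shape (r_k, d_in) and each B has shape (d_out, r_k).
    Hypothetical helper for illustration only.
    """
    # Stack all A matrices along the rank dimension: (r_1 + r_2 + ..., d_in).
    merged_a = []
    for a, _ in adapters:
        merged_a.extend(list(row) for row in a)

    # Concatenate the weighted B matrices column-wise: (d_out, r_1 + r_2 + ...).
    d_out = len(adapters[0][1])
    merged_b = [[] for _ in range(d_out)]
    for (_, b), w in zip(adapters, weights):
        for i, row in enumerate(b):
            merged_b[i].extend(w * v for v in row)
    return merged_a, merged_b


def weighted_delta(adapters, weights):
    """Reference result: the weighted sum of the full updates B_k @ A_k."""
    total = None
    for (a, b), w in zip(adapters, weights):
        delta = [[w * v for v in row] for row in matmul(b, a)]
        if total is None:
            total = delta
        else:
            total = [[t + v for t, v in zip(tr, dr)] for tr, dr in zip(total, delta)]
    return total
```

Note that averaging the A matrices and the B matrices separately would not give the same result, because the product of two averages is not the average of the products; that is why the sketch concatenates instead.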