
Merging LoRA adapters: "TypeError: object of type 'NoneType' has no len()" #470

Closed
anika-ilieva opened this issue Dec 9, 2024 · 2 comments


@anika-ilieva

Hi, I am trying to merge different LoRA adapters. Specifically, I want to merge only the adapters themselves, without the base model; whenever the adapters are used, they are loaded on top of the unsloth/Phi-3-mini-4k-instruct base model. I have been experimenting with different .yml configurations, with and without slices:

Example yml using slices:

slices:
  - sources:
      - model: HU-Berlin-ML-Internal/opiniongpt-phi3-middle_east
        layer_range: [0, -1]
        weight: 0.3
      - model: HU-Berlin-ML-Internal/opiniongpt-phi3-men
        layer_range: [0, -1]
        weight: 0.25
merge_method: linear
base_model: unsloth/Phi-3-mini-4k-instruct

Example yml without slices:

models:
  - model: HU-Berlin-ML-Internal/opiniongpt-phi3-middle_east
    parameters:
      weight: 0.3
      density: 0.9
  - model: HU-Berlin-ML-Internal/opiniongpt-phi3-men
    parameters:
      weight: 0.25
      density: 0.9
merge_method: della_linear
base_model: unsloth/Phi-3-mini-4k-instruct

Both lead to the following error message when I run mergekit-yaml examples/lora_test.yml ./merged --lazy-unpickle --allow-crimes:

Traceback (most recent call last):
  File "/Users/anikailieva/anaconda3/bin/mergekit-yaml", line 8, in <module>
    sys.exit(main())
             ^^^^^^
  File "/Users/anikailieva/anaconda3/lib/python3.11/site-packages/click/core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/anikailieva/anaconda3/lib/python3.11/site-packages/click/core.py", line 1078, in main
    rv = self.invoke(ctx)
         ^^^^^^^^^^^^^^^^
  File "/Users/anikailieva/anaconda3/lib/python3.11/site-packages/click/core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/anikailieva/anaconda3/lib/python3.11/site-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/anikailieva/Documents/mergekit/mergekit/options.py", line 82, in wrapper
    f(*args, **kwargs)
  File "/Users/anikailieva/Documents/mergekit/mergekit/scripts/run_yaml.py", line 47, in main
    run_merge(
  File "/Users/anikailieva/Documents/mergekit/mergekit/merge.py", line 50, in run_merge
    model_arch_info = [
                      ^
  File "/Users/anikailieva/Documents/mergekit/mergekit/merge.py", line 51, in <listcomp>
    get_architecture_info(m.config(trust_remote_code=options.trust_remote_code))
  File "/Users/anikailieva/Documents/mergekit/mergekit/architecture.py", line 359, in get_architecture_info
    if len(config.architectures) != 1:
       ^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: object of type 'NoneType' has no len()

From this I understand that the way I wrote the yaml file leads mergekit to assume that "opiniongpt-phi3-middle_east" and "opiniongpt-phi3-men" are full models whose configs define an architecture. But I am currently only interested in merging the adapters; I would later load the merged adapter on top of the base model. Could you tell me whether this is possible, and what the yaml file should look like in my case?

Thank you so much in advance!

@cg123
Collaborator

cg123 commented Dec 29, 2024

Hey! Mergekit currently doesn't merge bare adapters by themselves. There's actually some functionality for this in PEFT - check out the documentation here: https://huggingface.co/docs/peft/en/developer_guides/model_merging
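
Roughly, that PEFT route would look something like this (a minimal sketch, not tested with these particular adapters; the adapter names and output path are placeholders):

import torch
from transformers import AutoModelForCausalLM
from peft import PeftModel

# Load the base model once; both adapters sit on top of it.
base = AutoModelForCausalLM.from_pretrained(
    "unsloth/Phi-3-mini-4k-instruct", torch_dtype=torch.float16
)

# Attach the first adapter, then load the second under its own name.
model = PeftModel.from_pretrained(
    base,
    "HU-Berlin-ML-Internal/opiniongpt-phi3-middle_east",
    adapter_name="middle_east",
)
model.load_adapter("HU-Berlin-ML-Internal/opiniongpt-phi3-men", adapter_name="men")

# Combine the two adapters into a new adapter with linear weighting.
model.add_weighted_adapter(
    adapters=["middle_east", "men"],
    weights=[0.3, 0.25],
    adapter_name="merged",
    combination_type="linear",
)
model.set_adapter("merged")

# Save only the merged adapter, to be loaded on top of the base model later.
model.save_pretrained("./merged_adapter", selected_adapters=["merged"])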

If that doesn't work for you though, you can fudge it by merging the base+LoRA models and then extracting a LoRA from the result. Something like this:

models:
  - model: unsloth/Phi-3-mini-4k-instruct+HU-Berlin-ML-Internal/opiniongpt-phi3-middle_east
    parameters:
      weight: 0.3
      density: 0.9
  - model: unsloth/Phi-3-mini-4k-instruct+HU-Berlin-ML-Internal/opiniongpt-phi3-men
    parameters:
      weight: 0.25
      density: 0.9
merge_method: della_linear
base_model: unsloth/Phi-3-mini-4k-instruct

And then mergekit-extract-lora output_model_path unsloth/Phi-3-mini-4k-instruct ./extracted_lora --rank=[i dunno, maybe 16?]
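
For reference, the extracted adapter should then load back onto the base model like any other LoRA (a quick sketch, with paths matching the command above):

from transformers import AutoModelForCausalLM
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained("unsloth/Phi-3-mini-4k-instruct")
model = PeftModel.from_pretrained(base, "./extracted_lora")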

@anika-ilieva
Author

Thank you very much for the suggested alternatives! I will try them out.
