About LoRA duplication #19
Comments
I have tried printing the trainable parameters at the specified location as you mentioned. However, after calling merge_and_unload(), the LoRA adapters are not visible, contrary to what you've shown. Could you please provide guidance on how to reproduce it? |
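For reference, a quick way to check whether any LoRA parameters are still attached after merge_and_unload() is to scan the parameter names. This is only a sketch, not code from this repository; it assumes the standard peft naming convention (lora_A / lora_B):

def find_lora_params(model):
    # Collect any parameters that still carry the peft LoRA naming.
    leftovers = [name for name, _ in model.named_parameters() if "lora_" in name]
    if leftovers:
        print(f"{len(leftovers)} LoRA parameters still present, e.g. {leftovers[0]}")
    else:
        print("No LoRA parameters found; the adapter appears to be merged.")
    return leftovers

Running this right after merge_and_unload() should report no LoRA parameters; running it again after get_peft_model() should report only the newly added adapter.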
Thanks for your kind reply! This is how I print my trainable parameters:
It can print the parameters under DeepSpeed ZeRO-2. |
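The snippet referred to above did not come through in the thread, so here is a typical implementation of such a helper for comparison. It is a hedged sketch that counts by requires_grad, not the poster's exact code:

def print_trainable_parameters(model):
    # Count trainable parameters (requires_grad=True) versus the total.
    trainable, total = 0, 0
    for _, param in model.named_parameters():
        total += param.numel()
        if param.requires_grad:
            trainable += param.numel()
    print(f"trainable params: {trainable} || all params: {total} || "
          f"trainable%: {100 * trainable / total:.4f}")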
Can you start the training for stage 3? If the LoRA at this location hasn't been merged, then the subsequent line |
Thanks for replying! The "all params" count does not change. TAT |
The implementation of the print_trainable_parameters function seems to have a bug. When using DeepSpeed, param.numel() might return 0. I haven't found a solution for it yet. I'd appreciate any suggestions to address this. Nevertheless, this issue shouldn't hinder the training process. |
It can print the parameters under DeepSpeed ZeRO-2. Hope it is helpful to you. |
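Regarding the param.numel() == 0 issue mentioned two comments up: under DeepSpeed ZeRO-3 each rank holds only a shard of every parameter, so the local tensor can be empty, while DeepSpeed records the full size in a ds_numel attribute on the parameter. A ZeRO-aware counter can fall back to that attribute. The sketch below is an illustrative example, not part of this repository:

def count_parameters(model):
    # ZeRO-aware parameter counting: if the local shard is empty under ZeRO-3,
    # fall back to the full (unpartitioned) size stored in ds_numel.
    trainable, total = 0, 0
    for _, param in model.named_parameters():
        n = param.numel()
        if n == 0 and hasattr(param, "ds_numel"):
            n = param.ds_numel
        total += n
        if param.requires_grad:
            trainable += n
    print(f"trainable params: {trainable} || all params: {total}")
    return trainable, total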
Hello! I followed your method to load different LoRA modules at different stages:
# Stage 3 setup: load the stage-2 LoRA, fold it into the base weights,
# then attach a fresh LoRA adapter for the next stage.
model.get_model().initialize_vision_modules(model_args)
model = load_lora(model, model_args.stage2_path)
rank0_print('Merging LoRA weights...')
model = model.merge_and_unload()            # fold the stage-2 LoRA into the base weights
print_trainable_parameters(model)
rank0_print("Adding LoRA adapters...")
model = get_peft_model(model, lora_config)  # wrap the merged model with a new LoRA
but when I print the parameters, I still only see one LoRA. Is there any trick in the code settings that I might be missing?
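One way to tell whether the stage-2 LoRA was actually folded into the base weights (rather than never being attached by load_lora) is to snapshot one of the targeted base weights before merge_and_unload() and compare it afterwards. The sketch below is illustrative only; q_proj is just an example target module, and the ordering follows the snippet above:

import torch

def get_base_weight(m, target="q_proj"):
    # Return a copy of the first non-LoRA weight belonging to the target module.
    # "q_proj" is only an example; use any module your LoRA config targets.
    # (Assumes full, unpartitioned weights, e.g. before DeepSpeed ZeRO-3 init.)
    for name, param in m.named_parameters():
        if target in name and name.endswith(".weight") and "lora" not in name:
            return param.detach().clone()
    raise KeyError(target)

before = get_base_weight(model)
model = model.merge_and_unload()
after = get_base_weight(model)
print("base weight changed by merge:", not torch.allclose(before, after))

If the weight does not change, load_lora most likely did not attach or load the stage-2 adapter. Also note that after get_peft_model() only the newly added adapter's lora_A/lora_B parameters exist on the model, so seeing a single set of LoRA weights at that point is expected.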