About LoRA duplication #19
Comments
I have tried printing the trainable parameters at the specified location as you mentioned. However, after calling merge_and_unload(), the LoRA adapters are not visible, contrary to what you've shown. Could you please provide guidance on how to reproduce it? |
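For reference, a quick way to check whether any LoRA parameters are still attached after merge_and_unload() is to scan the parameter names. This is only a sketch, not code from this repository; it assumes the standard peft naming convention (lora_A / lora_B):

def find_lora_params(model):
    # Collect any parameters that still carry the peft LoRA naming.
    leftovers = [name for name, _ in model.named_parameters() if "lora_" in name]
    if leftovers:
        print(f"{len(leftovers)} LoRA parameters still present, e.g. {leftovers[0]}")
    else:
        print("No LoRA parameters found; the adapter appears to be merged.")
    return leftovers

Running this right after merge_and_unload() should report no LoRA parameters; running it again after get_peft_model() should report only the newly added adapter.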
Thanks for your kind reply! This is how I print my trainable parameters:
It can print the parameters under DeepSpeed ZeRO-2. |
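The snippet referred to above did not come through in the thread, so here is a typical implementation of such a helper for comparison. It is a hedged sketch that counts by requires_grad, not the poster's exact code:

def print_trainable_parameters(model):
    # Count trainable parameters (requires_grad=True) versus the total.
    trainable, total = 0, 0
    for _, param in model.named_parameters():
        total += param.numel()
        if param.requires_grad:
            trainable += param.numel()
    print(f"trainable params: {trainable} || all params: {total} || "
          f"trainable%: {100 * trainable / total:.4f}")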
Can you start the training for stage 3? If the LoRA at this location hasn't been merged, then the subsequent line |
Thanks for replying! The "all params" count does not change. TAT |
The implementation of the print_trainable_parameters function seems to have a bug. When using DeepSpeed, param.numel() might return 0. I haven't found a solution for it yet. I'd appreciate any suggestions to address this. Nevertheless, this issue shouldn't hinder the training process. |
It can print the parameters under DeepSpeed ZeRO-2. Hope it is helpful to you. |
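Regarding the param.numel() == 0 issue mentioned two comments up: under DeepSpeed ZeRO-3 each rank holds only a shard of every parameter, so the local tensor can be empty, while DeepSpeed records the full size in a ds_numel attribute on the parameter. A ZeRO-aware counter can fall back to that attribute. The sketch below is an illustrative example, not part of this repository:

def count_parameters(model):
    # ZeRO-aware parameter counting: if the local shard is empty under ZeRO-3,
    # fall back to the full (unpartitioned) size stored in ds_numel.
    trainable, total = 0, 0
    for _, param in model.named_parameters():
        n = param.numel()
        if n == 0 and hasattr(param, "ds_numel"):
            n = param.ds_numel
        total += n
        if param.requires_grad:
            trainable += n
    print(f"trainable params: {trainable} || all params: {total}")
    return trainable, total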
Hello! I followed your method to load different LoRA modules at different stages:
# Stage 3 setup: load the stage-2 LoRA, fold it into the base weights,
# then attach a fresh LoRA adapter for the next stage.
model.get_model().initialize_vision_modules(model_args)
model = load_lora(model, model_args.stage2_path)
rank0_print('Merging LoRA weights...')
model = model.merge_and_unload()            # fold the stage-2 LoRA into the base weights
print_trainable_parameters(model)
rank0_print("Adding LoRA adapters...")
model = get_peft_model(model, lora_config)  # wrap the merged model with a new LoRA
but when I print the parameters, I still only see one LoRA. Is there any trick in the code settings that I might be missing?
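One way to tell whether the stage-2 LoRA was actually folded into the base weights (rather than never being attached by load_lora) is to snapshot one of the targeted base weights before merge_and_unload() and compare it afterwards. The sketch below is illustrative only; q_proj is just an example target module, and the ordering follows the snippet above:

import torch

def get_base_weight(m, target="q_proj"):
    # Return a copy of the first non-LoRA weight belonging to the target module.
    # "q_proj" is only an example; use any module your LoRA config targets.
    # (Assumes full, unpartitioned weights, e.g. before DeepSpeed ZeRO-3 init.)
    for name, param in m.named_parameters():
        if target in name and name.endswith(".weight") and "lora" not in name:
            return param.detach().clone()
    raise KeyError(target)

before = get_base_weight(model)
model = model.merge_and_unload()
after = get_base_weight(model)
print("base weight changed by merge:", not torch.allclose(before, after))

If the weight does not change, load_lora most likely did not attach or load the stage-2 adapter. Also note that after get_peft_model() only the newly added adapter's lora_A/lora_B parameters exist on the model, so seeing a single set of LoRA weights at that point is expected.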