When fine-tuning llava-llama3-8b, the loss becomes NaN after just a few steps. What could be causing this? I saw in the GitHub issues that others have hit a similar problem, and the official reply was to change the lr. My current settings are:
```python
# Scheduler & Optimizer
batch_size = 4  # per_device
accumulative_counts = 32 * 4
dataloader_num_workers = 32
max_epochs = 1
optim_type = AdamW
lr = 2e-6

param_scheduler = [
    dict(
        type=LinearLR,
        start_factor=1e-5,
        by_epoch=True,
        begin=0,
        end=warmup_ratio * max_epochs,
        convert_to_iter_based=True),
    dict(
        type=CosineAnnealingLR,
        eta_min=0.0,
        by_epoch=True,
        begin=warmup_ratio * max_epochs,
        end=max_epochs,
        convert_to_iter_based=True)
]
```
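Since the official suggestion was to change the lr, this is a sketch of what I'm considering trying next: a lower peak lr plus gradient clipping via MMEngine's `AmpOptimWrapper`. The specific values (`lr = 5e-7`, `max_norm=1.0`, the dynamic loss scale) are my own guesses, not values confirmed by the maintainers:

```python
# Sketch of changes I plan to try (my own guesses, not an official recipe):
# lower the peak lr and clip gradients so a single bad batch cannot
# blow up the fp16 loss scale.
lr = 5e-7  # reduced from 2e-6

optim_wrapper = dict(
    type=AmpOptimWrapper,
    optimizer=dict(type=optim_type, lr=lr, betas=(0.9, 0.999), weight_decay=0),
    clip_grad=dict(max_norm=1.0, error_if_nonfinite=False),
    accumulative_counts=accumulative_counts,
    loss_scale='dynamic',  # dynamic scaling skips optimizer steps that overflow
    dtype='float16')
```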
Addendum: I have also changed clip -> siglip:
```python
image_processor = dict(
    type=SiglipImageProcessor.from_pretrained,
    pretrained_model_name_or_path=visual_encoder_name_or_path,
    trust_remote_code=True)

model = dict(
    type=LLaVAModel,
    freeze_llm=True,
    freeze_visual_encoder=True,
    llm=dict(
        type=AutoModelForCausalLM.from_pretrained,
        pretrained_model_name_or_path=llm_name_or_path,
        trust_remote_code=True),
    visual_encoder=dict(
        type=SiglipVisionModel.from_pretrained,
        pretrained_model_name_or_path=visual_encoder_name_or_path))
```
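Since the SigLIP swap is the other variable here, I also plan to sanity-check whether the vision tower itself produces non-finite activations in the training dtype before blaming the lr. This is just my own debugging sketch (the sample file name and fp16 dtype are assumptions), using the standard `transformers` API:

```python
# Debugging sketch (my own idea, not an official recipe): run one image
# through the SigLIP encoder in fp16 and count non-finite activations.
import torch
from PIL import Image
from transformers import SiglipImageProcessor, SiglipVisionModel

processor = SiglipImageProcessor.from_pretrained(visual_encoder_name_or_path)
encoder = SiglipVisionModel.from_pretrained(
    visual_encoder_name_or_path, torch_dtype=torch.float16).cuda().eval()

image = Image.open('sample.jpg')  # hypothetical: any image from the training set
pixel_values = processor(images=image, return_tensors='pt').pixel_values
with torch.no_grad():
    out = encoder(pixel_values.to('cuda', torch.float16)).last_hidden_state
print('non-finite activations:', (~torch.isfinite(out)).sum().item())
```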