When I perform full finetuning with a small dialogue dataset (500 samples from llava-instruct-80k), Bunny's performance on LLaVA-in-the-Wild deteriorates. Could this be solely due to the small size of the finetuning dataset, or is it a parameter configuration issue? When I use the same dataset for full finetuning of LLaVA-1.5, its performance on LLaVA-in-the-Wild improves, so I'm unsure where the problem lies. The training script is as follows:
MODEL_TYPE=phi-3
PRETRAIN_DIR=bunny-$MODEL_TYPE-pretrain

srun python bunny/train/train.py \
    --model_name_or_path Bunny-v1_1-4B \
    --model_type $MODEL_TYPE \
    --version phi3 \
    --data_path train_data.json \
    --image_folder normal_loss_images \
    --vision_tower siglip-so400m-patch14-384 \
    --use_s2 True \
    --mm_projector_type mlp2x_gelu \
    --image_aspect_ratio pad \
    --group_by_modality_length False \
    --bf16 True \
    --output_dir ./checkpoints/bunny-v1_1-4B-test \
    --num_train_epochs 1 \
    --per_device_train_batch_size 1 \
    --per_device_eval_batch_size 4 \
    --gradient_accumulation_steps 8 \
    --eval_strategy "no" \
    --save_strategy "epoch" \
    --save_steps 500 \
    --save_total_limit 3 \
    --learning_rate 2e-5 \
    --weight_decay 0. \
    --warmup_ratio 0.03 \
    --lr_scheduler_type "cosine" \
    --logging_steps 1 \
    --tf32 True \
    --model_max_length 4096 \
    --gradient_checkpointing True \
    --dataloader_num_workers 2 \
    --lazy_preprocess True \
    --report_to wandb
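
For context, the 500-sample subset passed as --data_path can be drawn with something along these lines (a sketch only; the file name llava_instruct_80k.json is illustrative and this is not necessarily the exact preprocessing I used):

# Randomly sample 500 dialogues from the llava-instruct-80k annotation file
# (illustrative sketch; requires jq and shuf)
jq -c '.[]' llava_instruct_80k.json | shuf -n 500 | jq -s '.' > train_data.json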