When I perform full finetuning with a small dialogue dataset (500 samples from llava-instruct-80k), Bunny's performance on LLaVA-in-the-Wild deteriorates. Could this be solely due to the small size of the finetuning dataset, or is it a parameter configuration issue? When I use the same dataset for full finetuning of LLaVA-1.5, its performance on LLaVA-in-the-Wild improves, so I'm unsure where the problem lies. The training script is as follows:
MODEL_TYPE=phi-3
PRETRAIN_DIR=bunny-$MODEL_TYPE-pretrain

srun python bunny/train/train.py \
    --model_name_or_path Bunny-v1_1-4B \
    --model_type $MODEL_TYPE \
    --version phi3 \
    --data_path train_data.json \
    --image_folder normal_loss_images \
    --vision_tower siglip-so400m-patch14-384 \
    --use_s2 True \
    --mm_projector_type mlp2x_gelu \
    --image_aspect_ratio pad \
    --group_by_modality_length False \
    --bf16 True \
    --output_dir ./checkpoints/bunny-v1_1-4B-test \
    --num_train_epochs 1 \
    --per_device_train_batch_size 1 \
    --per_device_eval_batch_size 4 \
    --gradient_accumulation_steps 8 \
    --eval_strategy "no" \
    --save_strategy "epoch" \
    --save_steps 500 \
    --save_total_limit 3 \
    --learning_rate 2e-5 \
    --weight_decay 0. \
    --warmup_ratio 0.03 \
    --lr_scheduler_type "cosine" \
    --logging_steps 1 \
    --tf32 True \
    --model_max_length 4096 \
    --gradient_checkpointing True \
    --dataloader_num_workers 2 \
    --lazy_preprocess True \
    --report_to wandb
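
For context, the 500-sample subset passed as --data_path can be drawn with something along these lines (a sketch only; the file name llava_instruct_80k.json is illustrative and this is not necessarily the exact preprocessing I used):

# Randomly sample 500 dialogues from the llava-instruct-80k annotation file
# (illustrative sketch; requires jq and shuf)
jq -c '.[]' llava_instruct_80k.json | shuf -n 500 | jq -s '.' > train_data.json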