Issue resolved. The problem is that when constructing the trainer, `save_safetensors=False` should be set. Otherwise, the above `safe_serialization=False` will not work.
I am using gemma2-2b-it with transformers, and it reports 'No such file or directory: pytorch_model.bin'. I then followed the instructions from @WilliamYi96, but it still does not work. Can somebody help me? Here is my code:
trainer = transformers.Trainer(
    model=model,
    train_dataset=train_data,
    eval_dataset=val_data,
    args=transformers.TrainingArguments(
        per_device_train_batch_size=args.micro_batch_size,
        gradient_accumulation_steps=gradient_accumulation_steps,
        warmup_steps=100,  # 100 ori
        num_train_epochs=args.num_epochs,
        learning_rate=args.learning_rate,
        fp16=True,  # not torch.cuda.is_bf16_supported()
        bf16=False,  # torch.cuda.is_bf16_supported()
        logging_steps=10,
        logging_first_step=True,
        optim="adamw_torch",
        evaluation_strategy="steps",
        save_strategy="steps",
        eval_steps=100,
        save_steps=200,
        output_dir=args.output_dir,
        save_total_limit=20,
        max_grad_norm=1.0,
        report_to="none",
        load_best_model_at_end=True,
        # lr_scheduler_type="linear",
        ddp_find_unused_parameters=False if ddp else None,
        group_by_length=args.group_by_length,
        run_name=args.output_dir.split('/')[-1],
        metric_for_best_model="{}_loss".format(args.data_path),
        save_safetensors=False
    ),
    data_collator=transformers.DataCollatorForSeq2Seq(
        tokenizer, pad_to_multiple_of=8, return_tensors="pt", padding=True
    ),
)
model.config.use_cache = False  # silence the warnings. Please re-enable for inference!
trainer.train()
if args.save_model:
    output_lora_dir = '/public/MountData/yaolu/LLM_pretrained/pruned_model/partial_tuing_alpaca_{}_{}/'.format(args.base_model, args.partial_layer_name)
    if not os.path.exists(output_lora_dir):
        os.mkdir(output_lora_dir)
    model.save_pretrained(output_lora_dir, safe_serialization=False)
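One way to narrow this down might be to check which weight files actually end up in the checkpoint directory, since the error only says that `pytorch_model.bin` is missing. The sketch below uses only the standard library; `list_weight_files` and `KNOWN_WEIGHT_FILES` are hypothetical names introduced for illustration, and the list of file names is an assumption about what transformers and peft commonly write (single-file, sharded index, and adapter variants), not an exhaustive list.

```python
import os
import tempfile

# Assumption: common weight/index file names written by transformers and peft.
# Which ones appear depends on the save settings and on whether the model
# is large enough to be split into shards.
KNOWN_WEIGHT_FILES = (
    "pytorch_model.bin",
    "pytorch_model.bin.index.json",
    "model.safetensors",
    "model.safetensors.index.json",
    "adapter_model.bin",
    "adapter_model.safetensors",
)


def list_weight_files(checkpoint_dir):
    """Return the known weight/index files present in checkpoint_dir, sorted."""
    present = set(os.listdir(checkpoint_dir))
    return sorted(name for name in KNOWN_WEIGHT_FILES if name in present)


# Demo on a throwaway directory that mimics a safetensors-only checkpoint.
with tempfile.TemporaryDirectory() as demo_dir:
    open(os.path.join(demo_dir, "model.safetensors"), "wb").close()
    open(os.path.join(demo_dir, "config.json"), "w").close()
    demo_result = list_weight_files(demo_dir)

print(demo_result)
```

Pointing it at one of the `checkpoint-<step>` folders under `args.output_dir` would show whether the Trainer wrote a single `pytorch_model.bin`, a sharded set with an index file, or safetensors files instead.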
yaolu-zjut changed the title from "Issue resolved. The problem is that when constructing the trainer, save_safetensors=False should be set. Otherwise, the above safe_serialization=False will not work." to "No such file or directory: pytorch_model.bin" on Aug 29, 2024.
https://huggingface.co/docs/transformers/v4.36.1/en/main_classes/trainer#transformers.TrainingArguments.save_safetensors
Originally posted by @WilliamYi96 in #45 (comment)