Restart a training job without the file trainer.pth #299
Hello, thank you in advance for any help.
Hi @LucaBrugnoli, what files do you still have? Your original configuration YAML file?
Basically, with your original YAML config (assuming you set seeds) you can deterministically reconstruct your training and validation sets, so all you have to do is make a copy and add the `initialize_from_state` model builder to start this new training session from the weights you still have in `best_model.pth`. (See #235, #243, #205 for examples.)

You can also check your log/wandb for learning rate drops, and make sure to start at the learning rate you stopped at.

This won't give a perfect restart like `trainer.pth` would, but should still be enough to get reasonable results, unless I've forgotten something important...
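
In case it helps, here is a rough sketch of the "copy the config and add the builder" step, written against a plain Python dict standing in for the loaded YAML. The helper `make_restart_config` is hypothetical, and the key names `model_builders`, `initial_model_state`, `learning_rate` and `run_name` are assumptions on my part; please check them against the linked discussions and your installed version before relying on them. The builder is appended last on the assumption that weights should be loaded only after the model has been built.

```python
import copy


def make_restart_config(original, weights_path, last_lr,
                        builder="initialize_from_state"):
    """Return a modified copy of `original` that starts from saved weights.

    `original` is the config as a plain dict. All key names used below
    are assumptions, not a confirmed description of the real config schema.
    """
    if "model_builders" not in original:
        # If the original config relied on default builders, they would
        # need to be listed explicitly before appending a new one.
        raise ValueError("config has no explicit 'model_builders' list")

    cfg = copy.deepcopy(original)  # leave the original config untouched

    # Append the weight-loading builder after the existing ones.
    if builder not in cfg["model_builders"]:
        cfg["model_builders"].append(builder)

    # Assumed key: path to the weights that survived (e.g. best_model.pth).
    cfg["initial_model_state"] = weights_path

    # Start from the learning rate the old run had reached, as read
    # from the log or wandb, rather than the initial value.
    cfg["learning_rate"] = last_lr

    # Give the new session its own name so it does not clash with the old run.
    cfg["run_name"] = f"{original.get('run_name', 'run')}-restart"
    return cfg


if __name__ == "__main__":
    old = {
        "run_name": "example",
        "seed": 123,
        "learning_rate": 0.005,
        "model_builders": ["SomeModelBuilder"],  # placeholder name
    }
    print(make_restart_config(old, "best_model.pth", last_lr=0.001))
```

Seeds and dataset options are copied unchanged, which is what lets the training and validation split be reconstructed. Optimizer and scheduler state are not recovered this way, which is one reason this is not a perfect restart.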