
Changing the path when resuming from a checkpoint that differs from the one in train.json. #108

Open
azrahello opened this issue Dec 26, 2024 · 2 comments
Labels
bug Something isn't working good first issue Good for newcomers

Comments

@azrahello
Contributor

First of all, I want to wish you a Merry Christmas and thank you for this wonderful Christmas gift! Amazing work! I noticed that when resuming training from a checkpoint, the run continues in the directory specified in train.json. That is not necessarily the correct one, since the actual output directory may be directory_+_timestamp.
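
For illustration, here is a minimal sketch of one way a resumed run could pick its output directory from the checkpoint location instead of from train.json. The function name `resolve_output_dir`, its parameters, and the example paths are all hypothetical; the only thing taken from the report is that the real directory may be the configured name followed by `_` and a timestamp. It is not the project's actual code.

```python
from pathlib import Path
from typing import Optional


def resolve_output_dir(configured_dir: str, checkpoint_path: Optional[str] = None) -> Path:
    """Return the directory a run should write to (hypothetical helper).

    Assumption: the checkpoint file lives somewhere inside the run directory,
    and that directory is named either exactly like the configured one or
    like "<configured name>_<timestamp>".
    """
    configured = Path(configured_dir)
    if checkpoint_path is None:
        # Fresh run: nothing to infer, use the configured directory.
        return configured

    base = configured.name
    if not base:
        return configured

    # Walk up from the checkpoint file and take the nearest matching ancestor.
    for parent in Path(checkpoint_path).parents:
        if parent.name == base or parent.name.startswith(base + "_"):
            return parent

    # No ancestor matches: fall back to the configured directory.
    return configured
```

With a configured directory of `out/train` and a checkpoint at `out/train_20241226_101500/checkpoints/0001.zip` (both made up), this returns `out/train_20241226_101500`, so the resumed run keeps writing next to the checkpoint it was loaded from.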

@filipstrand
Owner

Thank you, and happy Christmas to you too :)

Thanks for pointing this out, it should be fixed.

Another small bug I noted is that the duration in checkpoint.json can be wrong when training is started and resumed with some hours between runs, for example:

  1. Train for 1h
  2. Pause for 1h
  3. Train for 1h more

This will report a 3h duration, when it should be 2h.

This happens because we only log the start time. It should probably work more like a stopwatch, since training can start and stop at arbitrary times.
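
A rough sketch of the stopwatch idea, to make the 1h / 1h / 1h example concrete. The class name `TrainingStopwatch`, the `accumulated_seconds` attribute, and the injectable `clock` are assumptions made for illustration; how the value would be stored in checkpoint.json is not shown in this thread, so that part is left out.

```python
import time
from typing import Callable, Optional


class TrainingStopwatch:
    """Accumulates only the time spent between start() and stop() (hypothetical)."""

    def __init__(self, accumulated_seconds: float = 0.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        # Seconds from earlier runs; on resume this would be restored from the checkpoint.
        self.accumulated_seconds = accumulated_seconds
        self._clock = clock
        self._started_at: Optional[float] = None

    def start(self) -> None:
        if self._started_at is None:
            self._started_at = self._clock()

    def stop(self) -> None:
        if self._started_at is not None:
            self.accumulated_seconds += self._clock() - self._started_at
            self._started_at = None

    def elapsed(self) -> float:
        """Total training seconds, including the current run if one is active."""
        running = 0.0 if self._started_at is None else self._clock() - self._started_at
        return self.accumulated_seconds + running


def simulate_train_pause_train() -> float:
    """Replay the scenario above with a fake clock and return the hours recorded."""
    now = [0.0]
    stopwatch = TrainingStopwatch(clock=lambda: now[0])

    stopwatch.start()
    now[0] += 3600  # 1. train for 1h
    stopwatch.stop()

    now[0] += 3600  # 2. pause for 1h (not counted)

    stopwatch.start()
    now[0] += 3600  # 3. train for 1h more
    stopwatch.stop()

    return stopwatch.elapsed() / 3600


print(simulate_train_pause_train())  # 2.0
```

Because a monotonic clock is only meaningful within one process, a resumed run would pass the saved `accumulated_seconds` back into the constructor rather than a saved start timestamp.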

@azrahello
Contributor Author

Yes, I noticed it too. I started a training session in the evening and resumed it the next day, and the reported duration did indeed include the overnight pause. I didn’t perceive it as an “issue.” In my fork, I shared the file plotter.py. If you feel like it, take a look; it’s my first attempt at Python. I relied heavily on AIs for help, but I find it “fun.”

@filipstrand filipstrand added good first issue Good for newcomers bug Something isn't working labels Dec 27, 2024