allow continuation of model training #386

Merged
merged 2 commits into from
Jun 5, 2024

Conversation

benjijamorris
Contributor

What does this PR do?

  • Allow passing a top-level weights_only config argument to resume training from model weights only (ignoring Lightning optimizer metadata, etc.)
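
As a rough illustration of what "weights only" means here: a Lightning-style checkpoint is a mapping that holds the model weights under "state_dict" alongside training metadata such as the epoch counter and optimizer state. The sketch below keeps only the weights. The helper name extract_model_weights and the stand-in checkpoint dict are hypothetical and not part of this PR; in real use the mapping would come from torch.load(ckpt_path) and the values would be tensors.

```python
from typing import Any, Dict, Mapping


def extract_model_weights(checkpoint: Mapping[str, Any]) -> Dict[str, Any]:
    """Return only the model weights from a Lightning-style checkpoint mapping.

    Hypothetical helper for illustration; everything except "state_dict"
    (epoch, global step, optimizer state, ...) is deliberately dropped.
    """
    if "state_dict" not in checkpoint:
        raise KeyError("checkpoint has no 'state_dict' entry")
    return dict(checkpoint["state_dict"])


# Stand-in for the result of torch.load(ckpt_path); the values are made up.
checkpoint = {
    "epoch": 99,
    "global_step": 12345,
    "optimizer_states": [{"lr": 1e-4}],
    "state_dict": {"backbone.weight": [0.1, 0.2], "backbone.bias": [0.0]},
}

weights = extract_model_weights(checkpoint)
# With a real model, the next step would be model.load_state_dict(weights).
```

Because the stored epoch counter is never restored, a subsequent fit would presumably start counting from epoch 0, which is how this approach sidesteps the trainer.max_epochs limit mentioned in the diff comment.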

Before submitting

  • Did you make sure title is self-explanatory and the description concisely explains the PR?
  • Did you make sure your PR does only one thing, instead of bundling different changes together?
  • Did you list all the breaking changes introduced by this pull request?
  • Did you test your PR locally with pytest command?
  • Did you run pre-commit hooks with pre-commit run -a command?

Did you have fun?

Make sure you had fun coding 🙃

cyto_dl/train.py Outdated
if cfg.get("weights_only"):
    # load model from state dict to get around trainer.max_epochs limit, useful for resuming model training from existing weights
    ckpt_path = cfg.get("ckpt_path")
    state_dict = torch.load(ckpt_path)["state_dict"]

Is there any logic validating that the other keys exist when weights_only is set?
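
One way such validation might look, as a minimal sketch rather than the code in this PR: check that ckpt_path is present whenever weights_only is set, and decide up front which path is used for manual weight loading and which is handed to the trainer. The function name resolve_checkpoint_args and its return convention are assumptions made for illustration; only the weights_only and ckpt_path keys come from the diff.

```python
from typing import Any, Mapping, Optional, Tuple


def resolve_checkpoint_args(
    cfg: Mapping[str, Any],
) -> Tuple[Optional[str], Optional[str]]:
    """Return (weights_path, trainer_ckpt_path) for a training run.

    Hypothetical helper: exactly one of the two values is non-None when a
    checkpoint is configured, and both are None when training from scratch.
    """
    ckpt_path = cfg.get("ckpt_path")
    if cfg.get("weights_only"):
        if not ckpt_path:
            raise ValueError("weights_only=True requires ckpt_path to be set")
        # Weights are loaded manually; the trainer gets no checkpoint to restore.
        return ckpt_path, None
    # Default: the trainer restores the full training state (or starts fresh).
    return None, ckpt_path
```

Raising ValueError instead of using assert keeps the check active when Python runs with -O, which strips assert statements.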

@@ -97,9 +97,11 @@ def train(cfg: DictConfig) -> Tuple[dict, dict]:
    log.info("Starting training!")

    if cfg.get("weights_only"):
        assert cfg.get(
            "ckpt_path"

Nice!

Any unit tests that exercise this?

Contributor Author

Manual testing only at the moment. I'm reluctant to add more unit tests when we're already timing out on Windows.

@benjijamorris benjijamorris merged commit 4e8677d into main Jun 5, 2024
1 of 6 checks passed
@benjijamorris benjijamorris deleted the feature/weight_only_loading branch June 5, 2024 16:10