Refactor checkpoint logic #3302

Merged
merged 10 commits into from
Apr 15, 2024

Conversation

vinnamkim
Contributor

@vinnamkim vinnamkim commented Apr 11, 2024

Summary

  • Ticket no. 138353

  • The main objective of this PR is to change how model checkpoints are saved and loaded so that the logic follows the PyTorch Lightning recommendations more closely. We need to customize the save and load hooks to package our metadata objects (label_info, otx_version, ...). For more information, refer to the following PyTorch Lightning documentation:

    1. on_save_checkpoint
    2. on_load_checkpoint
  • Another motivation for this change is to prevent errors during PyTorch tracing. If we manually include an arbitrary object that is not an nn.Parameter in nn.Module.state_dict(), it triggers an error when executing torch.onnx.export.

  • Additionally, this PR revisits the Engine pipelines to address the following TODO comments in the code:

    # TODO (vinnamki): This should be changed to raise an error if not equivalent in case of test
    # raise ValueError()
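
The hook-based approach described in the summary can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `LabelInfo`, `ModelWithMetadata`, the checkpoint key names, and the version string are all hypothetical. The two methods only mirror the names and signatures of the `on_save_checkpoint` and `on_load_checkpoint` hooks of a LightningModule, so the sketch runs without torch or lightning installed.

```python
from dataclasses import asdict, dataclass
from typing import Any


@dataclass
class LabelInfo:
    """Hypothetical stand-in for the project's label metadata object."""

    label_names: list[str]


class ModelWithMetadata:
    """Hypothetical model exposing Lightning-style checkpoint hooks.

    In a real LightningModule the Trainer calls these hooks; here they are
    plain methods that are invoked by hand below.
    """

    def __init__(self, label_info: LabelInfo, otx_version: str) -> None:
        self.label_info = label_info
        self.otx_version = otx_version

    def on_save_checkpoint(self, checkpoint: dict[str, Any]) -> None:
        # Store metadata as top-level checkpoint entries, next to
        # "state_dict", so that state_dict itself stays free of
        # arbitrary objects.
        checkpoint["label_info"] = asdict(self.label_info)
        checkpoint["otx_version"] = self.otx_version

    def on_load_checkpoint(self, checkpoint: dict[str, Any]) -> None:
        # Restore the metadata from the same top-level entries.
        self.label_info = LabelInfo(**checkpoint["label_info"])
        self.otx_version = checkpoint["otx_version"]


# Simulate a save followed by a load into a fresh object.
saved: dict[str, Any] = {"state_dict": {}}
ModelWithMetadata(LabelInfo(["cat", "dog"]), "2.0.0").on_save_checkpoint(saved)

restored = ModelWithMetadata(LabelInfo([]), "unknown")
restored.on_load_checkpoint(saved)
```

Because the metadata sits beside `state_dict` rather than inside it, the weights dictionary is left untouched, which is the property the summary ties to avoiding tracing errors.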

How to test

The existing tests were also updated to match this change.

Checklist

  • I have added unit tests to cover my changes.
  • I have added integration tests to cover my changes.
  • I have added e2e tests for validation.
  • I have added the description of my changes into CHANGELOG in my target branch (e.g., CHANGELOG in develop).
  • I have updated the documentation in my target branch accordingly (e.g., documentation in develop).
  • I have linked related issues.

License

  • I submit my code changes under the same Apache License that covers the project.
    Feel free to contact the maintainers if that's a concern.
  • I have updated the license header for each file (see an example below).
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

@github-actions github-actions bot added the TEST (Any changes in tests) and OTX 2.0 labels Apr 11, 2024
Signed-off-by: Kim, Vinnam <[email protected]>
@vinnamkim vinnamkim force-pushed the refactor-checkpoint-logic branch from 051caa9 to c96258e April 15, 2024 02:12
Signed-off-by: Kim, Vinnam <[email protected]>
@vinnamkim vinnamkim merged commit fb69fcb into openvinotoolkit:develop Apr 15, 2024
13 checks passed