
Clarification and Potential Issue with EarlyStopping Mechanism #26

Open
Zhu-Luyu opened this issue Apr 4, 2024 · 1 comment

Comments

Zhu-Luyu commented Apr 4, 2024

I have two concerns regarding the implementation of the EarlyStopping mechanism in your project:

  1. Adjustment of the delta value after reducing the learning rate: After the patience threshold is reached, the learning rate is reduced to one-tenth of its original value, and the delta value for the EarlyStopping mechanism is changed from -0.001 to -0.002. Could you clarify the rationale for making delta more stringent (-0.002) after reducing the learning rate? This change requires the model to show a larger improvement than before to avoid being judged as having "limited improvement", yet a smaller learning rate typically yields more conservative gains in accuracy.

  2. Potential issue with resetting self.score_max when EarlyStopping is re-instantiated: When train.py executes early_stopping = EarlyStopping(patience=opt.earlystop_epoch, delta=-0.002, verbose=True), self.score_max is reset to -np.Inf. As a result, the "best" weights saved after re-instantiation may not actually be better than the weights saved before, since the previous best score is no longer retained. Shouldn't self.score_max be preserved across re-instantiations so that only genuinely better model states are saved? This looks like a bug, as it contradicts the purpose of tracking the best model performance across training epochs.

Looking forward to your insights on these points.
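For reference, the concern in point 2 can be illustrated with a minimal sketch of an EarlyStopping tracker. The attribute names (score_max, delta) follow this issue; the repo's actual implementation may differ, and the best_score constructor parameter is a hypothetical addition showing one way to carry the running best across re-instantiations:

```python
class EarlyStopping:
    """Minimal sketch of an EarlyStopping tracker (not the repo's exact code).

    delta is negative, as in the issue: a new score counts as "improving"
    whenever it exceeds score_max + delta, so small drops are tolerated.
    """

    def __init__(self, patience=5, delta=-0.001, best_score=None, verbose=False):
        self.patience = patience
        self.delta = delta
        self.verbose = verbose
        self.counter = 0
        self.early_stop = False
        # Re-instantiation resets the running best to -inf unless the caller
        # passes the previous best in -- this is the behavior questioned above.
        self.score_max = float("-inf") if best_score is None else best_score

    def __call__(self, score):
        """Return True when `score` is a genuine new best (safe to checkpoint)."""
        if score > self.score_max + self.delta:
            self.counter = 0                      # counted as improvement
        else:
            self.counter += 1
            if self.counter >= self.patience:
                self.early_stop = True            # patience exhausted
        if score > self.score_max:
            self.score_max = score
            return True
        return False
```

With this shape, re-instantiating with best_score set to the previous object's score_max would save only genuinely better weights, while omitting it reproduces the reset described in point 2.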

@PeterWang512
Owner

Thanks for the questions!

  1. I didn't try extensively with the delta values, so feel free to adjust them to improve your results.
  2. Yes, the edge case you mention could happen, but in my experience it did not drastically affect model performance.

I hope this helps!
