Huge bumps in learning curves #74

Answered by tbraeckevelt
tbraeckevelt asked this question in Q&A

Thanks for the answer.

I tested your suggestions: learning rates (LR) of 0.005 and 0.001, and batch sizes (BS) of 10 and 15.

I came to the same conclusions as you did:

  • Increasing the batch size did nothing to resolve the issue; it only made the training slower.
  • Decreasing the learning rate did lower (or remove) those peaks, but this appears to be a trade-off: LR=0.001 really slowed down the training. LR=0.005 led to the lowest MAE, but it did still show a peak, although not too high (it only lost two hours getting back to the MAE it had reached before the peak). LR=0.01 is a bit faster, but also a gamble that you won't encounter such a huge peak (see the sketch below).
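For reference, here is a minimal, generic PyTorch sketch of this trade-off (this is not the actual training setup used here; the model, data, and epoch count are made up for illustration). It compares a few learning rates and quantifies the height of a peak as the rise in MAE above the best value seen so far in the run:

```python
# Hypothetical toy experiment: compare learning rates on a small regression
# problem and measure how far the training MAE ever climbs back above its
# running best (the "peak height" discussed above).
import torch
import torch.nn as nn

torch.manual_seed(0)
X = torch.randn(512, 8)
y = X @ torch.randn(8, 1) + 0.1 * torch.randn(512, 1)

def train(lr, epochs=200):
    model = nn.Sequential(nn.Linear(8, 32), nn.Tanh(), nn.Linear(32, 1))
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    mae = nn.L1Loss()  # L1 loss == mean absolute error
    history = []
    for _ in range(epochs):
        opt.zero_grad()
        loss = mae(model(X), y)
        loss.backward()
        opt.step()
        history.append(loss.item())
    return history

for lr in (0.01, 0.005, 0.001):
    hist = train(lr)
    best, peak = float("inf"), 0.0
    for h in hist:
        peak = max(peak, h - best)  # rise above the best MAE so far
        best = min(best, h)
    print(f"LR={lr}: final MAE={hist[-1]:.4f}, largest peak={peak:.4f}")
```

On a toy problem like this the peaks are tiny, but the same bookkeeping applies to a real learning curve when weighing a faster LR against the risk of a large bump.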

I will do some more tests on other systems, but for now I will us…

Replies: 5 comments · 1 reply

Answer selected by Linux-cpp-lisp
Category: Q&A
Labels: question (Further information is requested)
3 participants
This discussion was converted from issue #73 on August 24, 2021 15:33.