Given the first part of a sine wave, let the model generate the rest of the wave.
A Keras LSTM model is used to learn raw sine waves.
Subtle changes to `n_steps` and `hid_dim` alter the fit of the model greatly -> must find a balance between bias (underfitting) and variance (overfitting).
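A minimal sketch of this kind of setup, assuming a single-LSTM Keras model; `n_steps` and `hid_dim` come from the notes, while the wave length, epoch count, and data pipeline are illustrative choices:

```python
import numpy as np
from tensorflow import keras

n_steps, hid_dim = 50, 100            # the two knobs the notes discuss

# Raw sine wave, windowed into (n_steps inputs -> next value) pairs
wave = np.sin(np.linspace(0, 20 * np.pi, 2000))
X = np.stack([wave[i:i + n_steps] for i in range(len(wave) - n_steps)])[..., None]
y = wave[n_steps:]

model = keras.Sequential([
    keras.Input(shape=(n_steps, 1)),
    keras.layers.LSTM(hid_dim),
    keras.layers.Dense(1),
])
model.compile(optimizer="adam", loss="mse")
model.fit(X, y, epochs=10, batch_size=32, verbose=0)

# Generate the rest of the wave from a seed window, one step at a time
window = list(wave[:n_steps])
generated = []
for _ in range(300):
    pred = model.predict(np.array(window[-n_steps:])[None, :, None], verbose=0)
    generated.append(float(pred[0, 0]))
    window.append(generated[-1])
```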
Always start with a simpler problem: a PyTorch LSTM model to learn a simple number sequence of length 3.
| sequence | label |
|---|---|
| [10, 20, 30] | 40 |
| [20, 30, 40] | 50 |
| [30, 40, 50] | 60 |
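A minimal PyTorch sketch of this toy setup, using the single-LSTM + single-FC architecture discussed next; the class name `LSTM_1FC`, the input scaling, and the hyperparameter values are illustrative assumptions:

```python
import torch
import torch.nn as nn

class LSTM_1FC(nn.Module):
    """Single LSTM layer followed by a single fully connected layer."""
    def __init__(self, hid_dim):
        super().__init__()
        self.lstm = nn.LSTM(input_size=1, hidden_size=hid_dim, batch_first=True)
        self.fc = nn.Linear(hid_dim, 1)

    def forward(self, x):               # x: (batch, seq_len, 1)
        out, _ = self.lstm(x)
        return self.fc(out[:, -1])      # predict from the last timestep

# Toy data from the table above, scaled down (an assumed preprocessing step)
X = torch.tensor([[10, 20, 30], [20, 30, 40], [30, 40, 50]], dtype=torch.float32) / 100
y = torch.tensor([[40.0], [50.0], [60.0]]) / 100

model = LSTM_1FC(hid_dim=50)
opt = torch.optim.Adam(model.parameters(), lr=1e-2)
loss_fn = nn.MSELoss()

for epoch in range(1000):
    opt.zero_grad()
    loss = loss_fn(model(X.unsqueeze(-1)), y)
    loss.backward()
    opt.step()

print(model(X.unsqueeze(-1)) * 100)   # should approach [[40], [50], [60]]
```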
A model with a single LSTM layer + a single FC layer seems to underfit;
the bigger `hid_dim`, the more accurate the sequence generation (1000 epochs).
Around `hid_dim` 200 (left), generation accuracy reaches a tolerable level; around 500 (right), the model may start to overfit.
-> add one more FC layer to increase complexity? (`LSTM_2FC`)
That doesn't seem to work well (predictions flatten out or show huge bias). Maybe another FC layer is overkill for learning such simple sequence data?
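For reference, a sketch of what the `LSTM_2FC` variant might look like; the notes don't spell out the architecture, so this is one plausible reading (the hidden FC width and the ReLU in between are assumptions):

```python
import torch
import torch.nn as nn

class LSTM_2FC(nn.Module):
    """Single LSTM layer followed by two FC layers (one plausible reading)."""
    def __init__(self, hid_dim):
        super().__init__()
        self.lstm = nn.LSTM(input_size=1, hidden_size=hid_dim, batch_first=True)
        self.fc1 = nn.Linear(hid_dim, hid_dim)
        self.fc2 = nn.Linear(hid_dim, 1)

    def forward(self, x):                   # x: (batch, seq_len, 1)
        out, _ = self.lstm(x)
        return self.fc2(torch.relu(self.fc1(out[:, -1])))
```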
- Using the simple LSTM model, train on a raw sine wave (200 epochs; grid layout over `hid_dim`). Data prep is sketched below.
Prediction is pretty accurate.
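The sine wave can be windowed into fixed-length sequences the same way as the toy data; a sketch (the window length `n_steps = 25` and the sampling density are assumptions):

```python
import numpy as np
import torch

n_steps = 25
wave = np.sin(np.linspace(0, 8 * np.pi, 1000))

# Window the wave into (n_steps inputs -> next value) pairs
X = torch.tensor(
    np.stack([wave[i:i + n_steps] for i in range(len(wave) - n_steps)]),
    dtype=torch.float32,
).unsqueeze(-1)                        # (samples, n_steps, 1)
y = torch.tensor(wave[n_steps:], dtype=torch.float32).unsqueeze(-1)
```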
- Now, train on a sine wave with Gaussian noise
(figure: training data)
A less accurate sine wave is generated now (single experiment: 200 epochs, `hid_dim` 50).
Perhaps a more complex model could learn a more complete sine wave from the noise? → Not really. Now `hid_dim` 20 learns best.
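Adding the noise to the training wave is a one-liner; the scale 0.1 is an assumption, since the notes don't give it:

```python
# Gaussian noise on top of the clean wave (std 0.1 is an assumed value)
noisy_wave = wave + np.random.normal(0.0, 0.1, size=wave.shape)
```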
- Find the appropriate model capacity
  - It changes every time according to the type of data
  - Use a grid layout for the initial search, then hand-tune (see the sketch below)
  - Train for enough epochs to drive the loss as low as possible
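A sketch of the grid-style initial search, reusing `LSTM_1FC` and the windowed `X`, `y` from the sketches above (the capacity grid and epoch count are illustrative):

```python
for hid_dim in [10, 20, 50, 100, 200, 500]:
    model = LSTM_1FC(hid_dim)
    opt = torch.optim.Adam(model.parameters(), lr=1e-2)
    for epoch in range(200):
        opt.zero_grad()
        loss = nn.MSELoss()(model(X), y)
        loss.backward()
        opt.step()
    print(f"hid_dim={hid_dim:4d}  final loss={loss.item():.6f}")
```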
- Using SGD gave a flat, non-converging result (underfitting, perhaps) -> switched to Adam to make the model converge.
The optimizer matters!
- Additional hyperparameters can be tuned in a similar way (learning rate, regularization, ...)
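The optimizer swap itself is a one-line change (the learning rates here are illustrative):

```python
# Flattened out in these experiments (likely underfitting with this setup):
opt = torch.optim.SGD(model.parameters(), lr=1e-2)
# Converged:
opt = torch.optim.Adam(model.parameters(), lr=1e-2)
```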