Missing batch_first attribute for LSTM model. #1
Comments
wow! As you said, it is astonishing haha :)
Cool. I am fixing it as well. Will try for a PR in a couple of days.
good :) I will wait for your PR. I think your contribution is more valuable than my fix.
When checking the batch sizes of the input to the LSTM's forward method, the batch size changes with different inputs. This would break the code in the forward method. Is there any way to make the batch size consistent in the input? Also, could you let me know the sources of inspiration for this code? That might help in fixing the issue quicker.
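One way to keep `forward` working when the incoming batch size varies is to derive the hidden/cell state shapes from the input tensor itself instead of from a global `BATCH_SIZE`. A minimal sketch, not the repository's code; the `LSTM_MEMORY` value and layer sizes below are placeholders:

```python
import torch
import torch.nn as nn

LSTM_MEMORY = 128  # hypothetical value, for illustration only

class Net(nn.Module):
    def __init__(self):
        super().__init__()
        # batch_first=True so the input is (batch, seq_len, input_size)
        self.lstm = nn.LSTM(16, LSTM_MEMORY, 1, batch_first=True)

    def forward(self, x):
        # derive the batch size from the input instead of a fixed global
        batch_size = x.size(0)
        h0 = torch.zeros(1, batch_size, LSTM_MEMORY, device=x.device)
        c0 = torch.zeros(1, batch_size, LSTM_MEMORY, device=x.device)
        out, (hn, cn) = self.lstm(x, (h0, c0))
        return out

# works for any batch size
net = Net()
print(net(torch.randn(32, 64, 16)).shape)  # torch.Size([32, 64, 128])
print(net(torch.randn(8, 64, 16)).shape)   # torch.Size([8, 64, 128])
```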
Another issue that would need to be looked at is the hidden/cell state semantics during training.
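For reference, `batch_first=True` only changes the layout of the input and output tensors; the hidden and cell states keep the `(num_layers, batch, hidden_size)` layout either way. A minimal sketch with placeholder sizes (not the repository's values):

```python
import torch
import torch.nn as nn

# hypothetical sizes, for illustration only
lstm = nn.LSTM(input_size=16, hidden_size=128, num_layers=1, batch_first=True)
x = torch.randn(32, 64, 16)          # (batch, seq_len, input_size)
out, (h_n, c_n) = lstm(x)

print(out.shape)  # torch.Size([32, 64, 128]) -> (batch, seq_len, hidden_size)
print(h_n.shape)  # torch.Size([1, 32, 128])  -> (num_layers, batch, hidden_size), not batch first
print(c_n.shape)  # torch.Size([1, 32, 128])
```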
By default, `BATCH_SIZE = 32`. The input to the LSTM from the CNN is of the shape `(32, 64, 16)`. The semantics of the LSTM input are `(seq_len, batch_size, input_size)`, but the input format here is `(batch_size, seq_len, input_size)`. To correct it, `batch_first=True` needs to be passed while creating the LSTM model: `self.lstm = nn.LSTM(16, LSTM_MEMORY, 1, batch_first=True)`.
The astonishing part is that the model is still learning with this error.
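The reason it still runs without error is that, without `batch_first`, the LSTM simply reinterprets the same tensor as `seq_len=32, batch=64`, which happens to produce an output of the same overall shape. A minimal sketch with a placeholder `LSTM_MEMORY` value, for illustration only:

```python
import torch
import torch.nn as nn

LSTM_MEMORY = 128  # hypothetical value, for illustration only
x = torch.randn(32, 64, 16)  # CNN output: (BATCH_SIZE, seq_len, input_size)

# Without batch_first the LSTM silently reads this as seq_len=32, batch=64:
wrong = nn.LSTM(16, LSTM_MEMORY, 1)
out_wrong, _ = wrong(x)
print(out_wrong.shape)  # torch.Size([32, 64, 128]) -- same shape, wrong semantics

# With batch_first=True the dimensions are interpreted as intended:
fixed = nn.LSTM(16, LSTM_MEMORY, 1, batch_first=True)
out_fixed, _ = fixed(x)
print(out_fixed.shape)  # torch.Size([32, 64, 128])
```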