Shape Mismatch Error in LSTM Forward Pass (RuntimeError: output with shape [1, 512] doesn't match the broadcast shape [1, 1, 512]) #55

LOPES3000 opened this issue Oct 23, 2024

I'm encountering a shape mismatch error when trying to use the OpenLSTML4casadi model within the RealTimeL4CasADi wrapper. The error occurs when calling the forward method of the LSTM, which is part of the OpenLSTML4casadi class. Despite reshaping inputs and initializing the hidden/cell states according to the LSTM's requirements, the error persists.

Error Traceback:

RuntimeError: output with shape [1, 512] doesn't match the broadcast shape [1, 1, 512]

Here’s the relevant part of the stack trace:

File "l4casadi/realtime/realtime_l4casadi.py", line 75, in get_params
    params = self._get_params(a_t)
File "l4casadi/realtime/realtime_l4casadi.py", line 66, in _get_params
    df_a, f_a = batched_jacobian(self.model, a_t, return_func_output=True)
File "l4casadi/realtime/sensitivities.py", line 43, in batched_jacobian
    return functorch.vmap(functorch.jacrev(aux_function(func), has_aux=True), randomness=vmap_randomness)(inputs[:, None])
File "torch/_functorch/vmap.py", line 434, in wrapped
    return _flat_vmap(func, batch_size, flat_in_dims, flat_args, args_spec, out_dims, randomness, **kwargs)
File "torch/_functorch/vmap.py", line 619, in _flat_vmap
    batched_outputs = func(*batched_inputs, **kwargs)
File "torch/_functorch/eager_transforms.py", line 291, in _vjp_with_argnums
    primals_out = func(*primals)
File "l4casadi/realtime/sensitivities.py", line 13, in aux_function.<locals>.inner_aux
    out = func(inputs)
File "torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
File "open_lstm_l4casadi.py", line 53, in forward
    y_sim, _ = self.model(u_train, state)
File "torch/nn/modules/rnn.py", line 812, in forward
    result = _VF.lstm(input, hx, self._flat_weights, self.bias, self.num_layers, self.dropout, self.training, self.bidirectional, self.batch_first)

Steps to Reproduce:

  1. Initialize the OpenLSTML4casadi model with the following parameters:
    • n_context = 1
    • n_inputs = 1
    • sequence_length = 1
    • batch_size = 1
  2. Wrap the model using RealTimeL4CasADi for CasADi integration.
  3. Call the get_params method with the following input (a sketch of all three steps follows below):
    casadi_param = model_l4c.get_params(np.ones((n_inputs, batch_size * sequence_length)))
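
Roughly, the full reproduction looks like this (a sketch rather than the exact script; the OpenLSTML4casadi constructor arguments and the l4c.realtime import path are taken from my project layout and the L4CasADi examples, so treat them as assumptions):

    import numpy as np
    import l4casadi as l4c
    from open_lstm_l4casadi import OpenLSTML4casadi  # wrapper module from my repo

    n_context = 1
    n_inputs = 1
    sequence_length = 1
    batch_size = 1

    # 1. Build the LSTM wrapper (a torch.nn.Module); constructor signature assumed
    model = OpenLSTML4casadi(n_context=n_context, n_inputs=n_inputs,
                             sequence_length=sequence_length, batch_size=batch_size)

    # 2. Wrap it for real-time CasADi integration
    model_l4c = l4c.realtime.RealTimeL4CasADi(model)

    # 3. Query the parameters -- this is the call that raises the RuntimeError
    casadi_param = model_l4c.get_params(np.ones((n_inputs, batch_size * sequence_length)))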

What I’ve Tried:

  • I ensured that the input tensor (u_train) is reshaped to [sequence_length, batch_size, input_size] before being passed to the LSTM.
  • I initialized the hidden and cell states (hn and cn) with the correct shape: [num_layers, batch_size, hidden_size] for cn and [num_layers, batch_size, proj_size] for hn.
  • Despite this, the error persists when calling the LSTM's forward pass (a simplified standalone version of this setup is shown below).
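
A simplified, standalone version of that setup (plain PyTorch; hidden_size = 512 is an assumption based on the 512 in the error message):

    import torch

    num_layers = 1
    hidden_size = 512   # assumed internal size; 512 matches the error message
    proj_size = 1
    sequence_length, batch_size, input_size = 1, 1, 1

    lstm = torch.nn.LSTM(input_size, hidden_size, num_layers, proj_size=proj_size)

    # input reshaped to [sequence_length, batch_size, input_size]
    u_train = torch.ones(batch_size * sequence_length, input_size)
    u_train = u_train.view(sequence_length, batch_size, input_size)

    # initial states: hn uses proj_size, cn uses hidden_size (per torch.nn.LSTM docs)
    hn = torch.zeros(num_layers, batch_size, proj_size)
    cn = torch.zeros(num_layers, batch_size, hidden_size)

    # this works on its own; the failure only appears under the batched Jacobian
    y_sim, (hn, cn) = lstm(u_train, (hn, cn))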

Possible Cause:

The issue may be related to how the LSTM is handling projected outputs (proj_size=1) and internal reshaping of tensors during the batched Jacobian calculation. The shape mismatch suggests that the LSTM is returning an output tensor with an unexpected shape, which doesn't match the expected broadcasting dimensions.
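For reference, this particular message is what PyTorch raises when an in-place broadcast receives a tensor with an extra leading singleton dimension, which fits the suspicion that an extra batch dimension is being introduced somewhere along the vmap path (a standalone illustration of the error class only, not the actual failing code path):

    import torch

    out = torch.zeros(1, 512)       # 2-D tensor, as the LSTM internals expect
    extra = torch.zeros(1, 1, 512)  # same data with an extra leading singleton dim

    # in-place broadcast fails with:
    # RuntimeError: output with shape [1, 512] doesn't match the broadcast shape [1, 1, 512]
    out += extra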

Expected Behavior:

The LSTM forward pass should work without shape mismatches, and the batched Jacobian should correctly handle the model's projected output when using RealTimeL4CasADi.

Environment:

  • Python version: 3.10.12
  • PyTorch version: 2.0.0+cpu
  • CasADi version: 3.6.6
  • OS: Windows 10 (WSL: Ubuntu-22.04)

Additional Context:

The full project code is available here: https://github.com/LOPES3000/RealtimeL4casadi_and_LSTM_NN/. This issue arises when integrating the LSTM model into the real-time CasADi wrapper for symbolic differentiation and optimization.

Request:

I’d appreciate any insights or suggestions on how to resolve this shape mismatch issue during the LSTM forward pass or how to modify the batched Jacobian calculation to account for projected LSTM outputs.

Thank you!

1376787 commented Feb 8, 2025

I got the same error when trying to call get_params on a model with an LSTM.
Did you resolve the issue?
