🐛[BUG]: Running corrdiff/generate.py raises a shape exception #542

Open
Tracked by #589
gideonite opened this issue May 31, 2024 · 3 comments
Labels: ? - Needs Triage (Need team to review and classify), bug (Something isn't working)

@gideonite

Version

latest

On which installation method(s) does this occur?

No response

Describe the issue

See log output below

Minimum reproducible example

No response

Relevant log output

Error executing job with overrides: ['dataset.data_path=/data/gideond/corrdiff_inference_package/dataset/2023-01-24-cwb-4years_5times.zarr', 'res_ckpt_filename=/data/gideond/corrdiff_inference_package/checkpoints/diffusion.mdlus', 'reg_ckpt_filename=/data/gideond/corrdiff_inference_package/checkpoints/regression.mdlus', 'seed_batch_size=5', 'use_torch_compile=false']
Traceback (most recent call last):
  File "/net/nfs.cirrascale/climate/gideond/home/projects/modulus/examples/generative/corrdiff/generate.py", line 310, in main
    generate_and_save(
  File "/net/nfs.cirrascale/climate/gideond/home/projects/modulus/examples/generative/corrdiff/generate.py", line 396, in generate_and_save
    image_out = generate_fn(image_lr)
                ^^^^^^^^^^^^^^^^^^^^^
  File "/net/nfs.cirrascale/climate/gideond/home/projects/modulus/examples/generative/corrdiff/generate.py", line 232, in generate_fn
    image_reg = generate(
                ^^^^^^^^^
  File "/net/nfs.cirrascale/climate/gideond/home/projects/modulus/examples/generative/corrdiff/generate.py", line 541, in generate
    images = sampler_fn(
             ^^^^^^^^^^^
  File "/net/nfs.cirrascale/climate/gideond/home/projects/modulus/examples/generative/corrdiff/generate.py", line 609, in unet_regression
    x_next = net(x_hat[0:1], x_lr, t_hat, class_labels).to(torch.float64)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/gideond/.conda/envs/corrdiff/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/gideond/.conda/envs/corrdiff/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/gideond/.conda/envs/corrdiff/lib/python3.12/site-packages/modulus/models/diffusion/unet.py", line 152, in forward
    F_x = self.model(
          ^^^^^^^^^^^
  File "/home/gideond/.conda/envs/corrdiff/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/gideond/.conda/envs/corrdiff/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/gideond/.conda/envs/corrdiff/lib/python3.12/site-packages/nvtx/nvtx.py", line 116, in inner
    result = func(*args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^
  File "/home/gideond/.conda/envs/corrdiff/lib/python3.12/site-packages/modulus/models/diffusion/song_unet.py", line 347, in forward
    x = block(x, emb) if isinstance(block, UNetBlock) else block(x)
                                                           ^^^^^^^^
  File "/home/gideond/.conda/envs/corrdiff/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/gideond/.conda/envs/corrdiff/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/gideond/.conda/envs/corrdiff/lib/python3.12/site-packages/modulus/models/diffusion/layers.py", line 224, in forward
    x = torch.nn.functional.conv2d(x, w, padding=w_pad)
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: Given groups=1, weight of size [128, 20, 3, 3], expected input[1, 16, 448, 448] to have 20 channels, but got 16 channels instead

Set the environment variable HYDRA_FULL_ERROR=1 for a complete stack trace.
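The key numbers in the RuntimeError above can be read off mechanically. The following is a minimal sketch, not part of generate.py: `parse_conv_error` is a hypothetical helper that pulls the weight and input shapes out of the message, relying on the PyTorch convention that conv2d weights are laid out as [out_channels, in_channels / groups, kH, kW] and inputs as [batch, channels, H, W].

```python
import re

# Error message copied verbatim from the log above.
MESSAGE = (
    "Given groups=1, weight of size [128, 20, 3, 3], "
    "expected input[1, 16, 448, 448] to have 20 channels, "
    "but got 16 channels instead"
)


def parse_conv_error(message):
    """Hypothetical helper: extract weight and input shapes from a
    conv2d channel-mismatch message."""
    weight = re.search(r"weight of size \[([\d, ]+)\]", message)
    inp = re.search(r"expected input\[([\d, ]+)\]", message)
    if weight is None or inp is None:
        raise ValueError("message does not look like a conv2d channel mismatch")
    weight_shape = [int(v) for v in weight.group(1).split(",")]
    input_shape = [int(v) for v in inp.group(1).split(",")]
    return weight_shape, input_shape


weight_shape, input_shape = parse_conv_error(MESSAGE)

# Index 1 is the channel dimension in both layouts (groups=1 here).
expected_channels = weight_shape[1]
actual_channels = input_shape[1]
missing = expected_channels - actual_channels

print(f"layer expects {expected_channels} channels, "
      f"input has {actual_channels}, short by {missing}")
```

In other words, the checkpoint's first convolution was built for 20 input channels, while the tensor assembled at inference time carries only 16.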

Environment details

No response

@gideonite added the "? - Needs Triage" and "bug" labels on May 31, 2024
@windsoryin

I also get the same error when using the CorrDiff Inference Package.

@windsoryin

This is caused by incorrect arguments in config_generate.yaml: the input channels [0, 1, 2, 3, 4, 9, 10, 11, 12, 17, 18, 19] don't match the 20 channels used by the pre-trained models. What's more, they also overlap with the output channels [0, 17, 18, 19].
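A quick way to see both problems is to check the two lists directly. This is only a sketch: the channel lists are the ones quoted in this comment, the expected count of 20 is taken from the weight shape in the error log, and `check_channels` is a hypothetical helper, not something that exists in the CorrDiff example.

```python
# Channel lists as quoted in the comment (from config_generate.yaml).
in_channels = [0, 1, 2, 3, 4, 9, 10, 11, 12, 17, 18, 19]
out_channels = [0, 17, 18, 19]

# Input-channel count the checkpoint expects, per the error log
# (weight of size [128, 20, 3, 3]).
EXPECTED_IN = 20


def check_channels(in_ch, out_ch, expected):
    """Hypothetical helper: summarise the channel count and any overlap
    between the input and output channel lists."""
    return {
        "n_in": len(in_ch),
        "expected": expected,
        "count_matches": len(in_ch) == expected,
        "overlap": sorted(set(in_ch) & set(out_ch)),
    }


report = check_channels(in_channels, out_channels, EXPECTED_IN)
print(report)
```

The list holds 12 input channels rather than 20, and every output channel index also appears in the input list.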

@zomosky

zomosky commented Nov 1, 2024

This bug may come from the N_grid_channels parameter around line 127 of train.py, which changes the input shape of the whole model. After I removed that parameter and commented out the reference to it at line 144 of train.py, the bug disappeared. The output channels also need to be changed to cfg.dataset.out_channels = [0, 1, 2, 3].
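The arithmetic behind this explanation can be sketched as follows. Everything here is an assumption made for illustration: `first_conv_in_channels` is a hypothetical helper, the idea that the output-sized tensor, the low-resolution input, and the grid channels are all stacked along the channel axis is not confirmed anywhere in this thread, and the grid-channel count of 4 is inferred from the gap in the error log (20 expected, 16 received) rather than read from train.py.

```python
def first_conv_in_channels(n_in, n_out, n_grid):
    """Hypothetical helper: channels reaching the first conv layer,
    assuming the output-sized tensor, the low-res input, and the grid
    channels are all stacked along the channel axis."""
    return n_out + n_in + n_grid


N_IN = 12   # length of the input channel list quoted earlier in the thread
N_OUT = 4   # length of the output channel list
N_GRID = 4  # inferred from the log (20 - 16), not read from train.py

with_grid = first_conv_in_channels(N_IN, N_OUT, N_GRID)
without_grid = first_conv_in_channels(N_IN, N_OUT, 0)

print(f"with grid channels: {with_grid}, without: {without_grid}")
```

Under these assumptions, a model built with the grid channels expects 20 input channels and one built without them expects 16, which matches the two numbers in the RuntimeError.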
