🐛 [Bug] Encountered bug when using Torch-TensorRT with a pytorch segmentation model #3254

Open
deo-abhijit opened this issue Oct 21, 2024 · 1 comment
Labels
bug Something isn't working

Comments

@deo-abhijit
Hi! I have been using torch_tensorrt to speed up PyTorch models and have been loving it, but I sometimes run into problems during conversion.

In this case, I was using the segmentation-models-pytorch (smp) library.

import segmentation_models_pytorch as smp
import torch_tensorrt as trt
import torch

model = smp.create_model(
    arch="fpn",                     # name of the architecture, e.g. 'Unet'/ 'FPN' / etc. Case INsensitive!
    encoder_name="mit_b0",
    encoder_weights="imagenet",
    in_channels=3,
    classes=3,
).eval().to('cuda')

input_data = torch.randn(1,3,224,224,requires_grad=False).to('cuda')
scripted_model = torch.jit.trace(model, input_data )
trt_model = trt.compile(
    scripted_model,
    inputs=[trt.Input((1, 3, 736, 1280), precision=torch.float32)],
    enabled_precisions={torch.float32},
    truncate_long_and_double=True,
)

I got the following error.

WARNING:root:Given dtype that does not have direct mapping to torch (dtype.unknown), defaulting to torch.float
WARNING:torch_tensorrt._compile:Input is a torchscript module but the ir was not specified (default=dynamo), please set ir=torchscript to suppress the warning.
WARNING:root:Given dtype that does not have direct mapping to torch (dtype.unknown), defaulting to torch.float
ERROR: [Torch-TensorRT TorchScript Conversion Context] - [graphShapeAnalyzer.cpp::checkCalculationStatusSanity::1660] Error Code 2: Internal Error (Assertion !isPartialWork(p.second.symbolicRep) failed. )
ERROR: [Torch-TensorRT TorchScript Conversion Context] - [graphShapeAnalyzer.cpp::checkCalculationStatusSanity::1660] Error Code 2: Internal Error (Assertion !isPartialWork(p.second.symbolicRep) failed. )
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/mzcar/miniconda3/envs/tensorrt/lib/python3.10/site-packages/torch_tensorrt/_compile.py", line 208, in compile
    compiled_ts_module: torch.jit.ScriptModule = torchscript_compile(
  File "/home/mzcar/miniconda3/envs/tensorrt/lib/python3.10/site-packages/torch_tensorrt/ts/_compiler.py", line 156, in compile
    compiled_cpp_mod = _C.compile_graph(module._c, _parse_compile_spec(spec))
RuntimeError: [Error thrown at core/conversion/converters/converter_util.cpp:270] Expected const_layer to be true but got false

This happens specifically when I use mit_b0 as the backbone; with ResNet-based backbones I get good speedups.

I don't depend heavily on this backbone, but I would love to know why the conversion fails. Any help would be appreciated.

In case anyone is interested, the output of pip freeze is:

autocommand==2.2.2
backports.tarfile==1.2.0
certifi==2024.8.30
charset-normalizer==3.4.0
coloredlogs==15.0.1
contourpy==1.3.0
cycler==0.12.1
efficientnet_pytorch==0.7.1
filelock==3.13.1
flatbuffers==24.3.25
fonttools==4.54.1
fsspec==2024.2.0
huggingface-hub==0.25.2
humanfriendly==10.0
idna==3.10
importlib_metadata==8.0.0
importlib_resources==6.4.0
inflect==7.3.1
jaraco.collections==5.1.0
jaraco.context==5.3.0
jaraco.functools==4.0.1
jaraco.text==3.12.1
Jinja2==3.1.3
kiwisolver==1.4.7
Mako==1.3.5
MarkupSafe==2.1.5
matplotlib==3.9.2
more-itertools==10.3.0
mpmath==1.3.0
munch==4.0.0
networkx==3.2.1
numpy==1.26.3
nvidia-cublas-cu11==11.11.3.6
nvidia-cuda-cupti-cu11==11.8.87
nvidia-cuda-nvrtc-cu11==11.8.89
nvidia-cuda-runtime-cu11==11.8.89
nvidia-cuda-runtime-cu12==12.6.77
nvidia-cudnn-cu11==9.1.0.70
nvidia-cufft-cu11==10.9.0.58
nvidia-curand-cu11==10.3.0.86
nvidia-cusolver-cu11==11.4.1.48
nvidia-cusparse-cu11==11.7.5.86
nvidia-nccl-cu11==2.20.5
nvidia-nvtx-cu11==11.8.86
onnx==1.17.0
onnx_tensorrt==10.5.0
onnxruntime-gpu==1.19.2
openvino==2023.1.0.dev20230811
openvino-telemetry==2024.1.0
packaging==24.1
pandas==2.2.3
pillow==10.2.0
platformdirs==4.3.6
pretrainedmodels==0.7.4
protobuf==5.28.2
pycuda==2024.1.2
pyparsing==3.2.0
python-dateutil==2.9.0.post0
pytools==2024.1.14
pytz==2024.2
PyYAML==6.0.2
regex==2024.9.11
requests==2.32.3
safetensors==0.4.5
scipy==1.14.1
seaborn==0.13.2
segmentation-models-pytorch==0.3.4
six==1.16.0
sympy==1.13.1
tensorrt==10.1.0
tensorrt-cu12==10.5.0
tensorrt-cu12-bindings==10.1.0
tensorrt-cu12-libs==10.1.0
timm==0.9.7
tokenizers==0.20.1
tomli==2.0.1
torch==2.4.1+cu118
torch_tensorrt==2.4.0+cu118
torchvision==0.19.1+cu118
tqdm==4.66.5
transformers==4.45.2
triton==3.0.0
typeguard==4.3.0
typing_extensions==4.9.0
tzdata==2024.2
urllib3==2.2.3
zipp==3.19.2
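
The list above contains both cu11 and cu12 packages (for example nvidia-cuda-runtime-cu11 next to nvidia-cuda-runtime-cu12, and tensorrt==10.1.0 next to tensorrt-cu12==10.5.0) alongside torch==2.4.1+cu118. Whether that mix has anything to do with the failure is not established in this thread. As a rough sketch for summarising the relevant versions in a report, something like the following could be used. The helper names `installed_versions` and `cuda_flavours` are made up for illustration; only the standard library is used.

```python
import re
from importlib import metadata

# Distribution names copied from the pip freeze output above.
PACKAGES = [
    "torch",
    "torch_tensorrt",
    "torchvision",
    "tensorrt",
    "tensorrt-cu12",
    "tensorrt-cu12-bindings",
    "tensorrt-cu12-libs",
    "nvidia-cuda-runtime-cu11",
    "nvidia-cuda-runtime-cu12",
]


def installed_versions(names):
    """Return {name: version string, or None if the package is not installed}."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


def cuda_flavours(versions):
    """Collect 'cuNN' tags found in package names or version strings.

    Tags are returned exactly as written (e.g. 'cu118' and 'cu11' are kept
    separate), so the result is only a hint that several CUDA builds coexist.
    """
    tags = set()
    for name, version in versions.items():
        if version is None:
            continue  # package not installed, nothing to inspect
        tags.update(re.findall(r"cu\d+", name))
        tags.update(re.findall(r"cu\d+", version))
    return tags


if __name__ == "__main__":
    found = installed_versions(PACKAGES)
    for name, version in found.items():
        print(f"{name}: {version}")
    print("CUDA tags seen:", sorted(cuda_flavours(found)))
```

More than one tag in the last line only indicates that packages built for different CUDA versions are installed side by side; it does not by itself explain the conversion error.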
@deo-abhijit deo-abhijit added the bug Something isn't working label Oct 21, 2024
@deo-abhijit deo-abhijit changed the title 🐛 [Bug] Encountered bug when using Torch-TensorRT with a segmentation model 🐛 [Bug] Encountered bug when using Torch-TensorRT with a pytorch segmentation model Oct 21, 2024
@narendasan
Collaborator

@deo-abhijit Have you tried using the dynamo frontend instead of TorchScript? It might resolve this issue. You can still use TorchScript for deployment afterwards by tracing the compiled program with torch.jit.script
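
For reference, a minimal sketch of what trying the dynamo frontend could look like for the model in this issue. This has not been run against mit_b0, so it is not confirmed to avoid the error. The function name `compile_with_dynamo` and the `ts_path` parameter are made up for illustration. Instead of calling torch.jit.script directly, the sketch exports through `torch_tensorrt.save(..., output_format="torchscript")`, which assumes an installed Torch-TensorRT version that provides that function.

```python
# Input shape taken from the trt.Input in the original report.
INPUT_SHAPE = (1, 3, 736, 1280)


def compile_with_dynamo(model, input_shape=INPUT_SHAPE, ts_path=None):
    """Compile an eager nn.Module with the dynamo frontend (no torch.jit.trace first).

    `model` is expected to be in eval mode and already on the GPU.
    If `ts_path` is given, the compiled module is also saved as TorchScript.
    """
    # Imported here so the module can be imported on a machine without a GPU stack.
    import torch
    import torch_tensorrt

    # Example input with the same shape the engine is built for.
    example = torch.randn(*input_shape, device="cuda")

    trt_module = torch_tensorrt.compile(
        model,
        ir="dynamo",  # explicit frontend choice, also avoids the "ir was not specified" warning
        inputs=[example],
        enabled_precisions={torch.float32},
    )

    if ts_path is not None:
        # Assumption: torch_tensorrt.save exists in the installed version.
        torch_tensorrt.save(
            trt_module, ts_path, output_format="torchscript", inputs=[example]
        )
    return trt_module


# Hypothetical usage, mirroring the model from the report:
#   import segmentation_models_pytorch as smp
#   model = smp.create_model(arch="fpn", encoder_name="mit_b0",
#                            encoder_weights="imagenet", in_channels=3,
#                            classes=3).eval().to("cuda")
#   trt_model = compile_with_dynamo(model, ts_path="fpn_mit_b0_trt.ts")
```

Note that the eager model is passed in directly: the dynamo path does its own tracing, so the torch.jit.trace step with the 224x224 input from the original snippet is not needed here.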
