This is a TensorRT implementation of UNet, inspired by tensorrtx and pytorch-unet.
You can use it to generate a TensorRT engine file, and you can customize parameters and the network structure to match the network you trained (FP32/FP16 precision, input size, different convolutions, activation functions, ...).
Requirements:

- TensorRT 7.0 (you need to install TensorRT first)
- CUDA 10.2
- Python 3.7
- OpenCV 4.4
- CMake 3.18

Install the Python dependencies:

    pip install -r requirements.txt

Train on your dataset by following pytorch-unet and generate a .pth file.

Run gen_wts from the utils folder, then move the generated weights file to the project folder.

Build the project, which generates the unet executable:

    mkdir build
    cd build
    cmake ..
    make

Then run:

    unet -s

Use unet -d to run inference on the files in a folder:

    unet -d ../samples
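
The gen_wts step above turns the trained checkpoint into a weights file that the TensorRT side can read. As a rough illustration of what such a file can look like, here is a minimal sketch that assumes the plain-text .wts layout used by tensorrtx: a first line with the tensor count, then one line per tensor holding its name, its element count, and its values as big-endian float32 hex words. The gen_wts script in this project may differ. The helpers `write_wts` and `read_wts` and the tensor names are hypothetical, and plain Python lists stand in for the flattened tensors of a real PyTorch state dict.

```python
import os
import struct
import tempfile


def write_wts(path, tensors):
    """Write {name: list of floats} in a tensorrtx-style .wts text layout (assumed)."""
    with open(path, "w") as f:
        # First line: number of tensors in the file.
        f.write(f"{len(tensors)}\n")
        for name, values in tensors.items():
            values = list(values)
            # Each value is stored as the hex of its big-endian float32 bytes.
            hex_words = [struct.pack(">f", float(v)).hex() for v in values]
            f.write(f"{name} {len(values)} " + " ".join(hex_words) + "\n")


def read_wts(path):
    """Read the file back into {name: list of floats}."""
    tensors = {}
    with open(path) as f:
        count = int(f.readline())
        for _ in range(count):
            parts = f.readline().split()
            name, size = parts[0], int(parts[1])
            words = parts[2:2 + size]
            tensors[name] = [struct.unpack(">f", bytes.fromhex(w))[0] for w in words]
    return tensors


# Hypothetical weights; a real export would flatten each state-dict tensor.
demo_weights = {
    "inc.conv.weight": [1.0, -0.5, 0.25],
    "inc.conv.bias": [0.0],
}

with tempfile.TemporaryDirectory() as tmp:
    wts_path = os.path.join(tmp, "unet.wts")
    write_wts(wts_path, demo_weights)
    with open(wts_path) as f:
        text = f.read()
    restored = read_wts(wts_path)

print(text)
```

Storing the values as hex keeps the float32 bit patterns exact, so the weights loaded on the C++ side match the trained ones bit for bit. Tensor names must not contain whitespace in this layout, since fields are space-separated.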
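The TensorRT engine is considerably faster than PyTorch at the same input size, especially in FP16:

Metric | PyTorch | TensorRT FP32 | TensorRT FP16 |
---|---|---|---|
Input size | 816x672 | 816x672 | 816x672 |
Inference time | 58 ms | 43 ms (batch size 8) | 14 ms (batch size 8) |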
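
To put rough numbers on this comparison, the ratios can be computed directly from the table. This is a small sketch that assumes the three timings were measured under comparable conditions. The table does not say whether the TensorRT times are per image or per batch of 8, so treat the results as approximate. The helper name `speedup_over` is hypothetical.

```python
# Timings copied from the table above, in milliseconds.
timings_ms = {
    "PyTorch": 58.0,
    "TensorRT FP32": 43.0,
    "TensorRT FP16": 14.0,
}


def speedup_over(baseline, timings):
    """Return {name: baseline_time / time} for every entry in timings."""
    base = timings[baseline]
    return {name: base / t for name, t in timings.items()}


speedups = speedup_over("PyTorch", timings_ms)
for name, ratio in speedups.items():
    print(f"{name}: {ratio:.2f}x")
```

Under that assumption, FP32 comes out at about 1.35x and FP16 at about 4.14x the PyTorch speed. If the TensorRT figures are in fact per batch of 8 while the PyTorch figure is per image, the per-image gain would be larger.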

Planned work:

- Add an INT8 calibrator
- Add a custom plugin
- etc.