YOLOR

Implementation of the paper "You Only Learn One Representation: Unified Network for Multiple Tasks".


To reproduce the results in the paper, please use this branch.

| Model | Test Size | AP (test) | AP50 (test) | AP75 (test) | APS (test) | APM (test) | APL (test) | Batch-1 throughput |
| :-- | :-: | :-: | :-: | :-: | :-: | :-: | :-: | :-: |
| YOLOR-P6 | 1280 | 52.6% | 70.6% | 57.6% | 34.7% | 56.6% | 64.2% | 49 fps |
| YOLOR-W6 | 1280 | 54.1% | 72.0% | 59.2% | 36.3% | 57.9% | 66.1% | 47 fps |
| YOLOR-E6 | 1280 | 54.8% | 72.7% | 60.0% | 36.9% | 58.7% | 66.9% | 37 fps |
| YOLOR-D6 | 1280 | 55.4% | 73.3% | 60.6% | 38.0% | 59.2% | 67.1% | 30 fps |
| YOLOv4-P5 | 896 | 51.8% | 70.3% | 56.6% | 33.4% | 55.7% | 63.4% | 41 fps |
| YOLOv4-P6 | 1280 | 54.5% | 72.6% | 59.8% | 36.6% | 58.2% | 65.5% | 30 fps |
| YOLOv4-P7 | 1536 | 55.5% | 73.4% | 60.8% | 38.4% | 59.4% | 67.7% | 16 fps |

To reproduce the inference speed, please see darknet.

| Model | Test Size | AP (val) | AP50 (val) | AP75 (val) | APS (val) | APM (val) | APL (val) | Batch-1 throughput |
| :-- | :-: | :-: | :-: | :-: | :-: | :-: | :-: | :-: |
| YOLOv4-CSP | 640 | 49.1% | 67.7% | 53.8% | 32.1% | 54.4% | 63.2% | 76 fps |
| YOLOR-CSP | 640 | 49.2% | 67.6% | 53.7% | 32.9% | 54.4% | 63.0% | weights |
| YOLOR-CSP* | 640 | 50.0% | 68.7% | 54.3% | 34.2% | 55.1% | 64.3% | weights |
| YOLOv4-CSP-X | 640 | 50.9% | 69.3% | 55.4% | 35.3% | 55.8% | 64.8% | 53 fps |
| YOLOR-CSP-X | 640 | 51.1% | 69.6% | 55.7% | 35.7% | 56.0% | 65.2% | weights |
| YOLOR-CSP-X* | 640 | 51.5% | 69.9% | 56.1% | 35.8% | 56.8% | 66.1% | weights |

Convert to ONNX

yolor_csp_x*

  python convert_to_onnx.py --weights yolor_csp_x_star.pt --cfg cfg/yolor_csp_x.cfg --output yolor_csp_x_star.onnx
  python object_detector_onnx.py

The second command runs the ONNX detection demo.

Convert to TensorRT

  /usr/src/tensorrt/bin/trtexec --onnx=yolor_csp_x_star.onnx \
                                --saveEngine=yolor_csp_x_star-fp16.trt \
                                --explicitBatch \
                                --minShapes=input:1x3x416x416 \
                                --optShapes=input:1x3x896x896 \
                                --maxShapes=input:1x3x896x896 \
                                --verbose \
                                --fp16 \
                                --device=0
  python object_detector_trt.py

Note that yolor_p6 has 4 detection layers, so change the maximum number of boxes in exec_backends/trt_loader.py accordingly.
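
To get a feel for why the box limit changes, the sketch below counts the candidate boxes a YOLO-style head emits for a given input size. It is an illustration only: the helper name `count_candidate_boxes`, the 3 anchors per detection layer, and the strides (8/16/32 for 3-layer models, 8/16/32/64 for yolor_p6) are assumptions and are not taken from exec_backends/trt_loader.py. Check them against the model cfg before relying on the numbers.

```python
# Sketch: count candidate boxes produced by a YOLO-style detection head.
# Assumptions (not from this README): 3 anchors per detection layer,
# strides 8/16/32 for 3-layer models and 8/16/32/64 for yolor_p6.

def count_candidate_boxes(height, width, strides, anchors_per_layer=3):
    """Return the total number of boxes predicted before NMS."""
    total = 0
    for stride in strides:
        grid_h = height // stride  # feature-map rows at this stride
        grid_w = width // stride   # feature-map columns at this stride
        total += grid_h * grid_w * anchors_per_layer
    return total


# Largest input shape used in the trtexec commands of this README.
csp_boxes = count_candidate_boxes(896, 896, strides=(8, 16, 32))
p6_boxes = count_candidate_boxes(896, 896, strides=(8, 16, 32, 64))
print(csp_boxes, p6_boxes)
```

Under these assumptions the fourth layer adds a relatively small number of boxes, but any buffer sized for exactly three layers would still be too small.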

Convert to TensorRT with BatchedNMSPlugin

For faster end-to-end processing on the GPU, we can integrate BatchedNMSPlugin into the YOLOR model. First convert the model to ONNX, then follow the steps below:

1. Simplify the model

pip install onnx-simplifier
python3 -m onnxsim yolor_csp_x_star.onnx yolor_csp_x_star-sim.onnx --dynamic-input-shape --input-shape 1,3,640,640

2. Add post-processing and the plugin

python3 add_nms_plugins.py --model yolor_csp_x_star-sim.onnx

If you hit an IR version check error, try torch==1.8.0 and onnx==1.6.0 when converting the original model to ONNX, then onnx==1.11.0 for this step.

This script performs the following stages:

  • Split the current output tensor into bboxes and scores tensors, which are the required inputs of the batchedNMSDynamic plugin
  • Add post-processing to the current model
  • Add the plugin on top of the post-processed model
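
The split in the first stage can be pictured with plain Python. This is a rough sketch and not the code in add_nms_plugins.py, which works on the ONNX graph. It assumes the common YOLO row layout `[cx, cy, w, h, objectness, class probabilities...]`, assumes the plugin wants corner-format boxes, and uses a hypothetical helper name, `split_predictions`.

```python
# Sketch only: assumes each prediction row is laid out as
# [cx, cy, w, h, objectness, class_0_prob, ..., class_k_prob].

def split_predictions(rows):
    """Split raw prediction rows into (boxes, scores).

    boxes[i]  -> [x1, y1, x2, y2] corner coordinates
    scores[i] -> per-class score = objectness * class probability
    """
    boxes, scores = [], []
    for row in rows:
        cx, cy, w, h, objectness = row[:5]
        # Convert centre/size format to corner format.
        boxes.append([cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2])
        # Fold objectness into every class probability.
        scores.append([objectness * p for p in row[5:]])
    return boxes, scores


# One prediction with two classes.
rows = [[100.0, 80.0, 40.0, 20.0, 0.5, 0.5, 0.25]]
boxes, scores = split_predictions(rows)
print(boxes, scores)
```

In the real pipeline the same arithmetic is expressed as graph nodes, so that it runs on the GPU together with the NMS plugin.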

3. Convert to TensorRT with Plugin

  /usr/src/tensorrt/bin/trtexec --onnx=yolor_csp_x_star-nms.onnx \
                                --saveEngine=yolor_csp_x_star.trt \
                                --explicitBatch \
                                --minShapes=input:1x3x416x416 \
                                --optShapes=input:1x3x896x896 \
                                --maxShapes=input:1x3x896x896 \
                                --verbose \
                                --device=0
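
The `--minShapes` and `--maxShapes` arguments bound the input shapes the built engine accepts, and `--optShapes` is the shape TensorRT tunes for. The sketch below checks whether a shape falls inside that range before it is sent to the engine. The helpers `parse_shape` and `fits_profile` are hypothetical names for illustration and are not part of this repository or of TensorRT.

```python
def parse_shape(spec):
    """Parse a trtexec-style spec such as 'input:1x3x416x416'."""
    name, dims = spec.rsplit(":", 1)
    return name, tuple(int(d) for d in dims.split("x"))


def fits_profile(shape, min_spec, max_spec):
    """Return True if every dimension of shape lies within [min, max]."""
    _, lo = parse_shape(min_spec)
    _, hi = parse_shape(max_spec)
    if len(shape) != len(lo) or len(lo) != len(hi):
        return False
    return all(l <= s <= h for l, s, h in zip(lo, shape, hi))


MIN_SPEC = "input:1x3x416x416"
MAX_SPEC = "input:1x3x896x896"

print(fits_profile((1, 3, 640, 640), MIN_SPEC, MAX_SPEC))    # inside the range
print(fits_profile((1, 3, 1280, 1280), MIN_SPEC, MAX_SPEC))  # too large
```

With the shapes used here the batch dimension is fixed at 1, so batched inference would need a larger batch size in `--maxShapes`.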

4. Run demo

python3 object_detector_trt_nms.py

Citation

@article{wang2021you,
  title={You Only Learn One Representation: Unified Network for Multiple Tasks},
  author={Wang, Chien-Yao and Yeh, I-Hau and Liao, Hong-Yuan Mark},
  journal={arXiv preprint arXiv:2105.04206},
  year={2021}
}

Acknowledgements
