Skip to content

Latest commit

 

History

History
 
 

AI-Model-Zoo

Folders and files

NameName
Last commit message
Last commit date

parent directory

..
 
 
 
 
 
 
 
 
 
 

Vitis AI Model Zoo

Introduction

This repository includes optimized deep learning models to speed up the deployment of deep learning inference on Xilinx™ platforms. These models cover different applications, including but not limited to ADAS/AD, medical, video surveillance, robotics, data center, etc. You can get started with these free pre-trained models to enjoy the benefits of deep learning acceleration.

Vitis AI 1.3 Model Zoo New Features!

1.New added 28 models. Total 92 models from different deep learning frameworks, Caffe, TensorFlow, TensorFlow 2 and PyTorch are supported in Vitis AI Model Zoo.

2.The variety of Model Zoo has been significantly improved for a wider range of applications.
For medical applications, we added CT image segmentation, medical robot instrument segmentation, Covid-19 chest radiograph segmentation and other reference models.
For autonomous driving and ADAS applications, we added 3D point cloud detection and point cloud segmentation models.

3.Provided accuracy evaluation and quantization scripts for all the released models.

4.Restructured model list show all versions of models that running on all supported Xilinx platforms in a clearer and more explicit way, and users could download the specific version they need for every model.

Model Details

The following table includes comprehensive information about each model, including application, framework, training and validation dataset, backbone, input size, computation as well as float and quantized precision.

Click here to view details
No. Application Model Name Framework Backbone Input Size OPS per image Training Set Val Set Float Accuracy (Top1/ Top5) Quantized Accuracy (Top1/Top5)
1 Image Classification resnet50 cf_resnet50_imagenet_224_224_7.7G_1.3 caffe resnet50 224*224 7.7G ImageNet Train ImageNet Val 0.7444/0.9185 0.7335/0.9130
2 Image Classifiction resnet18 cf_resnet18_imagenet_224_224_3.65G_1.3 caffe resnet18 224*224 3.65G ImageNet Train ImageNet Val 0.6844/0.8864 0.6699/0.8826
3 Image Classification Inception_v1 cf_inceptionv1_imagenet_224_224_3.16G_1.3 caffe inception_v1 224*224 3.16G ImageNet Train ImageNet Val 0.7030/0.8971 0.6964/0.8942
4 Image Classification Inception_v2 cf_inceptionv2_imagenet_224_224_4G_1.3 caffe bn-inception 224*224 4G ImageNet Train ImageNet Val 0.7275/0.9111 0.7166/0.9029
5 Image Classification Inception_v3 cf_inceptionv3_imagenet_299_299_11.4G_1.3 caffe inception_v3 299*299 11.4G ImageNet Train ImageNet Val 0.7701/0.9329 0.7622/0.9302
6 Image Classification Inception_v4 cf_inceptionv4_imagenet_299_299_24.5G_1.3 caffe inception_v3 299*299 24.5G ImageNet Train ImageNet Val 0.7958/0.9470 0.7899/0.9444
7 Image Classification mobileNet_v2 cf_mobilenetv2_imagenet_224_224_0.59G_1.3 caffe MobileNet_v2 224*224 608M ImageNet Train ImageNet Val 0.6527/0.8643 0.6399/0.8540
8 Image Classifiction SqueezeNet cf_squeeze_imagenet_227_227_0.76G_1.3 caffe squeezenet 227*227 0.76G ImageNet Train ImageNet Val 0.5432/0.7824 0.5270/0.7816
9 ADAS Pedstrain Detection ssd_pedestrian_pruned_0.97 cf_ssdpedestrian_coco_360_640_0.97_5.9G_1.3 caffe VGG-bn-16 360*640 5.9G coco2014_train_person and crowndhuman coco2014_val_person 0.5903 0.5857
10 Object Detection refinedet_baseline cf_refinedet_coco_360_480_123G_1.3 caffe VGG-bn-16 360*480 123G coco2014_train_person coco2014_val_person 0.6928 0.7042
11 Object Detection refinedet_pruned_0.8 cf_refinedet_coco_360_480_0.8_25G_1.3 caffe VGG-bn-16 360*480 25G coco2014_train_person coco2014_val_person 0.6794 0.6790
12 Object Detection refinedet_pruned_0.92 cf_refinedet_coco_360_480_0.92_10.10G_1.3 caffe VGG-bn-16 360*480 10.10G coco2014_train_person coco2014_val_person 0.6489 0.6487
13 Object Detection refinedet_pruned_0.96 cf_refinedet_coco_360_480_0.96_5.08G_1.3 caffe VGG-bn-16 360*480 5.08G coco2014_train_person coco2014_val_person 0.6120 0.6112
14 ADAS Vehicle Detection ssd_adas_pruned_0.95 cf_ssdadas_bdd_360_480_0.95_6.3G_1.3 caffe VGG-16 360*480 6.3G bdd100k + private data bdd100k + private data 0.4207 0.4200
15 Traffic Detection ssd_traffic_pruned_0.9 cf_ssdtraffic_360_480_0.9_11.6G_1.3 caffe VGG-16 360*480 11.6G private data private data 0.5982 0.5921
16 ADAS Lane Detection VPGnet_pruned_0.99 cf_VPGnet_caltechlane_480_640_0.99_2.5G_1.3 caffe VGG 480*640 2.5G caltech lanes caltech lanes 0.8864 (F1-score) 0.8882 (F1-score)
17 Object Detection ssd_mobilnet_v2 cf_ssdmobilenetv2_bdd_360_480_6.57G_1.3 caffe MobileNet_v2 360*480 6.57G bdd100k train bdd100k val 0.3052 0.2752
18 ADAS Segmentation FPN cf_fpn_cityscapes_256_512_8.9G_1.3 caffe Google_v1_BN 256*512 8.9G Cityscapes gtFineTrain Cityscapes Val 0.5669 0.5607
19 Pose Estimation SP-net cf_SPnet_aichallenger_224_128_0.54G_1.3 caffe Google_v1_BN 224*128 548.6M ai_challenger ai_challenger 0.9000 (PCKh0.5) 0.8964 (PCKh0.5)
20 Pose Estimation Openpose_pruned_0.3 cf_openpose_aichallenger_368_368_0.3_189.7G_1.3 caffe VGG 368*368 189.7G ai_challenger ai_challenger 0.4507 (OKs) 0.4429 (Oks)
21 Face Detection densebox_320_320 cf_densebox_wider_320_320_0.49G_1.3 caffe VGG-16 320*320 0.49G wider_face FDDB 0.8833 0.8783
22 Face Detection densebox_360_640 cf_densebox_wider_360_640_1.11G_1.3 caffe VGG-16 360*640 1.11G wider_face FDDB 0.8931 0.8922
23 Face Recognition face_landmark cf_landmark_celeba_96_72_0.14G_1.3 caffe lenet 96*72 0.14G celebA processed helen 0.1952 (L2 loss) 0.1971 (L2 loss)
24 Re-identification reid cf_reid_market1501_160_80_0.95G_1.3 caffe resnet18 160*80 0.95G Market1501+CUHK03 Market1501 mAP:0.5660 Rank-1:0.7800 mAP:0.5590 Rank-1:0.7760
25 Detection+Segmentation multi-task cf_multitask_bdd_288_512_14.8G_1.3 caffe ssd 288*512 14.8G BDD100K+Cityscapes BDD100K+Cityscapes mAP:0.2228 mIOU:0.4088 mAP:0.2140 mIOU:0.4058
26 Object Detection yolov3_bdd dk_yolov3_bdd_288_512_53.7G_1.3 darknet darknet-53 288*512 53.7G bdd100k bdd100k 0.5058 0.4910
27 Object Detection yolov3_adas_prune_0.9 dk_yolov3_cityscapes_256_512_0.9_5.46G_1.3 darknet darknet-53 256*512 5.46G Cityscapes Cityscapes 0.5520 0.5300
28 Object Detection yolov3_voc dk_yolov3_voc_416_416_65.42G_1.3 darknet darknet-53 416*416 65.42G voc07+12_trainval voc07_test 0.8240 0.8130
29 Object Detection yolov2_voc dk_yolov2_voc_448_448_34G_1.3 darknet darknet-19 448*448 34G voc07+12_trainval voc07_test 0.7845 0.7740
30 Object Detection yolov2_voc_pruned_0.66 dk_yolov2_voc_448_448_0.66_11.56G_1.3 darknet darknet-19 448*448 11.56G voc07+12_trainval voc07_test 0.7700 0.7610
31 Object Detection yolov2_voc_pruned_0.71 dk_yolov2_voc_448_448_0.71_9.86G_1.3 darknet darknet-19 448*448 9.86G voc07+12_trainval voc07_test 0.7670 0.7540
32 Object Detection yolov2_voc_pruned_0.77 dk_yolov2_voc_448_448_0.77_7.82G_1.3 darknet darknet-19 448*448 7.82G voc07+12_trainval voc07_test 0.7576 0.7482
33 Face Recognition ResNet20-face cf_facerec-resnet20_112_96_3.5G_1.3 caffe resnet20 112*96 3.5G private data private data 0.9610 (1e-06) 0.9480 (1e-06)
34 Face Recognition ResNet64-face cf_facerec-resnet64_112_96_11G_1.3 caffe resnet64 112*96 11G private data private data 0.9830 (1e-06) 0.9820 (1e-06)
35 Medical Segmentation FPN_Res18 cf_FPN-resnet18_EDD_320_320_45.3G_1.3 caffe resnet18 320*320 45.30G EDD_seg EDD_seg mean dice=0.8203 mean jaccard=0.7925 F2-score=0.8075 mean dice=0.8055 mean jaccard=0.7772 F2- score=0.7925
36 Plate Detection plate_detection cf_plate-detection_320_320_0.49G_1.3 caffe modify_vgg 320*320 0.49G private data private data Recall1: 0.9660 Recall1: 0.9650
37 Plate Recognition plate_recognition cf_plate-recognition_96_288_1.75G_1.3 caffe Google_v1 96*288 1.75G private data private data plate number:99.51% plate color:100% plate number:99.51% plate color:100%
38 Face Detection retinaface cf_retinaface_wider_360_640_1.11G_1.3 caffe MobileNet_v1 0.25 360*640 1.11G wideface and FDDB widerface and FDDB 0.9140@fp=100 0.8940@fp=100
39 Face Quality face_quality cf_face-quality_80_60_61.68M_1.3 caffe lenet 80*60 61.68M private data private data 12.5481 (L1 loss) 12.6863 (L1 loss)
40 Robot Instrument Segmentation FPN-R18(light-weight) cf_FPN-resnet18_Endov_240_320_13.75G_1.3 caffe resnet18 240*320 13.75G EndoVis'15 EndoVis'15 Dice=0.8050 Jaccard=0.7270 Dice=0.8010 Jaccard=0.7230
41 Pose Estimation Hourglass cf_hourglass-pe_mpii_256_256_10.2G_1.3 caffe resnet 256*256 10.20G MPII Human Pose MPII Human Pose 0.8718 (PCKh0.5) 0.8661 (PCKh0.5)
42 2D Detection tiny-yolov3 dk_tiny-yolov3_416_416_5.46G_1.3 darknet darknet-19 416*416 5.46G private data private data 0.9739 0.9650
43 2D Detection yolov4 dk_yolov4_coco_416_416_60.1G_1.3 darknet yolov4 416*416 60.10G coco coco2014-5k 0.3950 0.3730
44 Image Classifiction Inception_resnet_v2 tf_inceptionresnetv2_imagenet_299_299_26.35G_1.3 tensorflow inception 299*299 26.35G ImageNet Train ImageNet Val 0.8037 0.7946
45 Image Classifiction Inception_v1 tf_inceptionv1_imagenet_224_224_3G_1.3 tensorflow inception 224*224 3G ImageNet Train ImageNet Val 0.6976 0.6794
46 Image Classifiction Inception_v3 tf_inceptionv3_imagenet_299_299_11.45G_1.3 tensorflow inception 299*299 11.45G ImageNet Train ImageNet Val 0.7798 0.7607
47 Image Classifiction Inception_v4 tf_inceptionv4_imagenet_299_299_24.55G_1.3 tensorflow inception 299*299 24.55G ImageNet Train ImageNet Val 0.8018 0.7928
48 Image Classifiction Mobilenet_v1 tf_mobilenetv1_0.25_imagenet_128_128_27M_1.3 tensorflow mobilenet 128*128 27.15M ImageNet Train ImageNet Val 0.4144 0.3464
49 Image Classifiction Mobilenet_v1 tf_mobilenetv1_0.5_imagenet_160_160_150M_1.3 tensorflow mobilenet 160*160 150.07M ImageNet Train ImageNet Val 0.5903 0.5195
50 Image Classifiction Mobilenet_v1 tf_mobilenetv1_1.0_imagenet_224_224_1.14G_1.3 tensorflow mobilenet 224*224 1.14G ImageNet Train ImageNet Val 0.7102 0.6779
51 Image Classifiction Mobilenet_v2 tf_mobilenetv2_1.0_imagenet_224_224_602M_1.3 tensorflow mobilenet 224*224 0.59G ImageNet Train ImageNet Val 0.7013 0.6767
52 Image Classifiction Mobilenet_v2 tf_mobilenetv2_1.4_imagenet_224_224_1.16G_1.3 tensorflow mobilenet 224*224 1.16G ImageNet Train ImageNet Val 0.7411 0.7194
53 Image Classifiction resnet_v1_50 tf_resnetv1_50_imagenet_224_224_6.97G_1.3 tensorflow resnetv1 224*224 6.97G ImageNet Train ImageNet Val 0.7520 0.7478
54 Image Classifiction resnet_v1_101 tf_resnetv1_101_imagenet_224_224_14.4G_1.3 tensorflow resnetv1 224*224 14.4G ImageNet Train ImageNet Val 0.7640 0.7560
55 Image Classifiction resnet_v1_152 tf_resnetv1_152_imagenet_224_224_21.83G_1.3 tensorflow resnetv1 224*224 21.83G ImageNet Train ImageNet Val 0.7681 0.7463
56 Image Classifiction vgg_16 tf_vgg16_imagenet_224_224_30.96G_1.3 tensorflow vgg 224*224 30.96G ImageNet Train ImageNet Val 0.7089 0.7069
57 Image Classifiction vgg_19 tf_vgg19_imagenet_224_224_39.28G_1.3 tensorflow vgg 224*224 39.28G ImageNet Train ImageNet Val 0.7100 0.7026
58 Object Detection ssd_mobilenet_v1 tf_ssdmobilenetv1_coco_300_300_2.47G_1.3 tensorflow mobilenet 300*300 2.47G coco2017 coco2014 minival 0.2080 0.2100
59 Object Detection ssd_mobilenet_v2 tf_ssdmobilenetv2_coco_300_300_3.75G_1.3 tensorflow mobilenet 300*300 3.75G coco2017 coco2014 minival 0.2150 0.2110
60 Object Detection ssd_resnet50_v1_fpn tf_ssdresnet50v1_fpn_coco_640_640_178.4G_1.3 tensorflow resnet50 300*300 178.4G coco2017 coco2014 minival 0.3010 0.2900
61 Object Detection yolov3_voc tf_yolov3_voc_416_416_65.63G_1.3 tensorflow darknet-53 416*416 65.63G voc07+12_trainval voc07_test 0.7846 0.7744
62 Object Detection mlperf_ssd_resnet34 tf_mlperf_resnet34_coco_1200_1200_433G_1.3 tensorflow resnet34 1200*1200 433G coco2017 coco2017 0.2250 0.2150
63 Image Classifiction Inception_v2 tf_inceptionv2_imagenet_224_224_3.88G_1.3 tensorflow inception 224*224 3.88G ImageNet Train ImageNet Val 0.7399 07331
64 Image Classifiction resnet_v2_50 tf_resnetv2_50_imagenet_299_299_13.1G_1.3 tensorflow resnetv2 299*299 13.1G ImageNet Train ImageNet Val 0.7559 0.7445
65 Image Classifiction resnet_v2_101 tf_resnetv2_101_imagenet_299_299_26.78G_1.3 tensorflow resnetv2 299*299 26.78G ImageNet Train ImageNet Val 0.7695 0.7506
66 Image Classifiction resnet_v2_152 tf_resnetv2_152_imagenet_299_299_40.47G_1.3 tensorflow resnetv2 299*299 40.47G ImageNet Train ImageNet Val 0.7779 0.7432
67 Object Detection ssdlite_mobilenetv2 tf_ssdlite_mobilenetv2_coco_300_300_1.5G_1.3 tensorflow mobilenet 300*300 1.5G coco2017 coco2014 minival 0.2170 0.2090
68 Object Detection ssd_inceptionv2 tf_ssdinceptionv2_coco_300_300_9.62G_1.3 tensorflow inception 300*300 9.62G coco2017 coco2014 minival 0.2390 0.2360
69 Segmentation Mobilenet_v2 tf_mobilenetv2_cityscapes_1024_2048_132.74G_1.3 tensorflow mobilenet 1024*2048 132.74G Cityscapes Cityscapes 0.6263 0.4578
70 Image Classifiction mlperf_resnet50 tf_mlperf_resnet50_imagenet_224_224_8.19G_1.3 tensorflow resnet50 224*224 8.19G ImageNet Train ImageNet Val 0.7652 0.7606
71 General Detection refinedet tf_refinedet_VOC_320_320_81.9G_1.3 tensorflow vgg_bn 320*320 81.9G voc07+12_trainval voc07_test 0.8015 0.7999
72 Image Classifiction mobilenet_edge_1.0 tf_mobilenetEdge1.0_imagenet_224_224_990M_1.3 tensorflow mobilenetEdge1.0 224*224 990M ImageNet Train ImageNet Val 0.7227 0.6775
73 Image Classifiction mobilenet_edge_0.75 tf_mobilenetEdge0.75_imagenet_224_224_624M_1.3 tensorflow mobilenetEdge0.75 224*224 624M ImageNet Train ImageNet Val 0.7201 0.6489
74 Medical Detection refinedet_medical tf_RefineDet-Medical_EDD_320_320_9.83G_1.3 tensorflow pruned_vgg16 320*320 9.83G EDD EDD 0.7839 0.8014
75 Image Classifiction resnet50 tf2_resnet50_imagenet_224_224_7.76G_1.3 tensorflow resnet50 224*224 7.76G ImageNet Train ImageNet Val 0.7513 0.7423
76 Image Classifiction Mobilenet_v1 tf2_mobilenetv1_imagenet_224_224_1.15G_1.3 tensorflow mobilenet 224*224 1.15G ImageNet Train ImageNet Val 0.7005 0.5603
77 Image Classifiction Inception_v3 tf2_inceptionv3_imagenet_299_299_11.5G_1.3 tensorflow inception 299*299 11.5G ImageNet Train ImageNet Val 0.7753 0.7694
78 Medical Segmentation 2d-unet tf2_2d-unet_nuclei_128_128_5.31G_1.3 tensorflow unet 128*128 5.31G Nuclei Cell Nuclei Cell 0.3968 0.3968
79 Semantic Segmentation ERFNet tf2_erfnet_cityscapes_512_1024_54G_1.3 tensorflow ERFNet 512*1024 54G Cityscapes Cityscapes 0.5298 0.5167
80 Segmentation ENet pt_ENet_cityscapes_512_1024_8.6G_1.3 pytorch ENet 512*1024 8.6G Cityscapes Cityscapes 0.6442 0.6315
81 Segmentation SemanticFPN pt_SemanticFPN_cityscapes_256_512_10G_1.3 pytorch FPN-Resnet18 256*512 10G Cityscapes Cityscapes 0.6290 0.6230
82 Face Recognition ResNet20-face pt_facerec-resnet20_mixed_112_96_3.5G_1.3 pytorch resnet20 112*96 3.5G mixed mixed 0.9955 0.9947
83 Face Quality face quality pt_face-quality_80_60_61.68M_1.3 pytorch lenet 80*60 61.68M private data private data 0.1233 0.1258
84 Multi Task MT-resnet18 pt_MT-resnet18_mixed_320_512_13.65G_1.3 pytorch resnet18 320*512 13.65G mixed mixed mAP:0.3951 mIOU:0.4403 mAP:0.3841 mIOU:0.4271
85 Face ReID face_reid_large pt_facereid-large_96_96_515M_1.3 pytorch resnet18 96*96 515M private data private data mAP:0.7940 Rank1:0.9550 mAP:0.7900 Rank1:0.9530
86 Face ReID face_reid_small pt_facereid-small_80_80_90M_1.3 pytorch resnet_small 80*80 90M private data private data mAP:0.5600 Rank1:0.8650 mAP:0.5590 Rank1:0.8650
87 Re-identification person_reid pt_personreid-res50_market1501_256_128_5.4G_1.3 pytorch resnet50 256*128 4.2G market1501 market1501 mAP:0.8540 Rank1:0.9410 mAP:0.8380 Rank1:0.9370
88 Re-identification person_reid pt_personreid-res18_market1501_176_80_1.1G_1.3 pytorch resnet18 176*80 1.1G market1501 market1501 mAP:0.7530 Rank1:0.8980 mAP:0.7460 Rank1:0.8930
89 3D Detection pointpillars pt_pointpillars_kitti_12000_100_10.8G_1.3 pytorch PointNet 12000*100 10.8G kitti kitti Car 3D [email protected](easy, moderate, hard) 90.79, 89.66, 88.78 Car 3D [email protected](easy, moderate, hard) 90.75, 87.04, 83.44
90 3D point cloud segmentation salsanext pt_salsanext_semantic-kitti_64_2048_0.6_20.4G_1.3 pytorch pruned resnet50 64*2048 20.4G semantic-kitti semantic-kitti Acc avg 0.8860 IoU avg 0.5100 Acc avg 0.8350 IoU avg 0.4540
91 Covid-19 segmentation FPN-R18 (light-weight) pt_FPN-resnet18_covid19-seg_352_352_22.7G_1.3 pytorch resnet18 352*352 22.7G COVID19-seg COVID19-seg 2-classes Dice:0.8588 3-classes mIoU:0.5989 2-classes Dice:0.8547 3-classes mIoU:0.5957
92 CT lung segmentation 2d-unet pt_unet_chaos-CT_512_512_23.3G_1.3 pytorch unet 512*512 23.3G Chaos-CT Chaos-CT Dice:0.9758 Dice:0.9747

Naming Rules

Model name: F_M_(D)_H_W_(P)_C_V

  • F specifies training framework: cf is Caffe, tf is Tensorflow, tf2 is Tensorflow 2, dk is Darknet, pt is PyTorch
  • M specifies the model
  • D specifies the dataset. It is optional depending on whether the dataset is public or private
  • H specifies the height of input data
  • W specifies the width of input data
  • P specifies the pruning ratio, it means how much computation is reduced. It is optional depending on whether the model is pruned or not
  • C specifies the computation of the model: how many Gops per image
  • V specifies the version of Vitis AI

For example, cf_refinedet_coco_360_480_0.8_25G_1.3 is a RefineDet model trained with Caffe using COCO dataset, input data size is 360*480, 80% pruned, the computation per image is 25Gops and Vitis AI version is 1.3.

caffe-xilinx

This is a custom distribution of caffe. Please use caffe-xilinx to test and finetune the caffe models listed in this page.

Model Download

Please visit model-list in this page. You will get downloadlink and MD5 of all the released models, including pre-compiled models running on different platforms.

Model Directory Structure

Download and extract the model archive to your working area on the local hard disk. For details on the various models, their download link and MD5 checksum for the zip file of each model, see model-list.

Caffe Model Directory Structure

For a caffe model, you should see the following directory structure:

├── code                            # Contains test , training and quantization scripts.
│                                     
│                                   
├── readme.md                       # Contains the environment requirements, data preprocess and model information.
│                                     Refer this to know that how to test and train the model with scripts.
│                                        
├── data                            # Contains the dataset that used for model test and training.
│                                     When test or training scripts run successfully, dataset will be automatically placed in it.
│                                     In some cases, you may also need to manually place your own dataset here.
│                                                      
├── quantized  
│    │    
│    ├── deploy.caffemodel                # Quantized weights, the output of vai_q_caffe without modification.
│    ├── deploy.prototxt                  # Quantized prototxt, the output of vai_q_caffe without modification.
│    ├── quantize_test.prototxt           # Used to run evaluation with quantize_train_test.caffemodel. 
│    │                                      Some models don't have this file if they are converted from Darknet (Yolov2, Yolov3),   
│    │                                      Pytorch (ReID) or there is no Caffe Test (Densebox). 
│    │
│    ├── quantize_train_test.caffemodel   # Quantized weights can be used for quantized-point training and evaluation.       
│    ├── quantize_train_test.prototxt     # Used for quantized-point training and testing with quantize_train_test.caffemodel 
│    │                                      on GPU when datalayer modified to user's data path.         
│    └── DNNC
│         └──deploy.prototxt              # Quantized prototxt for dnnc. It's the deploy.prototxt without option 'keep fixed neuron'.
│                                           
│                                                 
└── float                           
     ├── trainval.caffemodel              # Trained float-point weights.
     ├── test.prototxt                    # Used to run evaluation with python test codes released in near future.    
     └── trainval.prorotxt                # Used for training and testing with caffe train/test command 
                                            when datalayer modified to user's data path.Some models don't
                                            have this file if they are converted from Darknet (Yolov2, Yolov3),
                                            Pytorch (ReID) or there is no Caffe Test (Densebox).              

Tensorflow Model Directory Structure

For a Tensorflow model, you should see the following directory structure:

├── code                            # Contains test code which can run demo and evaluate model performance. 
│                          
│
├── readme.md                       # Contains the environment requirements, data preprocess and model information.
│                                     Refer this to know that how to test the model with scripts.
│
├── data                            # Contains the dataset that used for model test and training.
│                                     When test or training scripts run successfully, dataset will be automatically placed in it.
│
├── quantized                          
│   ├── deploy.model.pb             # Quantized model for the compiler (extended Tensorflow format).
│   └── quantize_eval_model.pb      # Quantized model for evaluation.
│
└── float                             
    └── frozen.pb                   # Float-point frozen model, the input to the `vai_q_tensorflow`.
                                      The pb name of different models may be different.

Pytorch Model Directory Structure

For a Pytorch model, you should see the following directory structure:

├── code                            # Contains test and training code.  
│                                                        
│                                   
├── readme.md                       # Contains the environment requirements, data preprocess and model information.
│                                     Refer this to know that how to test and train the model with scripts.
│                                        
├── data                            # Contains the dataset that used for model test and training.
│                                     When test or training scripts run successfully, dataset will be automatically placed in it.
│
├── quantized                          
│   ├── bias_corr.pth               # Quantized model.
│   ├── quant_info.json             # Quantization steps of tensors got. Please keep it for evaluation of quantized model.
│   ├── _int.py                     # Converted vai_q_pytorch format model.
│   └── _int.xmodel                 # Deployed model. The name of different models may be different.
│
│
└── float                           
    └── _int.pth                    # Trained float-point model. The pth name of different models may be different.
                                      Path and name in test scripts could be modified according to actual situation.

Note: For more information on vai_q_caffe and vai_q_tensorflow, see the Vitis AI User Guide.

Model Performance

All the models in the Model Zoo have been deployed on Xilinx hardware with Vitis AI and Vitis AI Library. The performance number including end-to-end throughput and latency for each model on various boards with different DPU configurations are listed in the following sections.

For more information about DPU, see DPU IP Product Guide.

Note: The model performance number listed in the following sections is generated with Vitis AI v1.3 and Vitis AI Lirary v1.3. For different boards, different DPU configurations are used. Vitis AI and Vitis AI Library can be downloaded for free from Vitis AI Github and Vitis AI Library Github. We will continue to improve the performance with Vitis AI. The performance number reported here is subject to change in the near future.

Performance on ZCU102 (0432055-04)

This version of ZCU102 is out of stock. The performance number shown below was measured with the previous AI SDK v2.0.4. Now this form has stopped updating.

Click here to view details

The following table lists the performance number including end-to-end throughput and latency for each model on the ZCU102 (0432055-04) board with a 3 * B4096 @ 287MHz V1.4.0 DPU configuration:

No. Model Name E2E latency (ms) Thread num =1 E2E throughput -fps(Single Thread) E2E throughput -fps(Multi Thread)
1 resnet50 cf_resnet50_imagenet_224_224_7.7G 12.85 77.8 179.3
2 Inception_v1 cf_inceptionv1_imagenet_224_224_3.16G 5.47 182.683 485.533
3 Inception_v2 cf_inceptionv2_imagenet_224_224_4G 6.76 147.933 373.267
4 Inception_v3 cf_inceptionv3_imagenet_299_299_11.4G 17 58.8333 155.4
5 mobileNet_v2 cf_mobilenetv2_imagenet_224_224_0.59G 4.09 244.617 638.067
6 tf_resnet50 tf_resnet50_imagenet_224_224_6.97G 11.94 83.7833 191.417
7 tf_inception_v1 tf_inceptionv1_imagenet_224_224_3G 6.72 148.867 358.283
8 tf_mobilenet_v2 tf_mobilenetv2_imagenet_224_224_1.17G 5.46 183.117 458.65
9 ssd_adas_pruned_0.95 cf_ssdadas_bdd_360_480_0.95_6.3G 11.33 88.2667 320.5
10 ssd_pedestrian_pruned_0.97 cf_ssdpedestrian_coco_360_640_0.97_5.9G 12.96 77.1833 314.717
11 ssd_traffic_pruned_0.9 cf_ssdtraffic_360_480_0.9_11.6G 17.49 57.1833 218.183
12 ssd_mobilnet_v2 cf_ssdmobilenetv2_bdd_360_480_6.57G 24.21 41.3 141.233
13 tf_ssd_voc tf_ssd_voc_300_300_64.81G 69.28 14.4333 46.7833
14 densebox_320_320 cf_densebox_wider_320_320_0.49G 2.43 412.183 1416.63
15 densebox_360_640 cf_densebox_wider_360_640_1.11G 5.01 199.717 719.75
16 yolov3_adas_prune_0.9 dk_yolov3_cityscapes_256_512_0.9_5.46G 11.09 90.1667 259.65
17 yolov3_voc dk_yolov3_voc_416_416_65.42G 70.51 14.1833 44.4
18 tf_yolov3_voc tf_yolov3_voc_416_416_65.63G 70.75 14.1333 44.0167
19 refinedet_pruned_0.8 cf_refinedet_coco_360_480_0.8_25G 29.91 33.4333 109.067
20 refinedet_pruned_0.92 cf_refinedet_coco_360_480_0.92_10.10G 15.39 64.9667 216.317
21 refinedet_pruned_0.96 cf_refinedet_coco_360_480_0.96_5.08G 11.04 90.5833 312
22 FPN cf_fpn_cityscapes_256_512_8.9G 16.58 60.3 203.867
23 VPGnet_pruned_0.99 cf_VPGnet_caltechlane_480_640_0.99_2.5G 9.44 105.9 424.667
24 SP-net cf_SPnet_aichallenger_224_128_0.54G 1.73 579.067 1620.67
25 Openpose_pruned_0.3 cf_openpose_aichallenger_368_368_0.3_189.7G 279.07 3.58333 38.5
26 yolov2_voc dk_yolov2_voc_448_448_34G 39.76 25.15 86.35
27 yolov2_voc_pruned_0.66 dk_yolov2_voc_448_448_0.66_11.56G 18.42 54.2833 211.217
28 yolov2_voc_pruned_0.71 dk_yolov2_voc_448_448_0.71_9.86G 16.42 60.9167 242.433
29 yolov2_voc_pruned_0.77 dk_yolov2_voc_448_448_0.77_7.82G 14.46 69.1667 286.733
30 Inception-v4 cf_inceptionv4_imagenet_299_299_24.5G 34.25 29.2 84.25
31 SqueezeNet cf_squeeze_imagenet_227_227_0.76G 3.6 277.65 1080.77
32 face_landmark cf_landmark_celeba_96_72_0.14G 1.13 885.033 1623.3
33 reid cf_reid_marketcuhk_160_80_0.95G 2.67 375 773.533
34 yolov3_bdd dk_yolov3_bdd_288_512_53.7G 73.89 13.5333 42.8833
35 tf_mobilenet_v1 tf_mobilenetv1_imagenet_224_224_1.14G 3.2 312.067 875.967
36 resnet18 cf_resnet18_imagenet_224_224_3.65G 5.1 195.95 524.433
37 resnet18_wide tf_resnet18_imagenet_224_224_28G 33.28 30.05 83.4167

Performance on ZCU102 (0432055-05)

Measured with Vitis AI 1.3 and Vitis AI Library 1.3

Click here to view details

The following table lists the performance number including end-to-end throughput and latency for each model on the ZCU102 (0432055-05) board with a 3 * B4096 @ 281MHz V1.4.1 DPU configuration:

No. Model Name E2E latency (ms) Thread num =1 E2E throughput -fps(Single Thread) E2E throughput -fps(Multi Thread)
1 resnet50 cf_resnet50_imagenet_224_224_7.7G 14.1 70.8 164.7
2 resnet18 cf_resnet18_imagenet_224_224_3.65G 5.55 180.2 469.1
3 Inception_v1 cf_inceptionv1_imagenet_224_224_3.16G 6.12 163.2 428.1
4 Inception_v2 cf_inceptionv2_imagenet_224_224_4G 8.13 122.9 286.7
5 Inception_v3 cf_inceptionv3_imagenet_299_299_11.4G 17.47 57.2 132.3
6 Inception_v4 cf_inceptionv4_imagenet_299_299_24.5G 35.26 28.3 68.1
7 Mobilenet_v2 cf_mobilenetv2_imagenet_224_224_0.59G 4.08 245.1 712
8 SqueezeNet cf_squeeze_imagenet_227_227_0.76G 3.9 256.1 1035.4
9 ssd_pedestrian_pruned_0_97 cf_ssdpedestrian_coco_360_640_0.97_5.9G 13.37 74.7 275
10 refinedet_baseline cf_refinedet_coco_360_480_123G 118.09 8.5 24.4
11 refinedet_pruned_0_8 cf_refinedet_coco_360_480_0.8_25G 31.15 32.1 98.7
12 refinedet_pruned_0_92 cf_refinedet_coco_360_480_0.92_10.10G 16.27 61.4 194
13 refinedet_pruned_0_96 cf_refinedet_coco_360_480_0.96_5.08G 11.78 84.9 272.4
14 ssd_adas_pruned_0_95 cf_ssdadas_bdd_360_480_0.95_6.3G 11.98 83.4 285.5
15 ssd_traffic_pruned_0_9 cf_ssdtraffic_360_480_0.9_11.6G 18.76 53.3 195
16 VPGnet_pruned_0_99 cf_VPGnet_caltechlane_480_640_0.99_2.5G 11.75 85 250.4
17 ssd_mobilenet_v2 cf_ssdmobilenetv2_bdd_360_480_6.57G 25.7 38.9 109.2
18 FPN cf_fpn_cityscapes_256_512_8.9G 30.62 32.6 144.7
19 SP_net cf_SPnet_aichallenger_224_128_0.54G 2.24 445.4 1227.1
20 Openpose_pruned_0_3 cf_openpose_aichallenger_368_368_0.3_189.7G 287.88 3.5 14.9
21 densebox_320_320 cf_densebox_wider_320_320_0.49G 2.73 365.9 1126.9
22 densebox_640_360 cf_densebox_wider_360_640_1.11G 5.58 179.2 503.9
23 face_landmark cf_landmark_celeba_96_72_0.14G 1.21 823.9 1553.6
24 reid cf_reid_market1501_160_80_0.95G 2.88 346.6 690.1
25 multi_task cf_multitask_bdd_288_512_14.8G 28.93 34.5 117.1
26 yolov3_bdd dk_yolov3_bdd_288_512_53.7G 82.2 12.1 32.3
27 yolov3_adas_pruned_0_9 dk_yolov3_cityscapes_256_512_0.9_5.46G 11.73 85.2 251.3
28 yolov3_voc dk_yolov3_voc_416_416_65.42G 80.13 12.5 32.6
29 yolov2_voc dk_yolov2_voc_448_448_34G 38.94 25.7 68
30 yolov2_voc_pruned_0_66 dk_yolov2_voc_448_448_0.66_11.56G 16 62.5 189
31 yolov2_voc_pruned_0_71 dk_yolov2_voc_448_448_0.71_9.86G 13.9 71.9 221
32 yolov2_voc_pruned_0_77 dk_yolov2_voc_448_448_0.77_7.82G 11.92 83.9 265.2
33 ResNet20-face cf_facerec-resnet20_112_96_3.5G 6.11 163.5 329.1
34 ResNet64-face cf_facerec-resnet64_112_96_11G 14.19 70.4 176.4
35 FPN_Res18_segmentation cf_FPN-resnet18_EDD_320_320_45.3G 85.83 11.6 42.1
36 plate detection cf_plate-detection_320_320_0.49G 2.02 493.8 1996
37 plate recognition cf_plate-recognition_96_288_1.75G 8.14 122.8 239.9
38 retinaface cf_retinaface_wider_360_640_1.11G 8.01 124.8 569.2
39 face_quality cf_face-quality_80_60_61.68M 0.5 2000.4 6721.9
40 FPN-R18(light-weight) cf_FPN-resnet18_Endov_240_320_13.75G 30.26 33 152.4
41 Hourglass cf_hourglass-pe_mpii_256_256_10.2G 20.69 48.3 113.4
42 tiny-yolov3 dk_tiny-yolov3_416_416_5.46G 8.65 115.5 383.6
43 yolov4 dk_yolov4_coco_416_416_60.1G 77.69 12.8 32.8
44 Inception_resnet_v2 tf_inceptionresnetv2_imagenet_299_299_26.35G 45.64 21.9 49.2
45 Inception_v1 tf_inceptionv1_imagenet_224_224_3G 6 166.7 428.6
46 Inception_v3 tf_inceptionv3_imagenet_299_299_11.45G 17.6 56.8 130.8
47 Inception_v4 tf_inceptionv4_imagenet_299_299_24.55G 35.27 28.3 68
48 Mobilenet_v1 tf_mobilenetv1_0.25_imagenet_128_128_27.15M 0.94 1062.6 3991.2
49 Mobilenet_v1 tf_mobilenetv1_0.5_imagenet_160_160_150.07M 1.45 689.5 2624.4
50 Mobilenet_v1 tf_mobilenetv1_1.0_imagenet_224_224_1.14G 3.48 287.4 892.8
51 Mobilenet_v2 tf_mobilenetv2_1.0_imagenet_224_224_0.59G 4.15 240.7 675.7
52 Mobilenet_v2 tf_mobilenetv2_1.4_imagenet_224_224_1.16G 5.67 176.2 455.5
53 resnet_v1_50 tf_resnetv1_50_imagenet_224_224_6.97G 12.89 77.5 178
54 resnet_v1_101 tf_resnetv1_101_imagenet_224_224_14.4G 23.95 41.7 103.4
55 resnet_v1_152 tf_resnetv1_152_imagenet_224_224_21.83G 34.89 28.6 72.1
56 vgg_16 tf_vgg16_imagenet_224_224_30.96G 50.06 19.9 40.5
57 vgg_19 tf_vgg19_imagenet_224_224_39.28G 58.01 17.2 36
58 ssd_mobilenet_v1 tf_ssdmobilenetv1_coco_300_300_2.47G 10.51 95.1 305.5
59 ssd_mobilenet_v2 tf_ssdmobilenetv2_coco_300_300_3.75G 12.69 78.7 206.2
60 ssd_resnet_50_v1_fpn tf_ssdresnet50v1_fpn_coco_640_640_178.4G 351.33 2.8 5
61 yolov3_voc tf_yolov3_voc_416_416_65.63G 77.44 12.9 33.4
62 mlperf_ssd_resnet34 tf_mlperf_resnet34_coco_1200_1200_433G 530.72 1.9 6.9
63 Inception_v2 tf_inceptionv2_imagenet_224_224_3.88G 11.51 86.8 220.8
64 resnet_v2_50 tf_resnetv2_50_imagenet_299_299_13.1G_1.3 29.38 34 83.3
65 resnet_v2_101 tf_resnetv2_101_imagenet_299_299_26.78G 50.45 19.8 48.5
66 resnet_v2_152 tf_resnetv2_152_imagenet_299_299_40.47G 71.38 14 33.7
67 ssdlite_mobilenetv2 tf_ssdlite_mobilenetv2_coco_300_300_1.5G 10 99.9 292.8
68 ssd_inceptionv2 tf_ssdinceptionv2_coco_300_300_9.62G_1.3 26.4 37.8 97.5
69 Mobilenet_v2 tf_mobilenetv2_cityscapes_1024_2048_132.74G 636.19 1.6 4.4
70 mlperf_resnet50 tf_mlperf_resnet50_imagenet_224_224_8.19G 14.34 69.6 160.6
71 refinedet tf_refinedet_VOC_320_320_81.9G 90.67 11 33.6
72 mobilenet_edge_1.0 tf_mobilenetEdge1.0_imagenet_224_224_990M_1.3 5.17 193.4 525.2
73 mobilenet_edge_0.75 tf_mobilenetEdge0.75_imagenet_224_224_624M 4.31 231.6 675.8
74 refinedet_medical tf_RefineDet-Medical_EDD_320_320_9.83G 15.51 64.4 220.3
75 resnet50 tf2_resnet50_imagenet_224_224_7.76G 14.52 68.8 165.2
76 Mobilenet_v1 tf2_mobilenetv1_imagenet_224_224_1.15G 3.52 284.1 879.6
77 Inception_v3 tf2_inceptionv3_imagenet_299_299_11.5G 17.94 55.7 130.4
78 2d-unet tf2_2d-unet_nuclei_128_128_5.31G 6.6 151.4 381.2
79 ERFNet tf2_erfnet_cityscapes_512_1024_54G 149.67 6.7 23
80 ENet pt_ENet_cityscapes_512_1024_8.6G 113.44 8.8 37.9
81 SemanticFPN pt_SemanticFPN_cityscapes_256_512_10G 30.38 32.9 160.1
82 ResNet20-face pt_facerec-resnet20_mixed_112_96_3.5G 6.12 163.2 328.2
83 face quality pt_face-quality_80_60_61.68M 0.47 2140.1 6887.7
84 MT-resnet18 pt_MT-resnet18_mixed_320_512_13.65G NA NA NA
85 face_reid_large pt_facereid-large_96_96_515M 1.19 837.2 2164.7
86 face_reid_small pt_facereid-small_80_80_90M 0.54 1833.4 5948.5
87 person_reid pt_personreid-res50_market1501_256_128_5.4G 10.36 96.4 223.5
88 person_reid pt_personreid-res18_market1501_176_80_1.1G 2.86 349.2 681.6
89 pointpillars pt_pointpillars_kitti_12000_100_10.8G 53.7 18.6 49.8
90 salsanext pt_salsanext_semantic-kitti_64_2048_0.6_20.4G 188.48 5.3 19.7
91 FPN-R18 (light-weight) pt_FPN-resnet18_covid19-seg_352_352_22.7G 28.32 35.3 104.1
92 2d-unet pt_unet_chaos-CT_512_512_23.3G 48.35 20.7 64.8
93 Inception_v3 torchvision_inception_v3 17.42 57.4 132.6
94 SqueezeNet torchvision_squeezenet 4.87 205.2 766.9
95 resnet50 torchvision_resnet50 14.92 67 158.6

Performance on ZCU104

Measured with Vitis AI 1.3 and Vitis AI Library 1.3

Click here to view details

The following table lists the performance number including end-to-end throughput and latency for each model on the ZCU104 board with a 2 * B4096 @ 300MHz V1.4.1 DPU configuration:

No. Model Name E2E latency (ms) Thread num =1 E2E throughput -fps(Single Thread) E2E throughput -fps(Multi Thread)
1 resnet50 cf_resnet50_imagenet_224_224_7.7G 13.29 75.1 144.7
2 resnet18 cf_resnet18_imagenet_224_224_3.65G 5.27 189.7 409.7
3 Inception_v1 cf_inceptionv1_imagenet_224_224_3.16G 5.81 172 368
4 Inception_v2 cf_inceptionv2_imagenet_224_224_4G 7.67 130.2 260.7
5 Inception_v3 cf_inceptionv3_imagenet_299_299_11.4G 16.46 60.7 119
6 Inception_v4 cf_inceptionv4_imagenet_299_299_24.5G 33.09 30.2 58.7
7 Mobilenet_v2 cf_mobilenetv2_imagenet_224_224_0.59G 3.92 255.2 589.3
8 SqueezeNet cf_squeeze_imagenet_227_227_0.76G 3.84 260.2 925.7
9 ssd_pedestrian_pruned_0_97 cf_ssdpedestrian_coco_360_640_0.97_5.9G 12.87 77.6 209.5
10 refinedet_baseline cf_refinedet_coco_360_480_123G 110.92 9 18.3
11 refinedet_pruned_0_8 cf_refinedet_coco_360_480_0.8_25G 29.46 33.9 73.4
12 refinedet_pruned_0_92 cf_refinedet_coco_360_480_0.92_10.10G 15.5 64.5 149.7
13 refinedet_pruned_0_96 cf_refinedet_coco_360_480_0.96_5.08G 11.26 88.7 217.5
14 ssd_adas_pruned_0_95 cf_ssdadas_bdd_360_480_0.95_6.3G 11.49 87 227.7
15 ssd_traffic_pruned_0_9 cf_ssdtraffic_360_480_0.9_11.6G 17.97 55.6 145.3
16 VPGnet_pruned_0_99 cf_VPGnet_caltechlane_480_640_0.99_2.5G 11.36 88 242.5
17 ssd_mobilenet_v2 cf_ssdmobilenetv2_bdd_360_480_6.57G 39.53 25.3 96.3
18 FPN cf_fpn_cityscapes_256_512_8.9G 29.91 33.4 126.8
19 SP_net cf_SPnet_aichallenger_224_128_0.54G 2.16 462.8 708.4
20 Openpose_pruned_0_3 cf_openpose_aichallenger_368_368_0.3_189.7G 274.42 3.6 10.9
21 densebox_320_320 cf_densebox_wider_320_320_0.49G 2.67 374.4 1132.2
22 densebox_640_360 cf_densebox_wider_360_640_1.11G 5.4 185 535.5
23 face_landmark cf_landmark_celeba_96_72_0.14G 1.15 866.6 1593.6
24 reid cf_reid_market1501_160_80_0.95G 2.72 367.4 696.6
25 multi_task cf_multitask_bdd_288_512_14.8G 27.89 35.8 104.7
26 yolov3_bdd dk_yolov3_bdd_288_512_53.7G 77.18 12.9 26.3
27 yolov3_adas_pruned_0_9 dk_yolov3_cityscapes_256_512_0.9_5.46G 11.27 88.7 224.8
28 yolov3_voc dk_yolov3_voc_416_416_65.42G 75.25 13.3 26.6
29 yolov2_voc dk_yolov2_voc_448_448_34G 36.7 27.2 56.1
30 yolov2_voc_pruned_0_66 dk_yolov2_voc_448_448_0.66_11.56G 15.2 65.8 153.6
31 yolov2_voc_pruned_0_71 dk_yolov2_voc_448_448_0.71_9.86G 13.23 75.6 181.1
32 yolov2_voc_pruned_0_77 dk_yolov2_voc_448_448_0.77_7.82G 11.4 87.7 217.5
33 ResNet20-face cf_facerec-resnet20_112_96_3.5G 5.74 174.2 308.3
34 ResNet64-face cf_facerec-resnet64_112_96_11G 13.32 75 144.7
35 FPN_Res18_segmentation cf_FPN-resnet18_EDD_320_320_45.3G 81.71 12.2 31.5
36 plate detection cf_plate-detection_320_320_0.49G 2 500 1862.4
37 plate recognition cf_plate-recognition_96_288_1.75G 7.73 129.2 376.7
38 retinaface cf_retinaface_wider_360_640_1.11G 7.89 126.7 474.8
39 face_quality cf_face-quality_80_60_61.68M 0.48 2060 6608.3
40 FPN-R18(light-weight) cf_FPN-resnet18_Endov_240_320_13.75G 29.38 34 129.3
41 Hourglass cf_hourglass-pe_mpii_256_256_10.2G 19.59 51 106
42 tiny-yolov3 dk_tiny-yolov3_416_416_5.46G 8.35 119.7 324.9
43 yolov4 dk_yolov4_coco_416_416_60.1G 73.22 13.6 28.1
44 Inception_resnet_v2 tf_inceptionresnetv2_imagenet_299_299_26.35G 42.86 23.3 43.7
45 Inception_v1 tf_inceptionv1_imagenet_224_224_3G 5.69 175.8 374.6
46 Inception_v3 tf_inceptionv3_imagenet_299_299_11.45G 16.6 60.2 117.9
47 Inception_v4 tf_inceptionv4_imagenet_299_299_24.55G 33.09 30.2 58.6
48 Mobilenet_v1 tf_mobilenetv1_0.25_imagenet_128_128_27.15M 0.92 1082.9 3781
49 Mobilenet_v1 tf_mobilenetv1_0.5_imagenet_160_160_150.07M 1.41 708.8 2135
50 Mobilenet_v1 tf_mobilenetv1_1.0_imagenet_224_224_1.14G 3.34 299 724.4
51 Mobilenet_v2 tf_mobilenetv2_1.0_imagenet_224_224_0.59G 3.98 251.1 573.5
52 Mobilenet_v2 tf_mobilenetv2_1.4_imagenet_224_224_1.16G 5.41 184.7 392.7
53 resnet_v1_50 tf_resnetv1_50_imagenet_224_224_6.97G 12.15 82.2 158.1
54 resnet_v1_101 tf_resnetv1_101_imagenet_224_224_14.4G 22.56 44.3 85.9
55 resnet_v1_152 tf_resnetv1_152_imagenet_224_224_21.83G 32.84 30.4 59
56 vgg_16 tf_vgg16_imagenet_224_224_30.96G 47 21.2 36.8
57 vgg_19 tf_vgg19_imagenet_224_224_39.28G 54.4 18.3 32.4
58 ssd_mobilenet_v1 tf_ssdmobilenetv1_coco_300_300_2.47G 10.08 99.1 258.1
59 ssd_mobilenet_v2 tf_ssdmobilenetv2_coco_300_300_3.75G 12.1 82.6 186.9
60 ssd_resnet_50_v1_fpn tf_ssdresnet50v1_fpn_coco_640_640_178.4G 353.32 2.8 5.1
61 yolov3_voc tf_yolov3_voc_416_416_65.63G 72.74 13.7 27.5
62 mlperf_ssd_resnet34 tf_mlperf_resnet34_coco_1200_1200_433G 561.31 1.8 5.1
63 Inception_v2 tf_inceptionv2_imagenet_224_224_3.88G 10.84 92.2 187
64 resnet_v2_50 tf_resnetv2_50_imagenet_299_299_13.1G_1.3 27.89 35.8 46.4
65 resnet_v2_101 tf_resnetv2_101_imagenet_299_299_26.78G 47.67 20.9 24.2
66 resnet_v2_152 tf_resnetv2_152_imagenet_299_299_40.47G 67.31 14.8 16.4
67 ssdlite_mobilenetv2 tf_ssdlite_mobilenetv2_coco_300_300_1.5G 9.75 102.5 258.9
68 ssd_inceptionv2 tf_ssdinceptionv2_coco_300_300_9.62G_1.3 28.91 34.6 83.9
69 Mobilenet_v2 tf_mobilenetv2_cityscapes_1024_2048_132.74G 615.68 1.6 4.5
70 mlperf_resnet50 tf_mlperf_resnet50_imagenet_224_224_8.19G 13.51 73.9 142.1
71 refinedet tf_refinedet_VOC_320_320_81.9G 93.44 10.7 25.4
72 mobilenet_edge_1.0 tf_mobilenetEdge1.0_imagenet_224_224_990M_1.3 4.94 202.4 445.1
73 mobilenet_edge_0.75 tf_mobilenetEdge0.75_imagenet_224_224_624M 4.13 242 556.5
74 refinedet_medical tf_RefineDet-Medical_EDD_320_320_9.83G 14.77 67.7 164.9
75 resnet50 tf2_resnet50_imagenet_224_224_7.76G 13.68 73.1 142.6
76 Mobilenet_v1 tf2_mobilenetv1_imagenet_224_224_1.15G 3.36 297.4 719.6
77 Inception_v3 tf2_inceptionv3_imagenet_299_299_11.5G 16.9 59.1 116.4
78 2d-unet tf2_2d-unet_nuclei_128_128_5.31G 6.24 160.2 334.7
79 ERFNet tf2_erfnet_cityscapes_512_1024_54G 145.2 6.9 23.4
80 ENet pt_ENet_cityscapes_512_1024_8.6G 110.99 9 38.3
81 SemanticFPN pt_SemanticFPN_cityscapes_256_512_10G 29.72 33.6 131.9
82 ResNet20-face pt_facerec-resnet20_mixed_112_96_3.5G 5.75 173.7 306.8
83 face quality pt_face-quality_80_60_61.68M 0.45 2202.5 6861
84 MT-resnet18 pt_MT-resnet18_mixed_320_512_13.65G NA NA NA
85 face_reid_large pt_facereid-large_96_96_515M 1.14 877.7 1966.5
86 face_reid_small pt_facereid-small_80_80_90M 0.53 1882.8 5511.7
87 person_reid pt_personreid-res50_market1501_256_128_5.4G 9.78 102.2 196.2
88 person_reid pt_personreid-res18_market1501_176_80_1.1G 2.69 370.9 682.9
89 pointpillars pt_pointpillars_kitti_12000_100_10.8G 52.48 19 47.9
90 salsanext pt_salsanext_semantic-kitti_64_2048_0.6_20.4G 189.89 5.3 19.9
91 FPN-R18 (light-weight) pt_FPN-resnet18_covid19-seg_352_352_22.7G 26.72 37.4 78.8
92 2d-unet pt_unet_chaos-CT_512_512_23.3G 46.78 21.4 57.4
93 Inception_v3 torchvision_inception_v3 16.41 60.9 119.4
94 SqueezeNet torchvision_squeezenet 4.73 211.4 669.9
95 resnet50 torchvision_resnet50 14.04 71.2 138

Performance on U50

Measured with Vitis AI 1.3 and Vitis AI Library 1.3

Click here to view details

The following table lists the performance number including end-to-end throughput and latency for each model on the Alveo U50 board with 6 DPUv3E kernels running at 300Mhz in Gen3x4:

No. Model Name Frequency(MHz) E2E throughput -fps(Multi Thread)
1 resnet50 cf_resnet50_imagenet_224_224_7.7G 300 647.1
2 resnet18 cf_resnet18_imagenet_224_224_3.65G 300 1484.2
3 Inception_v1 cf_inceptionv1_imagenet_224_224_3.16G 300 1203.4
4 Inception_v2 cf_inceptionv2_imagenet_224_224_4G 300 976.2
5 Inception_v3 cf_inceptionv3_imagenet_299_299_11.4G 300 413.2
6 Inception_v4 cf_inceptionv4_imagenet_299_299_24.5G 300 189.5
7 SqueezeNet cf_squeeze_imagenet_227_227_0.76G 300 3092.7
8 ssd_pedestrian_pruned_0_97 cf_ssdpedestrian_coco_360_640_0.97_5.9G 300 457.4
9 refinedet_baseline cf_refinedet_coco_360_480_123G 300 47.2
10 refinedet_pruned_0_8 cf_refinedet_coco_360_480_0.8_25G 300 168.8
11 refinedet_pruned_0_92 cf_refinedet_coco_360_480_0.92_10.10G 300 405.2
12 refinedet_pruned_0_96 cf_refinedet_coco_360_480_0.96_5.08G 300 571.9
13 ssd_adas_pruned_0_95 cf_ssdadas_bdd_360_480_0.95_6.3G 300 579.7
14 ssd_traffic_pruned_0_9 cf_ssdtraffic_360_480_0.9_11.6G 300 354.5
15 VPGnet_pruned_0_99 cf_VPGnet_caltechlane_480_640_0.99_2.5G 300 395.8
16 FPN cf_fpn_cityscapes_256_512_8.9G 300 440.7
17 SP_net cf_SPnet_aichallenger_224_128_0.54G 300 1339.0
18 Openpose_pruned_0_3 cf_openpose_aichallenger_368_368_0.3_189.7G 300 28.4
19 densebox_320_320 cf_densebox_wider_320_320_0.49G 300 1874.1
20 densebox_640_360 cf_densebox_wider_360_640_1.11G 300 821.9
21 face_landmark cf_landmark_celeba_96_72_0.14G 300 10057.4
22 reid cf_reid_market1501_160_80_0.95G 300 4010.7
23 multi_task cf_multitask_bdd_288_512_14.8G 300 321.1
24 yolov3_bdd dk_yolov3_bdd_288_512_53.7G 300 78.1
25 yolov3_adas_pruned_0_9 dk_yolov3_cityscapes_256_512_0.9_5.46G 300 675.8
26 yolov3_voc dk_yolov3_voc_416_416_65.42G 300 80.6
27 yolov2_voc dk_yolov2_voc_448_448_34G 300 163.9
28 yolov2_voc_pruned_0_66 dk_yolov2_voc_448_448_0.66_11.56G 300 413.1
29 yolov2_voc_pruned_0_71 dk_yolov2_voc_448_448_0.71_9.86G 300 484.1
30 yolov2_voc_pruned_0_77 dk_yolov2_voc_448_448_0.77_7.82G 300 589.7
31 ResNet20-face cf_facerec-resnet20_112_96_3.5G 300 1356.2
32 ResNet64-face cf_facerec-resnet64_112_96_11G 300 507.9
33 FPN_Res18_segmentation cf_FPN-resnet18_EDD_320_320_45.3G 300 103.2
34 plate detection cf_plate-detection_320_320_0.49G 300 4914.4
35 plate recognition cf_plate-recognition_96_288_1.75G 300 834.0
36 face_quality cf_face-quality_80_60_61.68M 300 16238.6
37 tiny-yolov3 dk_tiny-yolov3_416_416_5.46G 300 874.7
38 yolov4 dk_yolov4_coco_416_416_60.1G 300 84.1
39 Inception_resnet_v2 tf_inceptionresnetv2_imagenet_299_299_26.35G 300 186.4
40 Inception_v1 tf_inceptionv1_imagenet_224_224_3G 300 1240.5
41 Inception_v3 tf_inceptionv3_imagenet_299_299_11.45G 300 413.9
42 Inception_v4 tf_inceptionv4_imagenet_299_299_24.55G 300 189.4
43 resnet_v1_50 tf_resnetv1_50_imagenet_224_224_6.97G 300 712.2
44 resnet_v1_101 tf_resnetv1_101_imagenet_224_224_14.4G 300 367.6
45 resnet_v1_152 tf_resnetv1_152_imagenet_224_224_21.83G 300 246.0
46 vgg_16 tf_vgg16_imagenet_224_224_30.96G 300 161.5
47 vgg_19 tf_vgg19_imagenet_224_224_39.28G 300 134.8
48 ssd_resnet_50_v1_fpn tf_ssdresnet50v1_fpn_coco_640_640_178.4G 300 32.2
49 yolov3_voc tf_yolov3_voc_416_416_65.63G 300 80.1
50 mlperf_ssd_resnet34 tf_mlperf_resnet34_coco_1200_1200_433G 300 13.5
51 mlperf_resnet50 tf_mlperf_resnet50_imagenet_224_224_8.19G 300 563.4
52 refinedet tf_refinedet_VOC_320_320_81.9G 300 69.5
53 refinedet_medical tf_RefineDet-Medical_EDD_320_320_9.83G 300 422.0
54 resnet50 tf2_resnet50_imagenet_224_224_7.76G 300 583.6
55 Inception_v3 tf2_inceptionv3_imagenet_299_299_11.5G 300 372.5
56 2d-unet tf2_2d-unet_nuclei_128_128_5.31G 300 1064.7
57 ERFNet tf2_erfnet_cityscapes_512_1024_54G 300 49.3
58 ENet pt_ENet_cityscapes_512_1024_8.6G 300 74.9
59 SemanticFPN pt_SemanticFPN_cityscapes_256_512_10G 300 456.0
60 ResNet20-face pt_facerec-resnet20_mixed_112_96_3.5G 300 1235.9
61 face quality pt_face-quality_80_60_61.68M 300 16458.3
62 face_reid_large pt_facereid-large_96_96_515M 300 7468.9
63 face_reid_small pt_facereid-small_80_80_90M 300 17815.5
64 person_reid pt_personreid-res50_market1501_256_128_5.4G 300 916.4
65 person_reid pt_personreid-res18_market1501_176_80_1.1G 300 3541.5
66 salsanext pt_salsanext_semantic-kitti_64_2048_0.6_20.4G 300 129.8
67 FPN-R18 (light-weight) pt_FPN-resnet18_covid19-seg_352_352_22.7G 300 212.6
68 2d-unet pt_unet_chaos-CT_512_512_23.3G 300 58.3
69 Inception_v3 torchvision_inception_v3 300 411.3
70 SqueezeNet torchvision_squeezenet 300 1987.7
71 resnet50 torchvision_resnet50 300 624.4

Performance on U50 lv9e

Measured with Vitis AI 1.3 and Vitis AI Library 1.3

Click here to view details

The following table lists the performance number including end-to-end throughput and latency for each model on the Alveo U50 board with 9 DPUv3E kernels running at 275Mhz in Gen3x4:

No. Model Name Frequency(MHz) E2E throughput -fps(Multi Thread)
1 resnet50 cf_resnet50_imagenet_224_224_7.7G 275 789.1
2 resnet18 cf_resnet18_imagenet_224_224_3.65G 275 1944.9
3 Inception_v1 cf_inceptionv1_imagenet_224_224_3.16G 275 1643.2
4 Inception_v2 cf_inceptionv2_imagenet_224_224_4G 275 1337.0
5 Inception_v3 cf_inceptionv3_imagenet_299_299_11.4G 275 565.0
6 Inception_v4 cf_inceptionv4_imagenet_299_299_24.5G 275 259.6
7 SqueezeNet cf_squeeze_imagenet_227_227_0.76G 275 3926.6
8 ssd_pedestrian_pruned_0_97 cf_ssdpedestrian_coco_360_640_0.97_5.9G 275 728.0
9 refinedet_baseline cf_refinedet_coco_360_480_123G 275 75.8
10 refinedet_pruned_0_8 cf_refinedet_coco_360_480_0.8_25G 275 284.6
11 refinedet_pruned_0_92 cf_refinedet_coco_360_480_0.92_10.10G 275 606.9
12 refinedet_pruned_0_96 cf_refinedet_coco_360_480_0.96_5.08G 275 875.3
13 ssd_adas_pruned_0_95 cf_ssdadas_bdd_360_480_0.95_6.3G 275 896.1
14 ssd_traffic_pruned_0_9 cf_ssdtraffic_360_480_0.9_11.6G 275 587.3
15 VPGnet_pruned_0_99 cf_VPGnet_caltechlane_480_640_0.99_2.5G 275 582.5
16 FPN cf_fpn_cityscapes_256_512_8.9G 275 544.1
17 SP_net cf_SPnet_aichallenger_224_128_0.54G 275 2802.2
18 Openpose_pruned_0_3 cf_openpose_aichallenger_368_368_0.3_189.7G 275 44.0
19 densebox_320_320 cf_densebox_wider_320_320_0.49G 275 2317.5
20 densebox_640_360 cf_densebox_wider_360_640_1.11G 275 1020.0
21 face_landmark cf_landmark_celeba_96_72_0.14G 275 13350.3
22 reid cf_reid_market1501_160_80_0.95G 275 5250.9
23 multi_task cf_multitask_bdd_288_512_14.8G 275 391.1
24 yolov3_bdd dk_yolov3_bdd_288_512_53.7G 275 104.2
25 yolov3_adas_pruned_0_9 dk_yolov3_cityscapes_256_512_0.9_5.46G 275 851.2
26 yolov3_voc dk_yolov3_voc_416_416_65.42G 275 106.3
27 yolov2_voc dk_yolov2_voc_448_448_34G 275 225.0
28 yolov2_voc_pruned_0_66 dk_yolov2_voc_448_448_0.66_11.56G 275 566.9
29 yolov2_voc_pruned_0_71 dk_yolov2_voc_448_448_0.71_9.86G 275 664.3
30 yolov2_voc_pruned_0_77 dk_yolov2_voc_448_448_0.77_7.82G 275 807.3
31 ResNet20-face cf_facerec-resnet20_112_96_3.5G 275 1830.6
32 ResNet64-face cf_facerec-resnet64_112_96_11G 275 676.4
33 FPN_Res18_segmentation cf_FPN-resnet18_EDD_320_320_45.3G 275 140.0
34 plate detection cf_plate-detection_320_320_0.49G 275 5225.8
35 plate recognition cf_plate-recognition_96_288_1.75G 275 1281.5
36 face_quality cf_face-quality_80_60_61.68M 275 18211.6
37 tiny-yolov3 dk_tiny-yolov3_416_416_5.46G 275 1193.5
38 yolov4 dk_yolov4_coco_416_416_60.1G 275 111.4
39 Inception_resnet_v2 tf_inceptionresnetv2_imagenet_299_299_26.35G 275 238.0
40 Inception_v1 tf_inceptionv1_imagenet_224_224_3G 275 1705.9
41 Inception_v3 tf_inceptionv3_imagenet_299_299_11.45G 275 564.7
42 Inception_v4 tf_inceptionv4_imagenet_299_299_24.55G 275 259.7
43 resnet_v1_50 tf_resnetv1_50_imagenet_224_224_6.97G 275 885.8
44 resnet_v1_101 tf_resnetv1_101_imagenet_224_224_14.4G 275 459.8
45 resnet_v1_152 tf_resnetv1_152_imagenet_224_224_21.83G 275 306.6
46 vgg_16 tf_vgg16_imagenet_224_224_30.96G 275 222.0
47 vgg_19 tf_vgg19_imagenet_224_224_39.28G 275 185.3
48 ssd_resnet_50_v1_fpn tf_ssdresnet50v1_fpn_coco_640_640_178.4G 275 43.1
49 yolov3_voc tf_yolov3_voc_416_416_65.63G 275 106.4
50 mlperf_ssd_resnet34 tf_mlperf_resnet34_coco_1200_1200_433G 275 18.8
51 mlperf_resnet50 tf_mlperf_resnet50_imagenet_224_224_8.19G 275 689.5
52 refinedet tf_refinedet_VOC_320_320_81.9G 275 97.8
53 refinedet_medical tf_RefineDet-Medical_EDD_320_320_9.83G 275 579.6
54 resnet50 tf2_resnet50_imagenet_224_224_7.76G 275 711.6
55 Inception_v3 tf2_inceptionv3_imagenet_299_299_11.5G 275 510.1
56 2d-unet tf2_2d-unet_nuclei_128_128_5.31G 275 1453.6
57 ERFNet tf2_erfnet_cityscapes_512_1024_54G 275 65.4
58 ENet pt_ENet_cityscapes_512_1024_8.6G 275 88.6
59 SemanticFPN pt_SemanticFPN_cityscapes_256_512_10G 275 563.2
60 ResNet20-face pt_facerec-resnet20_mixed_112_96_3.5G 275 1668.6
61 face quality pt_face-quality_80_60_61.68M 275 17832.6
62 face_reid_large pt_facereid-large_96_96_515M 275 9523.9
63 face_reid_small pt_facereid-small_80_80_90M 275 19805.9
64 person_reid pt_personreid-res50_market1501_256_128_5.4G 275 1108.4
65 person_reid pt_personreid-res18_market1501_176_80_1.1G 275 4645.1
66 salsanext pt_salsanext_semantic-kitti_64_2048_0.6_20.4G 275 134.9
67 FPN-R18 (light-weight) pt_FPN-resnet18_covid19-seg_352_352_22.7G 275 286.5
68 2d-unet pt_unet_chaos-CT_512_512_23.3G 275 79.8
69 Inception_v3 torchvision_inception_v3 275 564.4
70 SqueezeNet torchvision_squeezenet 275 2543.0
71 resnet50 torchvision_resnet50 275 764.1

Performance on U50 lv10e

Measured with Vitis AI 1.3 and Vitis AI Library 1.3

Click here to view details

The following table lists the performance number including end-to-end throughput and latency for each model on the Alveo U50 board with 10 DPUv3E kernels running at 275Mhz in Gen3x4:

No. Model Name Frequency(MHz) E2E throughput -fps(Multi Thread)
1 resnet50 cf_resnet50_imagenet_224_224_7.7G 275 790.3
2 resnet18 cf_resnet18_imagenet_224_224_3.65G 275 1936.8
3 Inception_v1 cf_inceptionv1_imagenet_224_224_3.16G 275 1579.3
4 Inception_v2 cf_inceptionv2_imagenet_224_224_4G 275 1277.6
5 Inception_v3 cf_inceptionv3_imagenet_299_299_11.4G 275 488.3
6 Inception_v4 cf_inceptionv4_imagenet_299_299_24.5G 275 223.3
7 SqueezeNet cf_squeeze_imagenet_227_227_0.76G 275 3501.0
8 ssd_pedestrian_pruned_0_97 cf_ssdpedestrian_coco_360_640_0.97_5.9G 275 620.5
9 refinedet_baseline cf_refinedet_coco_360_480_123G 275 26.0
10 refinedet_pruned_0_8 cf_refinedet_coco_360_480_0.8_25G 275 249.4
11 refinedet_pruned_0_92 cf_refinedet_coco_360_480_0.92_10.10G 275 572.3
12 refinedet_pruned_0_96 cf_refinedet_coco_360_480_0.96_5.08G 275 783.0
13 ssd_adas_pruned_0_95 cf_ssdadas_bdd_360_480_0.95_6.3G 275 727.9
14 ssd_traffic_pruned_0_9 cf_ssdtraffic_360_480_0.9_11.6G 275 507.7
15 VPGnet_pruned_0_99 cf_VPGnet_caltechlane_480_640_0.99_2.5G 275 573.7
16 FPN cf_fpn_cityscapes_256_512_8.9G 275 567.9
17 SP_net cf_SPnet_aichallenger_224_128_0.54G 275 1929.1
18 Openpose_pruned_0_3 cf_openpose_aichallenger_368_368_0.3_189.7G 275 33.3
19 densebox_320_320 cf_densebox_wider_320_320_0.49G 275 2394.8
20 densebox_640_360 cf_densebox_wider_360_640_1.11G 275 1041.3
21 face_landmark cf_landmark_celeba_96_72_0.14G 275 14694.9
22 reid cf_reid_market1501_160_80_0.95G 275 5797.1
23 multi_task cf_multitask_bdd_288_512_14.8G 275 368.1
24 yolov3_bdd dk_yolov3_bdd_288_512_53.7G 275 100.9
25 yolov3_adas_pruned_0_9 dk_yolov3_cityscapes_256_512_0.9_5.46G 275 787.1
26 yolov3_voc dk_yolov3_voc_416_416_65.42G 275 102.6
27 yolov2_voc dk_yolov2_voc_448_448_34G 275 220.8
28 yolov2_voc_pruned_0_66 dk_yolov2_voc_448_448_0.66_11.56G 275 555.2
29 yolov2_voc_pruned_0_71 dk_yolov2_voc_448_448_0.71_9.86G 275 651.6
30 yolov2_voc_pruned_0_77 dk_yolov2_voc_448_448_0.77_7.82G 275 790.6
31 ResNet20-face cf_facerec-resnet20_112_96_3.5G 275 2030.8
32 ResNet64-face cf_facerec-resnet64_112_96_11G 275 751.0
33 FPN_Res18_segmentation cf_FPN-resnet18_EDD_320_320_45.3G 275 139.5
34 plate detection cf_plate-detection_320_320_0.49G 275 5160.8
35 plate recognition cf_plate-recognition_96_288_1.75G 275 1070.1
36 face_quality cf_face-quality_80_60_61.68M 275 19352.6
37 tiny-yolov3 dk_tiny-yolov3_416_416_5.46G 275 1317.2
38 yolov4 dk_yolov4_coco_416_416_60.1G 275 119.8
39 Inception_resnet_v2 tf_inceptionresnetv2_imagenet_299_299_26.35G 275 210.1
40 Inception_v1 tf_inceptionv1_imagenet_224_224_3G 275 1643.1
41 Inception_v3 tf_inceptionv3_imagenet_299_299_11.45G 275 488.5
42 Inception_v4 tf_inceptionv4_imagenet_299_299_24.55G 275 223.5
43 resnet_v1_50 tf_resnetv1_50_imagenet_224_224_6.97G 275 887.3
44 resnet_v1_101 tf_resnetv1_101_imagenet_224_224_14.4G 275 460.6
45 resnet_v1_152 tf_resnetv1_152_imagenet_224_224_21.83G 275 307.0
46 vgg_16 tf_vgg16_imagenet_224_224_30.96G 275 225.1
47 vgg_19 tf_vgg19_imagenet_224_224_39.28G 275 187.3
48 ssd_resnet_50_v1_fpn tf_ssdresnet50v1_fpn_coco_640_640_178.4G 275 41.2
49 yolov3_voc tf_yolov3_voc_416_416_65.63G 275 104.3
50 mlperf_ssd_resnet34 tf_mlperf_resnet34_coco_1200_1200_433G 275 18.5
51 mlperf_resnet50 tf_mlperf_resnet50_imagenet_224_224_8.19G 275 765.7
52 refinedet tf_refinedet_VOC_320_320_81.9G 275 105.1
53 refinedet_medical tf_RefineDet-Medical_EDD_320_320_9.83G 275 630.8
54 resnet50 tf2_resnet50_imagenet_224_224_7.76G 275 790.3
55 Inception_v3 tf2_inceptionv3_imagenet_299_299_11.5G 275 554.9
56 2d-unet tf2_2d-unet_nuclei_128_128_5.31G 275 1577.9
57 ERFNet tf2_erfnet_cityscapes_512_1024_54G 275 64.5
58 ENet pt_ENet_cityscapes_512_1024_8.6G 275 85.5
59 SemanticFPN pt_SemanticFPN_cityscapes_256_512_10G 275 591.9
60 ResNet20-face pt_facerec-resnet20_mixed_112_96_3.5G 275 1850.8
61 face quality pt_face-quality_80_60_61.68M 275 19209.5
62 face_reid_large pt_facereid-large_96_96_515M 275 10486.7
63 face_reid_small pt_facereid-small_80_80_90M 275 20809.8
64 person_reid pt_personreid-res50_market1501_256_128_5.4G 275 1230.2
65 person_reid pt_personreid-res18_market1501_176_80_1.1G 275 5090.4
66 salsanext pt_salsanext_semantic-kitti_64_2048_0.6_20.4G 275 137.6
67 FPN-R18 (light-weight) pt_FPN-resnet18_covid19-seg_352_352_22.7G 275 313.2
68 2d-unet pt_unet_chaos-CT_512_512_23.3G 275 70.6
69 Inception_v3 torchvision_inception_v3 275 488.5
70 SqueezeNet torchvision_squeezenet 275 2334.1
71 resnet50 torchvision_resnet50 275 765.6

Performance on U200

Measured with Vitis AI 1.3 and Vitis AI Library 1.3

Click here to view details

The following table lists the performance number including end-to-end throughput and latency for each model on the Alveo U200 board with 2 DPUCADF8H kernels running at 300Mhz:

No. Model Name E2E latency (ms) Thread num =20 E2E throughput -fps(Multi Thread)
1 resnet50 cf_resnet50_imagenet_224_224_7.7G 3.8 1054
2 Inception_v1 tf_inceptionv1_imagenet_224_224_3G 2.2 1834
3 Inception_v3 tf_inceptionv3_imagenet_299_299_11.45G 18.4 218
4 resnetv1_50 tf_resnetv1_50_imagenet_224_224_6.97G 4.2 947
5 resnetv1_101 tf_resnetv1_101_imagenet_224_224_14.4G 8.5 472
6 resnetv1_152 tf_resnetv1_152_imagenet_224_224_21.83G 12.7 316

The following table lists the performance number including end-to-end throughput and latency for each model on the Alveo U200 board with 2 DPUCADX8G kernels running at 350Mhz with xilinx_u200_xdma_201830_2 shell:

No. Model Name E2E latency (ms) Thread num =1 E2E throughput -fps(Single Thread) E2E throughput -fps(Multi Thread)
1 resnet50 cf_resnet50_imagenet_224_224_7.7G 2.13 470.6 561.3
2 resnet18 cf_resnet18_imagenet_224_224_3.65G 2.08 481 1157.8
3 Inception_v1 cf_inceptionv1_imagenet_224_224_3.16G 2.39 418.5 1449.4
4 Inception_v2 cf_inceptionv2_imagenet_224_224_4G 2.11 475.1 1129.2
5 Inception_v3 cf_inceptionv3_imagenet_299_299_11.4G 15.67 63.8 371.6
6 Inception_v4 cf_inceptionv4_imagenet_299_299_24.5G 10.77 92.8 221.2
7 SqueezeNet cf_squeeze_imagenet_227_227_0.76G 10.99 91 1157.1
8 densebox_320_320 cf_densebox_wider_320_320_0.49G 8.69 115.1 667.9
9 yolov3_bdd dk_yolov3_bdd_288_512_53.7G 14.53 68.8 75.9
10 yolov3_voc dk_yolov3_voc_416_416_65.42G 19.90 50.3 82.1

Performance on U250

Measured with Vitis AI 1.3 and Vitis AI Library 1.3

Click here to view details

The following table lists the performance number including end-to-end throughput and latency for each model on the Alveo U250 board with 4 DPUCADF8H kernels running at 300Mhz:

No. Model Name E2E latency (ms) Thread num =20 E2E throughput -fps(Multi Thread)
1 resnet50 cf_resnet50_imagenet_224_224_7.7G 1.94 2134.8
2 Inception_v1 tf_inceptionv1_imagenet_224_224_3G 1.10 3631.7
3 Inception_v3 tf_inceptionv3_imagenet_299_299_11.45G 9.20 434.9
4 resnetv1_50 tf_resnetv1_50_imagenet_224_224_6.97G 2.13 1881.6
5 resnetv1_101 tf_resnetv1_101_imagenet_224_224_14.4G 4.24 941.9
6 resnetv1_152 tf_resnetv1_152_imagenet_224_224_21.83G 6.35 630.3

The following table lists the performance number including end-to-end throughput and latency for each model on the Alveo U250 board with 4 DPUCADX8G kernels running at 350Mhz with xilinx_u250_xdma_201830_1 shell:

No. Model Name E2E latency (ms) Thread num =1 E2E throughput -fps(Single Thread) E2E throughput -fps(Multi Thread)
1 resnet50 cf_resnet50_imagenet_224_224_7.7G 1.68 595.5 1223.95
2 resnet18 cf_resnet18_imagenet_224_224_3.65G 1.67 600.5 2422.5
3 Inception_v1 cf_inceptionv1_imagenet_224_224_3.16G 1.93 517.1 4059.8
4 Inception_v2 cf_inceptionv2_imagenet_224_224_4G 1.65 607.8 23221
5 Inception_v3 cf_inceptionv3_imagenet_299_299_11.4G 6.18 161.8 743.8
6 Inception_v4 cf_inceptionv4_imagenet_299_299_24.5G 5.77 173.4 452.4
7 SqueezeNet cf_squeeze_imagenet_227_227_0.76G 5.44 183.7 2349.7
8 densebox_320_320 cf_densebox_wider_320_320_0.49G 7.43 167.2 898.5
9 yolov3_bdd dk_yolov3_bdd_288_512_53.7G 14.27 70.1 146.7
10 yolov3_voc dk_yolov3_voc_416_416_65.42G 9.46 105.7 139.4

Performance on U280

Measured with Vitis AI 1.3 and Vitis AI Library 1.3

Click here to view details

The following table lists the performance number including end-to-end throughput and latency for each model on the Alveo U280 board with 14 DPUv3E kernels running at 300Mhz in Gen3x4:

No. Model Name Frequency(MHz) E2E throughput -fps(Multi Thread)
1 resnet50 cf_resnet50_imagenet_224_224_7.7G 300 801.2
2 resnet18 cf_resnet18_imagenet_224_224_3.65G 300 1649.2
3 Inception_v1 cf_inceptionv1_imagenet_224_224_3.16G 300 1235.5
4 Inception_v2 cf_inceptionv2_imagenet_224_224_4G 300 1055.4
5 Inception_v3 cf_inceptionv3_imagenet_299_299_11.4G 300 416.3
6 Inception_v4 cf_inceptionv4_imagenet_299_299_24.5G 300 189.9
7 SqueezeNet cf_squeeze_imagenet_227_227_0.76G 300 2899.2
8 ssd_pedestrian_pruned_0_97 cf_ssdpedestrian_coco_360_640_0.97_5.9G 300 434.9
9 refinedet_baseline cf_refinedet_coco_360_480_123G 300 46.4
10 refinedet_pruned_0_8 cf_refinedet_coco_360_480_0.8_25G 300 171.2
11 refinedet_pruned_0_92 cf_refinedet_coco_360_480_0.92_10.10G 300 427.2
12 refinedet_pruned_0_96 cf_refinedet_coco_360_480_0.96_5.08G 300 591
13 ssd_adas_pruned_0_95 cf_ssdadas_bdd_360_480_0.95_6.3G 300 570
14 ssd_traffic_pruned_0_9 cf_ssdtraffic_360_480_0.9_11.6G 300 310.2
15 VPGnet_pruned_0_99 cf_VPGnet_caltechlane_480_640_0.99_2.5G 300 581.5
16 FPN cf_fpn_cityscapes_256_512_8.9G 300 374.9
17 SP_net cf_SPnet_aichallenger_224_128_0.54G 300 2449.3
18 Openpose_pruned_0_3 cf_openpose_aichallenger_368_368_0.3_189.7G 300 35.3
19 densebox_320_320 cf_densebox_wider_320_320_0.49G 300 2924.8
20 densebox_640_360 cf_densebox_wider_360_640_1.11G 300 1300.1
21 face_landmark cf_landmark_celeba_96_72_0.14G 300 12371
22 reid cf_reid_market1501_160_80_0.95G 300 4765.6
23 multi_task cf_multitask_bdd_288_512_14.8G 300 226.6
24 yolov3_bdd dk_yolov3_bdd_288_512_53.7G 300 92
25 yolov3_adas_pruned_0_9 dk_yolov3_cityscapes_256_512_0.9_5.46G 300 790.3
26 yolov3_voc dk_yolov3_voc_416_416_65.42G 300 96.5
27 yolov2_voc dk_yolov2_voc_448_448_34G 300 196.9
28 yolov2_voc_pruned_0_66 dk_yolov2_voc_448_448_0.66_11.56G 300 495.7
29 yolov2_voc_pruned_0_71 dk_yolov2_voc_448_448_0.71_9.86G 300 579.7
30 yolov2_voc_pruned_0_77 dk_yolov2_voc_448_448_0.77_7.82G 300 691.7
31 ResNet20-face cf_facerec-resnet20_112_96_3.5G 300 1598.4
32 ResNet64-face cf_facerec-resnet64_112_96_11G 300 580.6
33 FPN_Res18_segmentation cf_FPN-resnet18_EDD_320_320_45.3G 300 107.4
34 plate detection cf_plate-detection_320_320_0.49G 300 4288.6
35 plate recognition cf_plate-recognition_96_288_1.75G 300 1100.8
36 face_quality cf_face-quality_80_60_61.68M 300 23507
37 tiny-yolov3 dk_tiny-yolov3_416_416_5.46G 300 920.2
38 yolov4 dk_yolov4_coco_416_416_60.1G 300 95.3
39 Inception_resnet_v2 tf_inceptionresnetv2_imagenet_299_299_26.35G 300 186.4
40 Inception_v1 tf_inceptionv1_imagenet_224_224_3G 300 1293.2
41 Inception_v3 tf_inceptionv3_imagenet_299_299_11.45G 300 414.8
42 Inception_v4 tf_inceptionv4_imagenet_299_299_24.55G 300 189.6
43 resnet_v1_50 tf_resnetv1_50_imagenet_224_224_6.97G 300 751.7
44 resnet_v1_101 tf_resnetv1_101_imagenet_224_224_14.4G 300 390.1
45 resnet_v1_152 tf_resnetv1_152_imagenet_224_224_21.83G 300 259.9
46 vgg_16 tf_vgg16_imagenet_224_224_30.96G 300 183.7
47 vgg_19 tf_vgg19_imagenet_224_224_39.28G 300 153.9
48 ssd_resnet_50_v1_fpn tf_ssdresnet50v1_fpn_coco_640_640_178.4G 300 30.7
49 yolov3_voc tf_yolov3_voc_416_416_65.63G 300 95.7
50 mlperf_ssd_resnet34 tf_mlperf_resnet34_coco_1200_1200_433G 300 11.2
51 mlperf_resnet50 tf_mlperf_resnet50_imagenet_224_224_8.19G 300 647.3
52 refinedet tf_refinedet_VOC_320_320_81.9G 300 85.2
53 refinedet_medical tf_RefineDet-Medical_EDD_320_320_9.83G 300 498.7
54 resnet50 tf2_resnet50_imagenet_224_224_7.76G 300 668.2
55 Inception_v3 tf2_inceptionv3_imagenet_299_299_11.5G 300 437.5
56 2d-unet tf2_2d-unet_nuclei_128_128_5.31G 300 102.5
57 ERFNet tf2_erfnet_cityscapes_512_1024_54G 300 54.1
58 ENet pt_ENet_cityscapes_512_1024_8.6G 300 88.5
59 SemanticFPN pt_SemanticFPN_cityscapes_256_512_10G 300 408.9
60 ResNet20-face pt_facerec-resnet20_mixed_112_96_3.5G 300 1598.4
61 face quality pt_face-quality_80_60_61.68M 300 23286
62 face_reid_large pt_facereid-large_96_96_515M 300 9164.1
63 face_reid_small pt_facereid-small_80_80_90M 300 20088.5
64 person_reid pt_personreid-res50_market1501_256_128_5.4G 300 1044.9
65 person_reid pt_personreid-res18_market1501_176_80_1.1G 300 4521.9
66 salsanext pt_salsanext_semantic-kitti_64_2048_0.6_20.4G 300 105
67 FPN-R18 (light-weight) pt_FPN-resnet18_covid19-seg_352_352_22.7G 300 221.2
68 2d-unet pt_unet_chaos-CT_512_512_23.3G 300 102.5
69 Inception_v3 torchvision_inception_v3 300 415.5
70 SqueezeNet torchvision_squeezenet 300 1780.7
71 resnet50 torchvision_resnet50 300 776.2

Performance on VCK190

Measured with Vitis AI 1.3 and Vitis AI Library 1.3

Click here to view details

The following table lists the performance number including end-to-end throughput and latency for each model on the Versal VCK190 board with 96 AIEs running at 1333Mhz:

No. Model Name Frequency(MHz) E2E throughput -fps(Multi Thread)
1 resnet50 cf_resnet50_imagenet_224_224_7.7G 300 1295.9
2 resnet18 cf_resnet18_imagenet_224_224_3.65G 300 2831.2
3 Inception_v1 cf_inceptionv1_imagenet_224_224_3.16G 300 1501.7
4 Inception_v2 cf_inceptionv2_imagenet_224_224_4G 300 1089.9
5 Inception_v3 cf_inceptionv3_imagenet_299_299_11.4G 300 572.6
6 Inception_v4 cf_inceptionv4_imagenet_299_299_24.5G 300 278.8
7 SqueezeNet cf_squeeze_imagenet_227_227_0.76G 300 1794.7
8 ssd_pedestrian_pruned_0_97 cf_ssdpedestrian_coco_360_640_0.97_5.9G 300 673.6
9 refinedet_baseline cf_refinedet_coco_360_480_123G 300 138.8
10 refinedet_pruned_0_8 cf_refinedet_coco_360_480_0.8_25G 300 385.9
11 refinedet_pruned_0_92 cf_refinedet_coco_360_480_0.92_10.10G 300 593.6
12 refinedet_pruned_0_96 cf_refinedet_coco_360_480_0.96_5.08G 300 796.9
13 ssd_adas_pruned_0_95 cf_ssdadas_bdd_360_480_0.95_6.3G 300 720.4
14 ssd_traffic_pruned_0_9 cf_ssdtraffic_360_480_0.9_11.6G 300 622.8
15 VPGnet_pruned_0_99 cf_VPGnet_caltechlane_480_640_0.99_2.5G 300 463.8
16 FPN cf_fpn_cityscapes_256_512_8.9G 300 196.6
17 SP_net cf_SPnet_aichallenger_224_128_0.54G 300 3408.7
18 Openpose_pruned_0_3 cf_openpose_aichallenger_368_368_0.3_189.7G 300 43.6
19 densebox_320_320 cf_densebox_wider_320_320_0.49G 300 2146.6
20 densebox_640_360 cf_densebox_wider_360_640_1.11G 300 1040
21 face_landmark cf_landmark_celeba_96_72_0.14G 300 10031.2
22 reid cf_reid_market1501_160_80_0.95G 300 4825.8
23 multi_task cf_multitask_bdd_288_512_14.8G 300 197.2
24 yolov3_bdd dk_yolov3_bdd_288_512_53.7G 300 185
25 yolov3_adas_pruned_0_9 dk_yolov3_cityscapes_256_512_0.9_5.46G 300 866.8
26 yolov3_voc dk_yolov3_voc_416_416_65.42G 300 185
27 yolov2_voc dk_yolov2_voc_448_448_34G 300 439.2
28 yolov2_voc_pruned_0_66 dk_yolov2_voc_448_448_0.66_11.56G 300 879.8
29 yolov2_voc_pruned_0_71 dk_yolov2_voc_448_448_0.71_9.86G 300 978
30 yolov2_voc_pruned_0_77 dk_yolov2_voc_448_448_0.77_7.82G 300 1093.9
31 ResNet20-face cf_facerec-resnet20_112_96_3.5G 300 2561.6
32 ResNet64-face cf_facerec-resnet64_112_96_11G 300 1260.4
33 FPN_Res18_segmentation cf_FPN-resnet18_EDD_320_320_45.3G 300 75.5
34 plate detection cf_plate-detection_320_320_0.49G 300 2518.1
35 plate recognition cf_plate-recognition_96_288_1.75G 300 716.3
36 face_quality cf_face-quality_80_60_61.68M 300 18695.5
37 tiny-yolov3 dk_tiny-yolov3_416_416_5.46G 300 1322.7
38 yolov4 dk_yolov4_coco_416_416_60.1G 300 147.4
39 Inception_resnet_v2 tf_inceptionresnetv2_imagenet_299_299_26.35G 300 288.3
40 Inception_v1 tf_inceptionv1_imagenet_224_224_3G 300 1523.5
41 Inception_v3 tf_inceptionv3_imagenet_299_299_11.45G 300 572.7
42 Inception_v4 tf_inceptionv4_imagenet_299_299_24.55G 300 278.7
43 resnet_v1_50 tf_resnetv1_50_imagenet_224_224_6.97G 300 1385.5
44 resnet_v1_101 tf_resnetv1_101_imagenet_224_224_14.4G 300 779
45 resnet_v1_152 tf_resnetv1_152_imagenet_224_224_21.83G 300 546.5
46 vgg_16 tf_vgg16_imagenet_224_224_30.96G 300 336
47 vgg_19 tf_vgg19_imagenet_224_224_39.28G 300 302
48 ssd_resnet_50_v1_fpn tf_ssdresnet50v1_fpn_coco_640_640_178.4G 300 11.8
49 yolov3_voc tf_yolov3_voc_416_416_65.63G 300 184
50 mlperf_ssd_resnet34 tf_mlperf_resnet34_coco_1200_1200_433G 300 22.8
51 mlperf_resnet50 tf_mlperf_resnet50_imagenet_224_224_8.19G 300 1288.3
52 refinedet tf_refinedet_VOC_320_320_81.9G 300 175.7
53 refinedet_medical tf_RefineDet-Medical_EDD_320_320_9.83G 300 747.3
54 resnet50 tf2_resnet50_imagenet_224_224_7.76G 300 1275.1
55 Inception_v3 tf2_inceptionv3_imagenet_299_299_11.5G 300 603.6
56 2d-unet tf2_2d-unet_nuclei_128_128_5.31G 300 1746.5
57 ERFNet tf2_erfnet_cityscapes_512_1024_54G 300 48.9
58 ENet pt_ENet_cityscapes_512_1024_8.6G 300 49.7
59 SemanticFPN pt_SemanticFPN_cityscapes_256_512_10G 300 196.9
60 ResNet20-face pt_facerec-resnet20_mixed_112_96_3.5G 300 2561.4
61 face quality pt_face-quality_80_60_61.68M 300 19593.9
62 face_reid_large pt_facereid-large_96_96_515M 300 11841.2
63 face_reid_small pt_facereid-small_80_80_90M 300 16828.4
64 person_reid pt_personreid-res50_market1501_256_128_5.4G 300 1804.7
65 person_reid pt_personreid-res18_market1501_176_80_1.1G 300 4666.3
66 salsanext pt_salsanext_semantic-kitti_64_2048_0.6_20.4G 300 21.7
67 FPN-R18 (light-weight) pt_FPN-resnet18_covid19-seg_352_352_22.7G 300 330.4
68 2d-unet pt_unet_chaos-CT_512_512_23.3G 300 148.7
69 Inception_v3 torchvision_inception_v3 300 567.9
70 SqueezeNet torchvision_squeezenet 300 1677.4
71 resnet50 torchvision_resnet50 300 1267.9

Performance on Ultra96

The performance number shown below was measured with the previous AI SDK v2.0.4 on Ultra96 v1. The Vitis platform of Ultra96 v2 has not been released yet. So the performance numbers are therefore not reported for this Model Zoo release.

Click here to view details

The following table lists the performance number including end-to-end throughput and latency for each model on the Ultra96 board with a 1 * B1600 @ 287MHz V1.4.0 DPU configuration:

Note: The original power supply of Ultra96 is not designed for high performance AI workload. The board may occasionally hang to run few models, When multi-thread is used. For such situations, NA is specified in the following table.

No. Model Name E2E latency (ms) Thread num =1 E2E throughput -fps(Single Thread) E2E throughput -fps(Multi Thread)
1 resnet50 cf_resnet50_imagenet_224_224_7.7G 30.8 32.4667 33.4667
2 Inception_v1 cf_inceptionv1_imagenet_224_224_3.16G 13.98 71.55 75.0667
3 Inception_v2 cf_inceptionv2_imagenet_224_224_4G 17.16 58.2667 61.2833
4 Inception_v3 cf_inceptionv3_imagenet_299_299_11.4G 44.05 22.7 23.4333
5 mobileNet_v2 cf_mobilenetv2_imagenet_224_224_0.59G 7.34 136.183 NA
6 tf_resnet50 tf_resnet50_imagenet_224_224_6.97G 28.02 35.6833 36.6
7 tf_inception_v1 tf_inceptionv1_imagenet_224_224_3G 16.96 58.9667 61.2833
8 tf_mobilenet_v2 tf_mobilenetv2_imagenet_224_224_1.17G 10.17 98.3 104.25
9 ssd_adas_pruned_0.95 cf_ssdadas_bdd_360_480_0.95_6.3G 24.3 41.15 46.2
10 ssd_pedestrian_pruned_0.97 cf_ssdpedestrian_coco_360_640_0.97_5.9G 23.29 42.9333 50.8
11 ssd_traffic_pruned_0.9 cf_ssdtraffic_360_480_0.9_11.6G 35.5 28.1667 31.8
12 ssd_mobilnet_v2 cf_ssdmobilenetv2_bdd_360_480_6.57G 60.79 16.45 27.8167
13 tf_ssd_voc tf_ssd_voc_300_300_64.81G 186.92 5.35 5.81667
14 densebox_320_320 cf_densebox_wider_320_320_0.49G 4.17 239.883 334.167
15 densebox_360_640 cf_densebox_wider_360_640_1.11G 8.55 117 167.2
16 yolov3_adas_prune_0.9 dk_yolov3_cityscapes_256_512_0.9_5.46G 22.79 43.8833 49.6833
17 yolov3_voc dk_yolov3_voc_416_416_65.42G 185.19 5.4 5.53
18 tf_yolov3_voc tf_yolov3_voc_416_416_65.63G 199.34 5.01667 5.1
19 refinedet_pruned_0.8 cf_refinedet_coco_360_480_0.8_25G 66.37 15.0667 NA
20 refinedet_pruned_0.92 cf_refinedet_coco_360_480_0.92_10.10G 32.17 31.0883 33.6667
21 refinedet_pruned_0.96 cf_refinedet_coco_360_480_0.96_5.08G 20.29 49.2833 55.25
22 FPN cf_fpn_cityscapes_256_512_8.9G 36.34 27.5167 NA
23 VPGnet_pruned_0.99 cf_VPGnet_caltechlane_480_640_0.99_2.5G 13.9 71.9333 NA
24 SP-net cf_SPnet_aichallenger_224_128_0.54G 3.82 261.55 277.4
25 Openpose_pruned_0.3 cf_openpose_aichallenger_368_368_0.3_189.7G 560.75 1.78333 NA
26 yolov2_voc dk_yolov2_voc_448_448_34G 118.11 8.46667 8.9
27 yolov2_voc_pruned_0.66 dk_yolov2_voc_448_448_0.66_11.56G 37.5 26.6667 30.65
28 yolov2_voc_pruned_0.71 dk_yolov2_voc_448_448_0.71_9.86G 30.99 32.2667 38.35
29 yolov2_voc_pruned_0.77 dk_yolov2_voc_448_448_0.77_7.82G 26.29 38.03333 46.8333
30 Inception-v4 cf_inceptionv4_imagenet_299_299_24.5G 88.76 11.2667 11.5333
31 SqueezeNet cf_squeeze_imagenet_227_227_0.76G 5.96 167.867 283.583
32 face_landmark cf_landmark_celeba_96_72_0.14G 2.95 339.183 347.633
33 reid cf_reid_market1501_160_80_0.95G 6.28 159.15 166.633
34 yolov3_bdd dk_yolov3_bdd_288_512_53.7G 193.55 5.16667 5.31667
35 tf_mobilenet_v1 tf_mobilenetv1_imagenet_224_224_1.14G 5.97 167.567 186.55
36 resnet18 cf_resnet18_imagenet_224_224_3.65G 13.47 74.2167 77.8167
37 resnet18_wide tf_resnet18_imagenet_224_224_28G 97.72 10.2333 10.3833

Contributing

We welcome community contributions. When contributing to this repository, first discuss the change you wish to make via:

You can also submit a pull request with details on how to improve the product. Prior to submitting your pull request, ensure that you can build the product and run all the demos with your patch. In case of a larger feature, provide a relevant demo.

License

Xilinx AI Model Zoo is licensed under Apache License Version 2.0. By contributing to the project, you agree to the license and copyright terms therein and release your contribution under these terms.


Copyright© 2019 Xilinx