diff --git a/models/image_classification/mobilenet_v2_1.0_224/tflite_uint8/README.md b/models/image_classification/mobilenet_v2_1.0_224/tflite_uint8/README.md index 42a903a..6c333ad 100644 --- a/models/image_classification/mobilenet_v2_1.0_224/tflite_uint8/README.md +++ b/models/image_classification/mobilenet_v2_1.0_224/tflite_uint8/README.md @@ -1,7 +1,7 @@ # MobileNet v2 1.0 224 UINT8 ## Description -MobileNet v2 is an efficient image classification neural network, targeted for mobile and embedded usecases. This model is trained on the ImageNet dataset and is quantized to the UINT8 datatype by Google. +MobileNet v2 is an efficient image classification neural network, targeted for mobile and embedded use cases. This model is trained on the ImageNet dataset and is quantized to the UINT8 datatype by Google. ## License [Apache-2.0](https://spdx.org/licenses/Apache-2.0.html) @@ -13,7 +13,7 @@ MobileNet v2 is an efficient image classification neural network, targeted for m | SHA-1 Hash | 275c9649cb395139103b6d15f53011b1b949ad00 | | Size (Bytes) | 3577760 | | Provenance | https://tfhub.dev/tensorflow/lite-model/mobilenet_v2_1.0_224_quantized/1/default/1 | -| Paper | https://arxiv.org/pdf/1704.04861.pdf | +| Paper | https://arxiv.org/pdf/1801.04381.pdf | ## Accuracy Dataset: Ilsvrc 2012 diff --git a/models/image_classification/mobilenet_v2_1.0_224/tflite_uint8/definition.yaml b/models/image_classification/mobilenet_v2_1.0_224/tflite_uint8/definition.yaml new file mode 100644 index 0000000..b4f73ae --- /dev/null +++ b/models/image_classification/mobilenet_v2_1.0_224/tflite_uint8/definition.yaml @@ -0,0 +1,38 @@ +benchmark: + ILSVRC 2012: + top_1_accuracy: 0.708 +description: MobileNet v2 is an efficient image classification neural network, targeted + for mobile and embedded use cases. This model is trained on the ImageNet dataset + and is quantized to the UINT8 datatype by Google. 
+license: Apache-2.0 +network: + file_size_bytes: 3577760 + filename: mobilenet_v2_1.0_224_quantized_1_default_1.tflite + framework: TensorFlow Lite + hash: + algorithm: sha1 + value: 275c9649cb395139103b6d15f53011b1b949ad00 + provenance: https://tfhub.dev/tensorflow/lite-model/mobilenet_v2_1.0_224_quantized/1/default/1 +network_parameters: + input_nodes: + - description: Single 224x224 RGB image with UINT8 values between 0 and 255 + name: input + shape: + - 1 + - 224 + - 224 + - 3 + output_nodes: + - description: Per-class confidence for 1001 ImageNet classes + name: output + shape: + - 1 + - 1001 +operators: + TensorFlow Lite: + - ADD + - AVERAGE_POOL_2D + - CONV_2D + - DEPTHWISE_CONV_2D + - RESHAPE +paper: https://arxiv.org/pdf/1801.04381.pdf diff --git a/models/keyword_spotting/cnn_large/tflite_int8/definition.yaml b/models/keyword_spotting/cnn_large/tflite_int8/definition.yaml new file mode 100644 index 0000000..fcc2516 --- /dev/null +++ b/models/keyword_spotting/cnn_large/tflite_int8/definition.yaml @@ -0,0 +1,39 @@ +benchmark: + Google Speech Commands test set: + Accuracy: 92.92% +description: 'This is a fully quantized version (asymmetrical int8) of the CNN Large + model developed by Arm, with training checkpoints, from the Hello Edge paper. Code + to recreate this model can be found here: https://github.com/ARM-software/ML-examples/tree/master/tflu-kws-cortex-m' +license: +- Apache-2.0 +network: + file_size_bytes: 486560 + filename: cnn_l_quantized.tflite + framework: TensorFlow Lite + hash: + algorithm: sha1 + value: 59e6986c3eca496fa3d54176ac66bb7dc9ff36e0 + provenance: https://github.com/ARM-software/ML-examples/tree/master/tflu-kws-cortex-m +network_parameters: + input_nodes: + - description: The input is a processed MFCCs of shape (1, 490) + name: input + shape: + - 1 + - 490 + output_nodes: + - description: The probability on 12 keywords. 
+ name: Identity + shape: + - 1 + - 12 +operators: + TensorFlow Lite: + - CONV_2D + - DEQUANTIZE + - FULLY_CONNECTED + - QUANTIZE + - RELU + - RESHAPE + - SOFTMAX +paper: https://arxiv.org/abs/1711.07128 diff --git a/models/keyword_spotting/cnn_medium/tflite_int8/definition.yaml b/models/keyword_spotting/cnn_medium/tflite_int8/definition.yaml new file mode 100644 index 0000000..6575536 --- /dev/null +++ b/models/keyword_spotting/cnn_medium/tflite_int8/definition.yaml @@ -0,0 +1,39 @@ +benchmark: + Google Speech Commands test set: + Accuracy: 91.33% +description: 'This is a fully quantized version (asymmetrical int8) of the CNN Medium + model developed by Arm, with training checkpoints, from the Hello Edge paper. Code + to recreate this model can be found here: https://github.com/ARM-software/ML-examples/tree/master/tflu-kws-cortex-m' +license: +- Apache-2.0 +network: + file_size_bytes: 187840 + filename: cnn_m_quantized.tflite + framework: TensorFlow Lite + hash: + algorithm: sha1 + value: 389c6c2c7d289c0018e2dabcc66271811e52874c + provenance: https://github.com/ARM-software/ML-examples/tree/master/tflu-kws-cortex-m +network_parameters: + input_nodes: + - description: The input is a processed MFCCs of shape (1, 490) + name: input + shape: + - 1 + - 490 + output_nodes: + - description: The probability on 12 keywords. 
+ name: Identity + shape: + - 1 + - 12 +operators: + TensorFlow Lite: + - CONV_2D + - DEQUANTIZE + - FULLY_CONNECTED + - QUANTIZE + - RELU + - RESHAPE + - SOFTMAX +paper: https://arxiv.org/abs/1711.07128 diff --git a/models/keyword_spotting/cnn_small/tflite_int8/definition.yaml b/models/keyword_spotting/cnn_small/tflite_int8/definition.yaml new file mode 100644 index 0000000..c5516c2 --- /dev/null +++ b/models/keyword_spotting/cnn_small/tflite_int8/definition.yaml @@ -0,0 +1,39 @@ +benchmark: + Google Speech Commands test set: + Accuracy: 91.41% +description: 'This is a fully quantized version (asymmetrical int8) of the CNN Small + model developed by Arm, with training checkpoints, from the Hello Edge paper. Code + to recreate this model can be found here: https://github.com/ARM-software/ML-examples/tree/master/tflu-kws-cortex-m' +license: +- Apache-2.0 +network: + file_size_bytes: 76752 + filename: cnn_s_quantized.tflite + framework: TensorFlow Lite + hash: + algorithm: sha1 + value: d3c8f4b468545d7012383f2a312bef6245a3b599 + provenance: https://github.com/ARM-software/ML-examples/tree/master/tflu-kws-cortex-m +network_parameters: + input_nodes: + - description: The input is a processed MFCCs of shape (1, 490) + name: input + shape: + - 1 + - 490 + output_nodes: + - description: The probability on 12 keywords. 
+ name: Identity + shape: + - 1 + - 12 +operators: + TensorFlow Lite: + - CONV_2D + - DEQUANTIZE + - FULLY_CONNECTED + - QUANTIZE + - RELU + - RESHAPE + - SOFTMAX +paper: https://arxiv.org/abs/1711.07128 diff --git a/models/keyword_spotting/dnn_large/tflite_int8/definition.yaml b/models/keyword_spotting/dnn_large/tflite_int8/definition.yaml new file mode 100644 index 0000000..b47f31e --- /dev/null +++ b/models/keyword_spotting/dnn_large/tflite_int8/definition.yaml @@ -0,0 +1,37 @@ +benchmark: + Google Speech Commands test set: + Accuracy: 86.28% +description: 'This is a fully quantized version (asymmetrical int8) of the DNN Large + model developed by Arm, with training checkpoints, from the Hello Edge paper. Code + to recreate this model can be found here: https://github.com/ARM-software/ML-examples/tree/master/tflu-kws-cortex-m' +license: +- Apache-2.0 +network: + file_size_bytes: 502928 + filename: dnn_l_quantized.tflite + framework: TensorFlow Lite + hash: + algorithm: sha1 + value: 16e03dda20ae81dfba6a567e6e7563ca67596969 + provenance: https://github.com/ARM-software/ML-examples/tree/master/tflu-kws-cortex-m +network_parameters: + input_nodes: + - description: The input is a processed MFCCs of shape (1, 250) + name: input + shape: + - 1 + - 250 + output_nodes: + - description: The probability on 12 keywords. 
+ name: Identity + shape: + - 1 + - 12 +operators: + TensorFlow Lite: + - DEQUANTIZE + - FULLY_CONNECTED + - QUANTIZE + - RELU + - SOFTMAX +paper: https://arxiv.org/abs/1711.07128 diff --git a/models/keyword_spotting/dnn_medium/tflite_int8/definition.yaml b/models/keyword_spotting/dnn_medium/tflite_int8/definition.yaml new file mode 100644 index 0000000..60013ac --- /dev/null +++ b/models/keyword_spotting/dnn_medium/tflite_int8/definition.yaml @@ -0,0 +1,37 @@ +benchmark: + Google Speech Commands test set: + Accuracy: 84.64% +description: 'This is a fully quantized version (asymmetrical int8) of the DNN Medium + model developed by Arm, with training checkpoints, from the Hello Edge paper. Code + to recreate this model can be found here: https://github.com/ARM-software/ML-examples/tree/master/tflu-kws-cortex-m' +license: +- Apache-2.0 +network: + file_size_bytes: 204480 + filename: dnn_m_quantized.tflite + framework: TensorFlow Lite + hash: + algorithm: sha1 + value: 57ad3cf78f736819b8897f5de51f7e9a4cbd5689 + provenance: https://github.com/ARM-software/ML-examples/tree/master/tflu-kws-cortex-m +network_parameters: + input_nodes: + - description: The input is a processed MFCCs of shape (1, 250) + name: input + shape: + - 1 + - 250 + output_nodes: + - description: The probability on 12 keywords. 
+ name: Identity + shape: + - 1 + - 12 +operators: + TensorFlow Lite: + - DEQUANTIZE + - FULLY_CONNECTED + - QUANTIZE + - RELU + - SOFTMAX +paper: https://arxiv.org/abs/1711.07128 diff --git a/models/keyword_spotting/dnn_small/tflite_int8/definition.yaml b/models/keyword_spotting/dnn_small/tflite_int8/definition.yaml new file mode 100644 index 0000000..64cf0f7 --- /dev/null +++ b/models/keyword_spotting/dnn_small/tflite_int8/definition.yaml @@ -0,0 +1,37 @@ +benchmark: + Google Speech Commands test set: + Accuracy: 82.70% +description: 'This is a fully quantized version (asymmetrical int8) of the DNN Small + model developed by Arm, with training checkpoints, from the Hello Edge paper. Code + to recreate this model can be found here: https://github.com/ARM-software/ML-examples/tree/master/tflu-kws-cortex-m' +license: +- Apache-2.0 +network: + file_size_bytes: 84192 + filename: dnn_s_quantized.tflite + framework: TensorFlow Lite + hash: + algorithm: sha1 + value: 5b00a7eb54eb2650c50026ddef2b3134a71ab6cf + provenance: https://github.com/ARM-software/ML-examples/tree/master/tflu-kws-cortex-m +network_parameters: + input_nodes: + - description: The input is a processed MFCCs of shape (1, 250) + name: input + shape: + - 1 + - 250 + output_nodes: + - description: The probability on 12 keywords. 
+ name: Identity + shape: + - 1 + - 12 +operators: + TensorFlow Lite: + - DEQUANTIZE + - FULLY_CONNECTED + - QUANTIZE + - RELU + - SOFTMAX +paper: https://arxiv.org/abs/1711.07128 diff --git a/models/keyword_spotting/ds_cnn_large/tflite_int8/definition.yaml b/models/keyword_spotting/ds_cnn_large/tflite_int8/definition.yaml new file mode 100644 index 0000000..47bc96d --- /dev/null +++ b/models/keyword_spotting/ds_cnn_large/tflite_int8/definition.yaml @@ -0,0 +1,41 @@ +benchmark: + Google Speech Commands test set: + Accuracy: 94.58% +description: 'This is a fully quantized version (asymmetrical int8) of the DS-CNN + Large model developed by Arm, with training checkpoints, from the Hello Edge paper. + Code to recreate this model can be found here: https://github.com/ARM-software/ML-examples/tree/master/tflu-kws-cortex-m' +license: +- Apache-2.0 +network: + file_size_bytes: 530688 + filename: ds_cnn_l_quantized.tflite + framework: TensorFlow Lite + hash: + algorithm: sha1 + value: abaa9d4bf8797801276c00151ee14426aa1b2dcc + provenance: https://github.com/ARM-software/ML-examples/tree/master/tflu-kws-cortex-m +network_parameters: + input_nodes: + - description: The input is a processed MFCCs of shape (1, 490) + name: input + shape: + - 1 + - 490 + output_nodes: + - description: The probability on 12 keywords. 
+ name: Identity + shape: + - 1 + - 12 +operators: + TensorFlow Lite: + - AVERAGE_POOL_2D + - CONV_2D + - DEPTHWISE_CONV_2D + - DEQUANTIZE + - FULLY_CONNECTED + - QUANTIZE + - RELU + - RESHAPE + - SOFTMAX +paper: https://arxiv.org/abs/1711.07128 diff --git a/models/keyword_spotting/ds_cnn_medium/tflite_int8/definition.yaml b/models/keyword_spotting/ds_cnn_medium/tflite_int8/definition.yaml new file mode 100644 index 0000000..60b973d --- /dev/null +++ b/models/keyword_spotting/ds_cnn_medium/tflite_int8/definition.yaml @@ -0,0 +1,41 @@ +benchmark: + Google Speech Commands test set: + Accuracy: 93.35% +description: 'This is a fully quantized version (asymmetrical int8) of the DS-CNN + Medium model developed by Arm, with training checkpoints, from the Hello Edge paper. + Code to recreate this model can be found here: https://github.com/ARM-software/ML-examples/tree/master/tflu-kws-cortex-m' +license: +- Apache-2.0 +network: + file_size_bytes: 200928 + filename: ds_cnn_m_quantized.tflite + framework: TensorFlow Lite + hash: + algorithm: sha1 + value: c6923b02806224775b58ab9bc11e03e021ff407e + provenance: https://github.com/ARM-software/ML-examples/tree/master/tflu-kws-cortex-m +network_parameters: + input_nodes: + - description: The input is a processed MFCCs of shape (1, 490) + name: input + shape: + - 1 + - 490 + output_nodes: + - description: The probability on 12 keywords. 
+ name: Identity + shape: + - 1 + - 12 +operators: + TensorFlow Lite: + - AVERAGE_POOL_2D + - CONV_2D + - DEPTHWISE_CONV_2D + - DEQUANTIZE + - FULLY_CONNECTED + - QUANTIZE + - RELU + - RESHAPE + - SOFTMAX +paper: https://arxiv.org/abs/1711.07128 diff --git a/models/keyword_spotting/ds_cnn_small/tflite_int8/definition.yaml b/models/keyword_spotting/ds_cnn_small/tflite_int8/definition.yaml new file mode 100644 index 0000000..bad2ee9 --- /dev/null +++ b/models/keyword_spotting/ds_cnn_small/tflite_int8/definition.yaml @@ -0,0 +1,41 @@ +benchmark: + Google Speech Commands test set: + Accuracy: 93.35% +description: 'This is a fully quantized version (asymmetrical int8) of the DS-CNN + Small model developed by Arm, with training checkpoints, from the Hello Edge paper. + Code to recreate this model can be found here: https://github.com/ARM-software/ML-examples/tree/master/tflu-kws-cortex-m' +license: +- Apache-2.0 +network: + file_size_bytes: 54464 + filename: ds_cnn_s_quantized.tflite + framework: TensorFlow Lite + hash: + algorithm: sha1 + value: 9083414c5f3d850bae6599a038f711dd1f21c9c7 + provenance: https://github.com/ARM-software/ML-examples/tree/master/tflu-kws-cortex-m +network_parameters: + input_nodes: + - description: The input is a processed MFCCs of shape (1, 490) + name: input + shape: + - 1 + - 490 + output_nodes: + - description: The probability on 12 keywords. 
+ name: Identity + shape: + - 1 + - 12 +operators: + TensorFlow Lite: + - AVERAGE_POOL_2D + - CONV_2D + - DEPTHWISE_CONV_2D + - DEQUANTIZE + - FULLY_CONNECTED + - QUANTIZE + - RELU + - RESHAPE + - SOFTMAX +paper: https://arxiv.org/abs/1711.07128 diff --git a/models/object_detection/ssd_mobilenet_v1/tflite_fp32/definition.yaml b/models/object_detection/ssd_mobilenet_v1/tflite_fp32/definition.yaml new file mode 100644 index 0000000..b5a3d50 --- /dev/null +++ b/models/object_detection/ssd_mobilenet_v1/tflite_fp32/definition.yaml @@ -0,0 +1,51 @@ +benchmark: + coco_validation_2017: + mAP: 0.21 +description: SSD MobileNet v1 is an object detection network that localizes and identifies + objects in an input image. This is a TF Lite floating point version that takes a + 300x300 input image and outputs detections for this image. This model is trained + by Google. +keywords: Object detection +license: +- Apache-2.0 +network: + file_size_bytes: 27286108 + filename: ssd_mobilenet_v1.tflite + framework: TensorFlow Lite + hash: + algorithm: sha1 + value: 5bd511fc17ec7bfe9cd0f51bdec1537b874f52d2 + provenance: http://download.tensorflow.org/models/object_detection/ssd_mobilenet_v1_coco_2018_01_28.tar.gz +network_parameters: + input_nodes: + - description: A float input image. + name: normalized_input_image_tensor + shape: + - 1 + - 300 + - 300 + - 3 + output_nodes: + - description: An array of num_detection box boundaries for each input in the format + (y1, x1, y2, x2) scaled from 0 to 1. + name: TFLite_Detection_PostProcess + shape: [] + - description: COCO detection classes for each object. 0=person, 10=fire hydrant. + name: TFLite_Detection_PostProcess:1 + shape: [] + - description: Detection scores for each object. + name: TFLite_Detection_PostProcess:2 + shape: [] + - description: The number of objects detected in each image. 
+ name: TFLite_Detection_PostProcess:3 + shape: [] +operators: + TensorFlow Lite: + - CONCATENATION + - CONV_2D + - CUSTOM + - DEPTHWISE_CONV_2D + - LOGISTIC + - RELU6 + - RESHAPE +paper: https://arxiv.org/abs/1512.02325 diff --git a/models/object_detection/ssd_mobilenet_v1/tflite_uint8/definition.yaml b/models/object_detection/ssd_mobilenet_v1/tflite_uint8/definition.yaml new file mode 100644 index 0000000..25798c1 --- /dev/null +++ b/models/object_detection/ssd_mobilenet_v1/tflite_uint8/definition.yaml @@ -0,0 +1,50 @@ +benchmark: + coco_validation_2017: + mAP: 0.18 +description: SSD MobileNet v1 is an object detection network that localizes and identifies + objects in an input image. This is a TF Lite quantized version that takes a 300x300 + input image and outputs detections for this image. This model is trained and quantized + by Google. +keywords: Object detection +license: +- Apache-2.0 +network: + file_size_bytes: 6898880 + filename: ssd_mobilenet_v1.tflite + framework: TensorFlow Lite + hash: + algorithm: sha1 + value: 1f9c945db9e32c33e5b91539f756a8fbef636405 + provenance: http://download.tensorflow.org/models/object_detection/ssd_mobilenet_v1_quantized_300x300_coco14_sync_2018_07_18.tar.gz +network_parameters: + input_nodes: + - description: Input RGB images (a range of 0-255 per RGB channel). 
+ name: image_tensor + shape: + - 1 + - 300 + - 300 + - 3 + output_nodes: + - description: The y1, x1, y2, x2 coordinates of the bounding boxes for each detection + name: TFLite_Detection_PostProcess + shape: [] + - description: The class of each detection + name: TFLite_Detection_PostProcess:1 + shape: [] + - description: The probability score for each classification + name: TFLite_Detection_PostProcess:2 + shape: [] + - description: A vector containing a number corresponding to the number of detections + name: TFLite_Detection_PostProcess:3 + shape: [] +operators: + TensorFlow Lite: + - CONCATENATION + - CONV_2D + - CUSTOM + - DEPTHWISE_CONV_2D + - LOGISTIC + - RELU6 + - RESHAPE +paper: https://arxiv.org/abs/1512.02325 diff --git a/models/object_detection/yolo_v3_tiny/tflite_fp32/definition.yaml b/models/object_detection/yolo_v3_tiny/tflite_fp32/definition.yaml new file mode 100644 index 0000000..24f5ee2 --- /dev/null +++ b/models/object_detection/yolo_v3_tiny/tflite_fp32/definition.yaml @@ -0,0 +1,50 @@ +benchmark: + MS COCO Validation: + mAP: 0.331 +description: Yolo v3 Tiny is an object detection network that localizes and identifies + objects in an input image. This is a floating point version that takes a 416x416 + input image and outputs detections for this image. This model is generated using + the weights from the [YOLO website](https://pjreddie.com/darknet/yolo/). +license: +- Apache-2.0 +network: + file_size_bytes: 35455980 + filename: yolo_v3_tiny_darknet_fp32.tflite + framework: TensorFlow Lite + hash: + algorithm: sha1 + value: b38f7be6856eed4466493bdc86be1879f4b743fb + provenance: https://pjreddie.com/media/files/yolov3-tiny.weights & https://github.com/mystic123/tensorflow-yolo-v3 +network_parameters: + input_nodes: + - description: A 416x416 floating point input image. 
+ name: inputs + shape: + - 1 + - 416 + - 416 + - 3 + output_nodes: + - description: A 1xNx85 map of predictions, where the first 4 entries of the 3rd + dimension are the bounding box coordinates and the 5th is the confidence. The + remaining entries are softmax scores for each class. + name: output_boxes + shape: + - 1 + - 2535 + - 85 +operators: + TensorFlow Lite: + - ADD + - CONCATENATION + - CONV_2D + - EXP + - LOGISTIC + - MAXIMUM + - MAX_POOL_2D + - MUL + - RESHAPE + - RESIZE_NEAREST_NEIGHBOR + - SPLIT_V + - SUB +paper: https://arxiv.org/abs/1804.02767
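
Each `definition.yaml` added here records a `file_size_bytes` and a SHA-1 `hash` for its model file. The following is a minimal sketch of how a reviewer might check a downloaded `.tflite` file against those two fields. The helper name `verify_model_file` and the `YOLO_V3_TINY_FP32` dictionary are hypothetical and not part of this change; the values in the dictionary are copied from the yolo_v3_tiny `definition.yaml` above, and a plain dictionary stands in for parsing the YAML.

```python
import hashlib
from pathlib import Path


def verify_model_file(path, expected_sha1, expected_size):
    """Return True if the file at `path` has the expected size and SHA-1 digest.

    Hypothetical helper for illustration; it is not part of the repository.
    """
    data = Path(path).read_bytes()
    # Cheap check first: a size mismatch means the file cannot be the right one.
    if len(data) != expected_size:
        return False
    # Compare hex digests case-insensitively.
    return hashlib.sha1(data).hexdigest() == expected_sha1.lower()


# Values copied from the yolo_v3_tiny tflite_fp32 definition.yaml in this diff.
YOLO_V3_TINY_FP32 = {
    "filename": "yolo_v3_tiny_darknet_fp32.tflite",
    "file_size_bytes": 35455980,
    "sha1": "b38f7be6856eed4466493bdc86be1879f4b743fb",
}

if __name__ == "__main__":
    model = Path(YOLO_V3_TINY_FP32["filename"])
    # Only attempt the check when the model file is present locally.
    if model.exists():
        ok = verify_model_file(
            model,
            YOLO_V3_TINY_FP32["sha1"],
            YOLO_V3_TINY_FP32["file_size_bytes"],
        )
        print("match" if ok else "mismatch")
```

The same check should apply to any of the other definitions by swapping in their `filename`, `file_size_bytes`, and `hash.value` entries.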