---
language: en
license: apache-2.0
model_name: yolov3-12.onnx
tags:
- validated
- vision
- object_detection_segmentation
- yolov3
---
<!--- SPDX-License-Identifier: MIT -->

# YOLOv3

## Description
This model is a neural network for real-time object detection that detects 80 different classes. It is very fast and accurate.

## Model

|Model |Download |Download (with sample test data)|ONNX version|Opset version|Accuracy |
|-------------|:--------------|:--------------|:--------------|:--------------|:--------------|
|YOLOv3 |[237 MB](model/yolov3-10.onnx) |[222 MB](model/yolov3-10.tar.gz)|1.5 |10 |mAP of 0.553 |
|YOLOv3-12 |[237 MB](model/yolov3-12.onnx) |[222 MB](model/yolov3-12.tar.gz)|1.9 |12 |mAP of 0.2874 |
|YOLOv3-12-int8 |[60 MB](model/yolov3-12-int8.onnx) |[53 MB](model/yolov3-12-int8.tar.gz)|1.9 |12 |mAP of 0.2693 |

> Compared with YOLOv3-12, YOLOv3-12-int8's mAP declines by 0.0181 while its performance improves by 2.19x.
>
> Note that performance depends on the test hardware.
>
> Performance data here was collected on an Intel® Xeon® Platinum 8280 Processor (1 socket, 4 cores per instance) running CentOS Linux 8.3, with a batch size of 1.

<hr>

## Inference

### Input to model
Resized image `(1x3x416x416)`
Original image size `(1x2)`, which is `[image.size[1], image.size[0]]`
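
If you are unsure which input names, shapes, and dtypes your copy of the model declares, they can be listed directly. A minimal sketch (pass whichever model file you downloaded):

```python
import onnxruntime as ort

# list the model's declared inputs to confirm names, shapes, and dtypes
session = ort.InferenceSession('yolov3-12.onnx')
for inp in session.get_inputs():
    print(inp.name, inp.shape, inp.type)
```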

### Preprocessing steps
The images have to be loaded into a range of [0, 1]. The transformation should preferably happen during preprocessing.

The following code shows how to preprocess an image into an NCHW tensor:

```python
import numpy as np
from PIL import Image

# this function is from yolo3.utils.letterbox_image
def letterbox_image(image, size):
    '''Resize image with unchanged aspect ratio using padding.'''
    iw, ih = image.size
    w, h = size
    scale = min(w/iw, h/ih)
    nw = int(iw*scale)
    nh = int(ih*scale)

    image = image.resize((nw, nh), Image.BICUBIC)
    # pad the letterboxed image to the target size with gray (128, 128, 128)
    new_image = Image.new('RGB', size, (128, 128, 128))
    new_image.paste(image, ((w-nw)//2, (h-nh)//2))
    return new_image

def preprocess(img):
    model_image_size = (416, 416)
    boxed_image = letterbox_image(img, tuple(reversed(model_image_size)))
    image_data = np.array(boxed_image, dtype='float32')
    image_data /= 255.                                # scale pixel values to [0, 1]
    image_data = np.transpose(image_data, [2, 0, 1])  # HWC -> CHW
    image_data = np.expand_dims(image_data, 0)        # add batch dimension -> NCHW
    return image_data

image = Image.open(img_path)
# input
image_data = preprocess(image)
image_size = np.array([image.size[1], image.size[0]], dtype=np.int32).reshape(1, 2)
```
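
With `image_data` and `image_size` prepared, the model can be run with onnxruntime. Below is a minimal sketch; the input names `input_1` and `image_shape` are assumptions, so verify them (and the expected dtype of the image-size input, which some exports declare as float32) with `session.get_inputs()` as shown above:

```python
import onnxruntime as ort

session = ort.InferenceSession('yolov3-12.onnx')
# input names are assumptions; confirm them with session.get_inputs()
boxes, scores, indices = session.run(None, {
    'input_1': image_data,      # (1, 3, 416, 416) float32, from preprocess()
    'image_shape': image_size,  # (1, 2) original [height, width]
})
```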

### Output of model
The model has 3 outputs:

- boxes: `(1x'n_candidates'x4)`, the coordinates of all anchor boxes
- scores: `(1x80x'n_candidates')`, the scores of all anchor boxes per class
- indices: `('nbox'x3)`, selected indices from the boxes tensor. Each selected index is a `(batch_index, class_index, box_index)` triple. The class list is [here](https://github.com/qqwweee/keras-yolo3/blob/master/model_data/coco_classes.txt)

### Postprocessing steps
Post-processing and the meaning of the outputs:

```python
# gather the detections selected by the model via the indices output
out_boxes, out_scores, out_classes = [], [], []
for idx_ in indices:
    out_classes.append(idx_[1])             # class_index
    out_scores.append(scores[tuple(idx_)])  # score at (batch, class, box)
    idx_1 = (idx_[0], idx_[2])
    out_boxes.append(boxes[idx_1])          # box coordinates at (batch, box)
```

`out_boxes`, `out_scores`, and `out_classes` are lists of the resulting boxes, scores, and classes.
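
To turn the numeric class indices into readable labels, one option is the linked class list. A minimal sketch, assuming `coco_classes.txt` (one class name per line) has been downloaded locally; the box coordinate order in the comment follows the keras-yolo3 convention and should be verified:

```python
# map class indices to names; assumes coco_classes.txt was downloaded locally
with open('coco_classes.txt') as f:
    class_names = [line.strip() for line in f]

for box, score, cls in zip(out_boxes, out_scores, out_classes):
    top, left, bottom, right = box  # keras-yolo3 box order (assumption)
    print(f'{class_names[cls]}: {score:.2f} at ({left:.0f}, {top:.0f})-({right:.0f}, {bottom:.0f})')
```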
<hr>

## Dataset (Train and validation)
We use pretrained weights from pjreddie.com, available [here](https://pjreddie.com/media/files/yolov3.weights).
<hr>

## Validation accuracy
YOLOv3:
The metric is COCO box mAP (averaged over IoU of 0.5:0.95), computed over 2017 COCO val data.
mAP of 0.553, based on the original YOLOv3 model [here](https://pjreddie.com/darknet/yolo/).

YOLOv3-12 & YOLOv3-12-int8:
The metric is COCO box mAP@[IoU=0.50:0.95 | area=all | maxDets=100], computed over 2017 COCO val data.
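
For reference, this metric is what pycocotools reports as its first summary line. A hedged sketch of the evaluation, where `detections.json` is a placeholder for this model's results in COCO format:

```python
from pycocotools.coco import COCO
from pycocotools.cocoeval import COCOeval

coco_gt = COCO('annotations/instances_val2017.json')  # 2017 COCO val annotations
coco_dt = coco_gt.loadRes('detections.json')          # placeholder: detections in COCO format
coco_eval = COCOeval(coco_gt, coco_dt, iouType='bbox')
coco_eval.evaluate()
coco_eval.accumulate()
coco_eval.summarize()  # first line: AP @[ IoU=0.50:0.95 | area=all | maxDets=100 ]
```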
<hr>

## Quantization
YOLOv3-12-int8 is obtained by quantizing the YOLOv3-12 model. We use [Intel® Neural Compressor](https://github.com/intel/neural-compressor) with the onnxruntime backend to perform quantization. View the [instructions](https://github.com/intel/neural-compressor/blob/master/examples/onnxrt/object_detection/onnx_model_zoo/yolov3/quantization/ptq/README.md) to understand how to use Intel® Neural Compressor for quantization.

### Environment
onnx: 1.9.0
onnxruntime: 1.10.0
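
To reproduce this environment, pinning to the versions above should suffice (a minimal sketch):

```shell
pip install onnx==1.9.0 onnxruntime==1.10.0
```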

### Prepare model
```shell
wget https://github.com/onnx/models/raw/main/vision/object_detection_segmentation/yolov3/model/yolov3-12.onnx
```

### Model quantize
```bash
# --input_model: model path as *.onnx
bash run_tuning.sh --input_model=path/to/model \
                   --config=yolov3.yaml \
                   --data_path=path/to/COCO2017 \
                   --output_model=path/to/save
```
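
After quantization, a quick sanity check is to load both models and compare their box outputs on the same preprocessed input. A minimal sketch; the file and input names are assumptions to verify against your models:

```python
import numpy as np
import onnxruntime as ort

# file and input names below are assumptions; verify with session.get_inputs()
fp32 = ort.InferenceSession('yolov3-12.onnx')
int8 = ort.InferenceSession('yolov3-12-int8.onnx')
feeds = {'input_1': image_data, 'image_shape': image_size}
boxes_fp32 = fp32.run(None, feeds)[0]
boxes_int8 = int8.run(None, feeds)[0]
print('max abs box difference:', np.max(np.abs(boxes_fp32 - boxes_int8)))
```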
<hr>

## Publication/Attribution
Joseph Redmon, Ali Farhadi. YOLOv3: An Incremental Improvement, [paper](https://pjreddie.com/media/files/papers/YOLOv3.pdf)

<hr>

## References
* This model is converted from the Keras model in [this repository](https://github.com/qqwweee/keras-yolo3) using the keras2onnx converter ([repository](https://github.com/onnx/keras-onnx)).
* [Intel® Neural Compressor](https://github.com/intel/neural-compressor)
<hr>

## Contributors
* [mengniwang95](https://github.com/mengniwang95) (Intel)
* [airMeng](https://github.com/airMeng) (Intel)
* [ftian1](https://github.com/ftian1) (Intel)
* [hshen14](https://github.com/hshen14) (Intel)
<hr>

## License
MIT License
<hr>