---
language: en
license: apache-2.0
model_name: yolov3-12.onnx
tags:
- validated
- vision
- object_detection_segmentation
- yolov3
---
<!--- SPDX-License-Identifier: MIT -->

# YOLOv3

## Description
This model is a neural network for real-time object detection that detects 80 different classes. It is very fast and accurate.

## Model

|Model |Download |Download (with sample test data)|ONNX version|Opset version|Accuracy |
|-------------|:--------------|:--------------|:--------------|:--------------|:--------------|
|YOLOv3 |[237 MB](model/yolov3-10.onnx) |[222 MB](model/yolov3-10.tar.gz)|1.5 |10 |mAP of 0.553 |
|YOLOv3-12 |[237 MB](model/yolov3-12.onnx) |[222 MB](model/yolov3-12.tar.gz)|1.9 |12 |mAP of 0.2874 |
|YOLOv3-12-int8 |[60 MB](model/yolov3-12-int8.onnx) |[53 MB](model/yolov3-12-int8.tar.gz)|1.9 |12 |mAP of 0.2693 |

> Compared with YOLOv3-12, YOLOv3-12-int8's mAP declines by 0.0181 while its performance improves by 2.19x.
>
> Note that performance depends on the test hardware.
>
> Performance data here was collected on an Intel® Xeon® Platinum 8280 Processor (1 socket, 4 cores per instance) running CentOS Linux 8.3, with a batch size of 1.

<hr>

## Inference

### Input to model
Resized image `(1x3x416x416)`
Original image size `(1x2)`, which is `[image.size[1], image.size[0]]`
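
If you are unsure which input names, shapes, and dtypes your copy of the model declares, they can be listed directly. A minimal sketch (pass whichever model file you downloaded):

```python
import onnxruntime as ort

# list the model's declared inputs to confirm names, shapes, and dtypes
session = ort.InferenceSession('yolov3-12.onnx')
for inp in session.get_inputs():
    print(inp.name, inp.shape, inp.type)
```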

### Preprocessing steps
The images have to be loaded into a range of [0, 1]. The transformation should preferably happen during preprocessing.

The following code shows how to preprocess an image into an NCHW tensor:

```python
import numpy as np
from PIL import Image

# this function is from yolo3.utils.letterbox_image
def letterbox_image(image, size):
    '''Resize image with unchanged aspect ratio using padding.'''
    iw, ih = image.size
    w, h = size
    scale = min(w/iw, h/ih)
    nw = int(iw*scale)
    nh = int(ih*scale)

    image = image.resize((nw, nh), Image.BICUBIC)
    # pad the letterboxed image to the target size with gray (128, 128, 128)
    new_image = Image.new('RGB', size, (128, 128, 128))
    new_image.paste(image, ((w-nw)//2, (h-nh)//2))
    return new_image

def preprocess(img):
    model_image_size = (416, 416)
    boxed_image = letterbox_image(img, tuple(reversed(model_image_size)))
    image_data = np.array(boxed_image, dtype='float32')
    image_data /= 255.                                # scale pixel values to [0, 1]
    image_data = np.transpose(image_data, [2, 0, 1])  # HWC -> CHW
    image_data = np.expand_dims(image_data, 0)        # add batch dimension -> NCHW
    return image_data

image = Image.open(img_path)
# input
image_data = preprocess(image)
image_size = np.array([image.size[1], image.size[0]], dtype=np.int32).reshape(1, 2)
```
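
With `image_data` and `image_size` prepared, the model can be run with onnxruntime. Below is a minimal sketch; the input names `input_1` and `image_shape` are assumptions, so verify them (and the expected dtype of the image-size input, which some exports declare as float32) with `session.get_inputs()` as shown above:

```python
import onnxruntime as ort

session = ort.InferenceSession('yolov3-12.onnx')
# input names are assumptions; confirm them with session.get_inputs()
boxes, scores, indices = session.run(None, {
    'input_1': image_data,      # (1, 3, 416, 416) float32, from preprocess()
    'image_shape': image_size,  # (1, 2) original [height, width]
})
```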

### Output of model
The model has 3 outputs:

- boxes: `(1x'n_candidates'x4)`, the coordinates of all anchor boxes
- scores: `(1x80x'n_candidates')`, the scores of all anchor boxes per class
- indices: `('nbox'x3)`, selected indices from the boxes tensor. Each selected index is a `(batch_index, class_index, box_index)` triple. The class list is [here](https://github.com/qqwweee/keras-yolo3/blob/master/model_data/coco_classes.txt)

### Postprocessing steps
Post-processing and the meaning of the outputs:

```python
# gather the detections selected by the model via the indices output
out_boxes, out_scores, out_classes = [], [], []
for idx_ in indices:
    out_classes.append(idx_[1])             # class_index
    out_scores.append(scores[tuple(idx_)])  # score at (batch, class, box)
    idx_1 = (idx_[0], idx_[2])
    out_boxes.append(boxes[idx_1])          # box coordinates at (batch, box)
```

`out_boxes`, `out_scores`, and `out_classes` are lists of the resulting boxes, scores, and classes.
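
To turn the numeric class indices into readable labels, one option is the linked class list. A minimal sketch, assuming `coco_classes.txt` (one class name per line) has been downloaded locally; the box coordinate order in the comment follows the keras-yolo3 convention and should be verified:

```python
# map class indices to names; assumes coco_classes.txt was downloaded locally
with open('coco_classes.txt') as f:
    class_names = [line.strip() for line in f]

for box, score, cls in zip(out_boxes, out_scores, out_classes):
    top, left, bottom, right = box  # keras-yolo3 box order (assumption)
    print(f'{class_names[cls]}: {score:.2f} at ({left:.0f}, {top:.0f})-({right:.0f}, {bottom:.0f})')
```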
<hr>

## Dataset (Train and validation)
We use pretrained weights from pjreddie.com, available [here](https://pjreddie.com/media/files/yolov3.weights).
<hr>

## Validation accuracy
YOLOv3:
The metric is COCO box mAP (averaged over IoU of 0.5:0.95), computed over 2017 COCO val data.
mAP of 0.553, based on the original YOLOv3 model [here](https://pjreddie.com/darknet/yolo/).

YOLOv3-12 & YOLOv3-12-int8:
The metric is COCO box mAP@[IoU=0.50:0.95 | area=all | maxDets=100], computed over 2017 COCO val data.
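
For reference, this metric is what pycocotools reports as its first summary line. A hedged sketch of the evaluation, where `detections.json` is a placeholder for this model's results in COCO format:

```python
from pycocotools.coco import COCO
from pycocotools.cocoeval import COCOeval

coco_gt = COCO('annotations/instances_val2017.json')  # 2017 COCO val annotations
coco_dt = coco_gt.loadRes('detections.json')          # placeholder: detections in COCO format
coco_eval = COCOeval(coco_gt, coco_dt, iouType='bbox')
coco_eval.evaluate()
coco_eval.accumulate()
coco_eval.summarize()  # first line: AP @[ IoU=0.50:0.95 | area=all | maxDets=100 ]
```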
<hr>

## Quantization
YOLOv3-12-int8 is obtained by quantizing the YOLOv3-12 model. We use [Intel® Neural Compressor](https://github.com/intel/neural-compressor) with the onnxruntime backend to perform quantization. View the [instructions](https://github.com/intel/neural-compressor/blob/master/examples/onnxrt/object_detection/onnx_model_zoo/yolov3/quantization/ptq/README.md) to understand how to use Intel® Neural Compressor for quantization.

### Environment
onnx: 1.9.0
onnxruntime: 1.10.0
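
To reproduce this environment, pinning to the versions above should suffice (a minimal sketch):

```shell
pip install onnx==1.9.0 onnxruntime==1.10.0
```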

### Prepare model
```shell
wget https://github.com/onnx/models/raw/main/vision/object_detection_segmentation/yolov3/model/yolov3-12.onnx
```

### Model quantize
```bash
# --input_model: model path as *.onnx
bash run_tuning.sh --input_model=path/to/model \
                   --config=yolov3.yaml \
                   --data_path=path/to/COCO2017 \
                   --output_model=path/to/save
```
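
After quantization, a quick sanity check is to load both models and compare their box outputs on the same preprocessed input. A minimal sketch; the file and input names are assumptions to verify against your models:

```python
import numpy as np
import onnxruntime as ort

# file and input names below are assumptions; verify with session.get_inputs()
fp32 = ort.InferenceSession('yolov3-12.onnx')
int8 = ort.InferenceSession('yolov3-12-int8.onnx')
feeds = {'input_1': image_data, 'image_shape': image_size}
boxes_fp32 = fp32.run(None, feeds)[0]
boxes_int8 = int8.run(None, feeds)[0]
print('max abs box difference:', np.max(np.abs(boxes_fp32 - boxes_int8)))
```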
<hr>

## Publication/Attribution
Joseph Redmon, Ali Farhadi. YOLOv3: An Incremental Improvement, [paper](https://pjreddie.com/media/files/papers/YOLOv3.pdf)

<hr>

## References
* This model is converted from the Keras model in [this repository](https://github.com/qqwweee/keras-yolo3) using the keras2onnx converter ([repository](https://github.com/onnx/keras-onnx)).
* [Intel® Neural Compressor](https://github.com/intel/neural-compressor)
<hr>

## Contributors
* [mengniwang95](https://github.com/mengniwang95) (Intel)
* [airMeng](https://github.com/airMeng) (Intel)
* [ftian1](https://github.com/ftian1) (Intel)
* [hshen14](https://github.com/hshen14) (Intel)
<hr>

## License
MIT License
<hr>