---
language: en
license: apache-2.0
model_name: MaskRCNN-10.onnx
tags:
- validated
- vision
- object_detection_segmentation
- mask-rcnn
---
<!--- SPDX-License-Identifier: MIT -->

# Mask R-CNN

## Description
This model is a real-time neural network for object instance segmentation that detects 80 different [classes](dependencies/coco_classes.txt).

## Model

|Model |Download |Download (with sample test data) |ONNX version |Opset version |Accuracy |
|-------------|:--------------|:--------------|:--------------|:--------------|:--------------|
|Mask R-CNN R-50-FPN |[177.9 MB](model/MaskRCNN-10.onnx) |[168.8 MB](model/MaskRCNN-10.tar.gz) |1.5 |10 |box mAP of 0.36, mask mAP of 0.33 |
|Mask R-CNN R-50-FPN-fp32 |[169.7 MB](model/MaskRCNN-12.onnx) |[157.3 MB](model/MaskRCNN-12.tar.gz) |1.9 |12 |mAP of 0.3372 |
|Mask R-CNN R-50-FPN-int8 |[44 MB](model/MaskRCNN-12-int8.onnx) |[38 MB](model/MaskRCNN-12-int8.tar.gz) |1.9 |12 |mAP of 0.3314 |
|Mask R-CNN R-50-FPN-qdq |[44 MB](model/MaskRCNN-12-qdq.onnx) |[30 MB](model/MaskRCNN-12-qdq.tar.gz) |1.9 |12 |mAP of 0.3328 |

> Compared with Mask R-CNN R-50-FPN-fp32, Mask R-CNN R-50-FPN-int8 shows an mAP decline of 0.0058 and a performance improvement of 1.99x.
>
> Note that performance depends on the test hardware.
>
> Performance data here was collected with an Intel® Xeon® Platinum 8280 Processor (1 socket, 4 cores per instance) on CentOS Linux 8.3 with a data batch size of 1.

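The quantization trade-off quoted above can be double-checked with a quick calculation, using the mAP values from the table (the 1.99x speedup figure comes from the note and depends on hardware):

```python
# mAP values from the model table above
fp32_map = 0.3372  # Mask R-CNN R-50-FPN-fp32
int8_map = 0.3314  # Mask R-CNN R-50-FPN-int8

# Absolute mAP decline from quantization
map_decline = round(fp32_map - int8_map, 4)
# Relative accuracy loss as a percentage of the fp32 baseline
relative_loss_pct = round(100 * map_decline / fp32_map, 2)

print(map_decline)        # 0.0058
print(relative_loss_pct)  # 1.72
```

So the int8 model trades roughly 1.7% relative accuracy for a ~2x throughput gain on the hardware above.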

<hr>

## Inference

### Input to model
Image of shape `(3 x height x width)`.

### Preprocessing steps
Images must be loaded into the range [0, 255], resized, converted to BGR, and then normalized using mean = [102.9801, 115.9465, 122.7717]. This transformation should preferably happen during preprocessing.

This model can take images of different sizes as input. However, for best performance it is recommended to resize the image so that both height and width fall within the range [800, 1333], and then zero-pad it so that both height and width are divisible by 32.
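The sizing recommendation above can be sketched as a small helper. Note this is a hypothetical helper, not part of the model package: it clamps the scale so both sides land within [800, 1333], then rounds the padded shape up to the next multiple of 32:

```python
import math

def target_size(height, width, min_side=800, max_side=1333, stride=32):
    """Pick a resize ratio so the short side is ~min_side while the long
    side stays within max_side, then pad up to a multiple of stride."""
    ratio = min_side / min(height, width)
    # Shrink the ratio if the long side would exceed max_side
    if round(ratio * max(height, width)) > max_side:
        ratio = max_side / max(height, width)
    new_h, new_w = round(ratio * height), round(ratio * width)
    # Round each side up to the next multiple of stride
    padded_h = int(math.ceil(new_h / stride) * stride)
    padded_w = int(math.ceil(new_w / stride) * stride)
    return (new_h, new_w), (padded_h, padded_w)

print(target_size(480, 640))  # ((800, 1067), (800, 1088))
```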

The following code shows how to preprocess the [demo image](dependencies/demo.jpg):

```python
import math

import numpy as np
from PIL import Image

def preprocess(image):
    # Resize so that the smaller side is 800 pixels
    ratio = 800.0 / min(image.size[0], image.size[1])
    image = image.resize((int(ratio * image.size[0]), int(ratio * image.size[1])), Image.BILINEAR)

    # Convert RGB -> BGR
    image = np.array(image)[:, :, [2, 1, 0]].astype('float32')

    # HWC -> CHW
    image = np.transpose(image, [2, 0, 1])

    # Subtract the per-channel mean
    mean_vec = np.array([102.9801, 115.9465, 122.7717])
    for i in range(image.shape[0]):
        image[i, :, :] = image[i, :, :] - mean_vec[i]

    # Zero-pad so that height and width are divisible by 32
    padded_h = int(math.ceil(image.shape[1] / 32) * 32)
    padded_w = int(math.ceil(image.shape[2] / 32) * 32)

    padded_image = np.zeros((3, padded_h, padded_w), dtype=np.float32)
    padded_image[:, :image.shape[1], :image.shape[2]] = image
    return padded_image

img = Image.open('dependencies/demo.jpg')
img_data = preprocess(img)
```
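With the preprocessed tensor in hand, inference is a single ONNX Runtime call. The sketch below is an illustrative helper, not part of the original instructions: it assumes `onnxruntime` is installed and `MaskRCNN-10.onnx` has been downloaded locally, and it looks up the input name from the session rather than hard-coding it:

```python
import numpy as np

def run_maskrcnn(model_path, img_data):
    # Deferred import so the helper can be defined without onnxruntime installed
    import onnxruntime as ort

    sess = ort.InferenceSession(model_path, providers=['CPUExecutionProvider'])
    input_name = sess.get_inputs()[0].name  # avoid hard-coding the input name
    # The model returns its four outputs in declaration order
    boxes, labels, scores, masks = sess.run(None, {input_name: img_data.astype(np.float32)})
    return boxes, labels, scores, masks

# Usage (assumes the model file and the preprocessed img_data from above):
# boxes, labels, scores, masks = run_maskrcnn('MaskRCNN-10.onnx', img_data)
```

Note that this model takes a single `(3, H, W)` tensor without a batch dimension.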

### Output of model
The model has four outputs:

* boxes: `('nbox' x 4)`, in `(xmin, ymin, xmax, ymax)` format.
* labels: `('nbox')`.
* scores: `('nbox')`.
* masks: `('nbox', 1, 28, 28)`.

### Postprocessing steps

The following code shows how to draw the detections, class annotations, and segmentation contours (filtered by score) on the original image:

```python
import cv2
import matplotlib.pyplot as plt
import matplotlib.patches as patches
import numpy as np

classes = [line.rstrip('\n') for line in open('coco_classes.txt')]

def display_objdetect_image(image, boxes, labels, scores, masks, score_threshold=0.7):
    # Undo the resize applied during preprocessing
    ratio = 800.0 / min(image.size[0], image.size[1])
    boxes /= ratio

    _, ax = plt.subplots(1, figsize=(12, 9))
    image = np.array(image)

    for mask, box, label, score in zip(masks, boxes, labels, scores):
        # Keep only boxes with score above the threshold
        if score <= score_threshold:
            continue

        # Resize the 28x28 mask to the box size and binarize it
        mask = mask[0, :, :, None]
        int_box = [int(i) for i in box]
        mask = cv2.resize(mask, (int_box[2] - int_box[0] + 1, int_box[3] - int_box[1] + 1))
        mask = mask > 0.5

        # Paste the box-sized mask into a full-image mask, clipping at the borders
        im_mask = np.zeros((image.shape[0], image.shape[1]), dtype=np.uint8)
        x_0 = max(int_box[0], 0)
        x_1 = min(int_box[2] + 1, image.shape[1])
        y_0 = max(int_box[1], 0)
        y_1 = min(int_box[3] + 1, image.shape[0])
        mask_y_0 = max(y_0 - int_box[1], 0)
        mask_y_1 = mask_y_0 + y_1 - y_0
        mask_x_0 = max(x_0 - int_box[0], 0)
        mask_x_1 = mask_x_0 + x_1 - x_0
        im_mask[y_0:y_1, x_0:x_1] = mask[mask_y_0:mask_y_1, mask_x_0:mask_x_1]
        im_mask = im_mask[:, :, None]

        # Find and draw the mask contour (OpenCV 4.x API)
        contours, hierarchy = cv2.findContours(im_mask, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
        image = cv2.drawContours(image, contours, -1, 25, 3)

        # Draw the bounding box and class annotation
        rect = patches.Rectangle((box[0], box[1]), box[2] - box[0], box[3] - box[1], linewidth=1, edgecolor='b', facecolor='none')
        ax.annotate(classes[label] + ':' + str(np.round(score, 2)), (box[0], box[1]), color='w', fontsize=12)
        ax.add_patch(rect)

    ax.imshow(image)
    plt.show()

display_objdetect_image(img, boxes, labels, scores, masks)
```


## Dataset (Train and validation)
The original pretrained Mask R-CNN model comes from [facebookresearch/maskrcnn-benchmark](https://github.com/facebookresearch/maskrcnn-benchmark). mAP is computed the same way as in [Detectron](https://github.com/facebookresearch/Detectron), on the `coco_2014_minival` dataset from COCO, which is exactly equivalent to the `coco_2017_val` dataset.
<hr>

## Validation accuracy
Mask R-CNN R-50-FPN:
Metric is COCO mAP (averaged over IoU of 0.5:0.95), computed over 2017 COCO val data: box mAP of 0.361 and mask mAP of 0.327.

Mask R-CNN R-50-FPN-fp32 & Mask R-CNN R-50-FPN-int8:
Metric is COCO box mAP@[IoU=0.50:0.95 | area=all | maxDets=100], computed over 2017 COCO val data.
<hr>

## Quantization
Mask R-CNN R-50-FPN-int8 and Mask R-CNN R-50-FPN-qdq are obtained by quantizing the Mask R-CNN R-50-FPN-fp32 model. We use [Intel® Neural Compressor](https://github.com/intel/neural-compressor) with the onnxruntime backend to perform quantization. View the [instructions](https://github.com/intel/neural-compressor/blob/master/examples/onnxrt/object_detection/onnx_model_zoo/mask_rcnn/quantization/ptq/README.md) to learn how to use Intel® Neural Compressor for quantization.

### Environment
onnx: 1.9.0
onnxruntime: 1.10.0

### Prepare model
```shell
wget https://github.com/onnx/models/raw/main/vision/object_detection_segmentation/mask-rcnn/model/MaskRCNN-12.onnx
```

### Model quantize
```bash
# --input_model is the model path as *.onnx
bash run_tuning.sh --input_model=path/to/model \
                   --config=mask_rcnn.yaml \
                   --data_path=path/to/COCO2017 \
                   --output_model=path/to/save
```
<hr>

## Publication/Attribution
Kaiming He, Georgia Gkioxari, Piotr Dollár, and Ross Girshick. Mask R-CNN. IEEE International Conference on Computer Vision (ICCV), 2017.

Massa, Francisco and Girshick, Ross. maskrcnn-benchmark: Fast, modular reference implementation of Instance Segmentation and Object Detection algorithms in PyTorch. [facebookresearch/maskrcnn-benchmark](https://github.com/facebookresearch/maskrcnn-benchmark).
<hr>

## References
* This model is converted from [facebookresearch/maskrcnn-benchmark](https://github.com/facebookresearch/maskrcnn-benchmark) with modifications in this [repository](https://github.com/BowenBao/maskrcnn-benchmark/tree/onnx_stage).

* [Intel® Neural Compressor](https://github.com/intel/neural-compressor)
<hr>

## Contributors
* [mengniwang95](https://github.com/mengniwang95) (Intel)
* [yuwenzho](https://github.com/yuwenzho) (Intel)
* [airMeng](https://github.com/airMeng) (Intel)
* [ftian1](https://github.com/ftian1) (Intel)
* [hshen14](https://github.com/hshen14) (Intel)
<hr>

## License
MIT License
<hr>