blesot committed
Commit 2ad76c4 · 1 Parent(s): c5d90e4

Create README.md

Files changed (1)
  1. README.md +57 -0
README.md ADDED
@@ -0,0 +1,57 @@
---
language:
- om
- am
- rw
- rn
- ha
- ig
- pcm
- so
- sw
- ti
- yo
- multilingual
---

# Mask R-CNN

## Model description

Mask R-CNN extends Faster R-CNN by adding a branch that predicts an object mask in parallel with the existing branch for bounding-box recognition. The model therefore labels the pixels that belong to each object instead of producing only bounding boxes. Faster R-CNN alone cannot do this because it was not designed for pixel-to-pixel alignment between network inputs and outputs.

### More information on the model and dataset:

### The model
Mask R-CNN performs instance segmentation, which combines object detection with semantic segmentation. For object detection, Mask R-CNN uses an architecture similar to Faster R-CNN; for segmentation, it uses a Fully Convolutional Network (FCN).
The FCN is applied on top of the Faster R-CNN features to generate a segmentation mask. This mask output runs in parallel with the classification and bounding-box regression heads of the Faster R-CNN model. Mask R-CNN also replaces the Region of Interest (RoI) pooling introduced with Fast R-CNN with a refinement called RoIAlign, which addresses the information loss and misalignment caused by the coordinate quantization in RoI pooling; RoIAlign leads to improved results.
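
To make the alignment refinement more concrete, here is a minimal pure-Python sketch of the idea: instead of rounding box coordinates to integer feature-map cells, each output bin averages a few bilinearly interpolated samples. The function names (`bilinear_sample`, `roi_align`), the `(y_min, x_min, y_max, x_max)` box format in feature-map coordinates, and the 2x2 samples per bin are assumptions made for illustration, not the reference implementation.

```python
import math


def bilinear_sample(feature, y, x):
    """Bilinearly interpolate a 2-D feature map (list of rows) at a real-valued (y, x)."""
    h, w = len(feature), len(feature[0])
    # Clamp the sampling point to the valid range of the map.
    y = min(max(y, 0.0), h - 1.0)
    x = min(max(x, 0.0), w - 1.0)
    y0, x0 = int(math.floor(y)), int(math.floor(x))
    y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)
    dy, dx = y - y0, x - x0
    top = feature[y0][x0] * (1 - dx) + feature[y0][x1] * dx
    bottom = feature[y1][x0] * (1 - dx) + feature[y1][x1] * dx
    return top * (1 - dy) + bottom * dy


def roi_align(feature, box, out_size=2, samples=2):
    """Average samples x samples bilinear samples per output bin, with no coordinate rounding."""
    y_min, x_min, y_max, x_max = box
    bin_h = (y_max - y_min) / out_size
    bin_w = (x_max - x_min) / out_size
    out = []
    for i in range(out_size):
        row = []
        for j in range(out_size):
            total = 0.0
            for sy in range(samples):
                for sx in range(samples):
                    # Regularly spaced sampling points inside bin (i, j).
                    y = y_min + (i + (sy + 0.5) / samples) * bin_h
                    x = x_min + (j + (sx + 0.5) / samples) * bin_w
                    total += bilinear_sample(feature, y, x)
            row.append(total / (samples * samples))
        out.append(row)
    return out


# Hypothetical 4x4 feature map whose value equals the column index.
feature = [[float(x) for x in range(4)] for _ in range(4)]
# A box with fractional coordinates, which RoI pooling would have to round.
aligned = roi_align(feature, (0.5, 0.5, 2.5, 2.5))
```

Because the toy feature map is a linear ramp, the interpolated bins reflect the fractional box position exactly, which is the property that rounding the coordinates would lose.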

### Technical Specifications

Please [read the paper](https://arxiv.org/pdf/1703.06870.pdf) for more information on training.

The model architecture is divided into two parts:
- A Region Proposal Network (RPN) that proposes candidate object bounding boxes.
- A binary mask classifier that generates a mask for every class.
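
As a rough illustration of the per-class masks, the sketch below picks the mask channel that matches the predicted class of a region and binarizes it. The helper name `select_instance_mask`, the nested-list layout, and the toy probabilities are hypothetical; the 0.5 threshold is the value described in the paper.

```python
def select_instance_mask(class_masks, predicted_class, threshold=0.5):
    """Pick the mask channel of the predicted class and binarize it."""
    soft_mask = class_masks[predicted_class]
    return [[1 if p >= threshold else 0 for p in row] for row in soft_mask]


# Hypothetical 2x2 mask probabilities for one region and three classes.
class_masks = [
    [[0.9, 0.8], [0.1, 0.2]],  # class 0
    [[0.2, 0.7], [0.6, 0.4]],  # class 1
    [[0.1, 0.1], [0.1, 0.1]],  # class 2
]

# The classification branch (not shown) is assumed to have predicted class 1.
binary_mask = select_instance_mask(class_masks, predicted_class=1)
```

Only the channel of the predicted class is used, so the masks of different classes do not compete with each other.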

#### Technical Summary
- The structure of Mask R-CNN is very similar to that of Faster R-CNN.
- It outputs a binary mask for each Region of Interest (RoI).
- It applies bounding-box classification and regression in parallel, simplifying the multi-stage pipeline of the original R-CNN.
- The backbone networks used are ResNet and ResNeXt, with a depth of either 50 or 101 layers.

#### Results Summary
- Instance Segmentation: On the COCO dataset, Mask R-CNN outperforms MNC and FCIS, the previous state-of-the-art models, in all categories.
- Bounding Box Detection: Mask R-CNN outperforms the base variants of all previous state-of-the-art models, including the COCO 2016 Detection Challenge winner.

## Intended uses & limitations

- Mask R-CNN is general enough to be extended to other tasks, such as human pose estimation.

## Training Procedure
Please [read the paper](https://arxiv.org/pdf/1703.06870.pdf) for more information on training, or check the OpenMMLab [repository](https://github.com/open-mmlab/mmdetection/tree/master/configs/mask_rcnn).