Commit a35386c (verified) by yichengup · 1 parent: 258c224

Upload deeplabv3p-resnet50-human read.md

Files changed (1)
  1. deeplabv3p-resnet50-human read.md +101 -0
deeplabv3p-resnet50-human read.md ADDED
@@ -0,0 +1,101 @@
+ ---
+ license: cc0-1.0
+ tags:
+ - art
+ - computer vision
+ - Image segmentation
+ ---
+
+ # DeepLabV3+ ResNet50 for human body parts segmentation
+
+ This is a very simple ONNX model that can segment human body parts.
+
+ ## Why this model
+
+ This model is an ONNX conversion of [keras-io/deeplabv3p-resnet50](https://huggingface.co/keras-io/deeplabv3p-resnet50),
+ a model that can segment human body parts. All the other models that I found were trained on
+ city segmentation.
+
+ The original model was built for an old version of Keras and cannot be used with recent versions of TensorFlow,
+ so I converted it to the ONNX format.
+
+ ## Usage
+
+ Get the `deeplabv3p-resnet50-human.onnx` file and load it with the ONNX Runtime (`onnxruntime`) package.
+
+ `model.run` returns a list with one array per output. Converted with `np.array`, the result is a `(1, 1, 512, 512, 20)` tensor:
+
+ - 1: number of outputs (you can squeeze it)
+ - 1: batch size (you can squeeze it)
+ - 512, 512: the size of the image (fixed)
+ - 20: number of classes, so you can take the `argmax` of the tensor along this axis to get the class of each pixel
+
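
The shape handling described in the list above can be tried without the model file. This is only a sketch: a random NumPy array stands in for the real output, and the variable names (`fake_result`, `scores`, `label_map`) are illustrative, not part of the model or of ONNX Runtime.

```python
import numpy as np

# Stand-in for the list returned by model.run: one output of shape (1, 512, 512, 20).
rng = np.random.default_rng(0)
fake_result = [rng.random((1, 512, 512, 20), dtype=np.float32)]

stacked = np.array(fake_result)        # (1, 1, 512, 512, 20)
scores = stacked.squeeze(axis=(0, 1))  # drop "number of outputs" and "batch size"
label_map = scores.argmax(axis=-1)     # class index (0..19) for each pixel

print(stacked.shape, scores.shape, label_map.shape)
```
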
+ ```python
+ import sys
+
+ import numpy as np
+ import onnxruntime
+ from PIL import Image
+
+ model = onnxruntime.InferenceSession("deeplabv3p-resnet50-human.onnx")
+
+ # load the image (path from the command line, or "image.jpg" by default)
+ img = Image.open(sys.argv[1] if len(sys.argv) > 1 else "image.jpg")
+ img = img.convert("RGB").resize((512, 512))
+ # scale pixel values from [0, 255] to [-1, 1]
+ img = np.array(img).astype(np.float32) / 127.5 - 1
+ # add the batch dimension (assumes the model expects a (1, 512, 512, 3) input)
+ img = np.expand_dims(img, axis=0)
+
+ # infer
+ input_name = model.get_inputs()[0].name
+ output_name = model.get_outputs()[0].name
+ result = model.run([output_name], {input_name: img})
+
+ # model.run returns a list with one array per requested output
+ result = np.array(result[0])
+ # argmax the classes, remove the batch size
+ result = result.argmax(axis=3).squeeze(0)
+
+ # get the masks
+ for i in range(20):
+     detected = result == i  # get the detected pixels for the class i
+     # detected is a 512, 512 boolean array
+     mask = np.zeros(detected.shape, dtype=np.uint8)
+     mask[detected] = 255
+     Image.fromarray(mask).show()  # or save, or return the mask...
+ ```
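
To view all classes at once instead of one mask per class, the per-pixel class indices can be mapped to colors. This is a minimal sketch, assuming `result` is the `(512, 512)` integer array produced by the code above. The `colorize` helper and its random palette are hypothetical and are not shipped with the model.

```python
import numpy as np


def colorize(label_map: np.ndarray, num_classes: int = 20, seed: int = 0) -> np.ndarray:
    """Map an (H, W) array of class indices to an (H, W, 3) uint8 RGB image."""
    rng = np.random.default_rng(seed)
    # One arbitrary RGB color per class (illustrative palette, not an official one).
    palette = rng.integers(0, 256, size=(num_classes, 3), dtype=np.uint8)
    palette[0] = 0  # keep the background (class 0) black
    return palette[label_map]


# Tiny stand-in for the real 512x512 result.
demo = np.array([[0, 2], [13, 5]])
colored = colorize(demo)
print(colored.shape, colored.dtype)
```

With Pillow installed, `Image.fromarray(colorize(result)).show()` should display the colored map for a real prediction.
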
+
+ ## Classes index
+
+ This is the list of classes that the model can detect (some classes are not specifically identified, see below):
+
+ - 0: "background"
+ - 1: "unknown"
+ - 2: "hair"
+ - 3: "unknown"
+ - 4: "glasses"
+ - 5: "top-clothes"
+ - 6: "unknown"
+ - 7: "unknown"
+ - 8: "unknown"
+ - 9: "bottom-clothes"
+ - 10: "torso-skin"
+ - 11: "unknown"
+ - 12: "unknown"
+ - 13: "face"
+ - 14: "left-arm"
+ - 15: "right-arm"
+ - 16: "left-leg"
+ - 17: "right-leg"
+ - 18: "left-foot"
+ - 19: "right-foot"
+
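
The index list above can be kept as a lookup table so that masks are selected by name rather than by number. This is a sketch only: `CLASS_NAMES` copies the list as given (including the unidentified "unknown" entries), and the `mask_for` helper is a hypothetical name introduced for illustration.

```python
import numpy as np

# Index-to-name mapping copied from the list above; "unknown" entries are unidentified.
CLASS_NAMES = {
    0: "background", 1: "unknown", 2: "hair", 3: "unknown", 4: "glasses",
    5: "top-clothes", 6: "unknown", 7: "unknown", 8: "unknown",
    9: "bottom-clothes", 10: "torso-skin", 11: "unknown", 12: "unknown",
    13: "face", 14: "left-arm", 15: "right-arm", 16: "left-leg",
    17: "right-leg", 18: "left-foot", 19: "right-foot",
}


def mask_for(label_map: np.ndarray, *names: str) -> np.ndarray:
    """Return a boolean mask that is True where a pixel belongs to any of `names`."""
    missing = set(names) - set(CLASS_NAMES.values())
    if missing:
        raise ValueError(f"unknown class name(s): {sorted(missing)}")
    wanted = [index for index, name in CLASS_NAMES.items() if name in names]
    return np.isin(label_map, wanted)


# Tiny stand-in for the real 512x512 array of class indices.
demo = np.array([[0, 14], [15, 13]])
arms = mask_for(demo, "left-arm", "right-arm")
print(arms)
```
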
+ ## Known limitations
+
+ - The model can fail on portrait images, because it was trained on "full body" images.
+ - I don't know what some of the classes are, and I can't find the original list of classes (help!).
+ - The model is not perfect and can fail on some images. I'm not the author of the model, so I can't fix it.
+
+ ## License
+
+ The [original model card](https://huggingface.co/keras-io/deeplabv3p-resnet50/blob/main/README.md) declares the "CC0-1.0"
+ license. I don't know if it's the right license for the model, but I keep it.
+
+ This means that you may use, share, modify, and distribute the model without any restriction.
+
+ > Anyway, thanks to the authors of the model for sharing it and for leaving it open to use.