prithivMLmods committed on
Commit
3fb712f
·
verified ·
1 Parent(s): 560bccc

Update README.md

Files changed (1)
  1. README.md +360 -1
README.md CHANGED
@@ -2,4 +2,363 @@
2
  license: apache-2.0
3
  datasets:
4
  - Bruece/reclip_domainnet_126_clipart
5
- ---
+ ---
6
+ # **Clipart-126-DomainNet**
7
+
8
+ > **Clipart-126-DomainNet** is an image classification model fine-tuned from the vision-language encoder **google/siglip2-base-patch16-224** for single-label classification. It assigns each clipart image to one of 126 categories using the **SiglipForImageClassification** architecture.
9
+
10
+ ```text
11
+ Classification Report:
12
+ precision recall f1-score support
13
+
14
+ aircraft_carrier 0.8667 0.4643 0.6047 56
15
+ alarm_clock 0.9706 0.8919 0.9296 74
16
+ ant 0.8889 0.8615 0.8750 65
17
+ anvil 0.5984 0.6083 0.6033 120
18
+ asparagus 0.8158 0.6078 0.6966 51
19
+ axe 0.7544 0.5309 0.6232 81
20
+ banana 0.7111 0.5517 0.6214 58
21
+ basket 0.8571 0.8182 0.8372 66
22
+ bathtub 0.7531 0.7821 0.7673 78
23
+ bear 0.9118 0.6458 0.7561 48
24
+ bee 0.9636 0.9636 0.9636 165
25
+ bird 0.8967 0.9529 0.9240 255
26
+ blackberry 0.8082 0.8429 0.8252 70
27
+ blueberry 0.8661 0.8981 0.8818 108
28
+ bottlecap 0.7821 0.8299 0.8053 147
29
+ broccoli 0.8947 0.8947 0.8947 95
30
+ bus 0.9663 0.9348 0.9503 92
31
+ butterfly 0.9333 0.9545 0.9438 132
32
+ cactus 0.9677 0.9091 0.9375 99
33
+ cake 0.8750 0.8099 0.8412 121
34
+ calculator 0.9583 0.5897 0.7302 39
35
+ camel 0.9391 0.9310 0.9351 116
36
+ camera 0.8846 0.8679 0.8762 53
37
+ candle 0.8298 0.8478 0.8387 92
38
+ cannon 0.8551 0.8551 0.8551 69
39
+ canoe 0.8462 0.7432 0.7914 74
40
+ carrot 0.8800 0.7719 0.8224 57
41
+ castle 1.0000 0.8511 0.9195 47
42
+ cat 0.8167 0.7903 0.8033 62
43
+ ceiling_fan 1.0000 0.2000 0.3333 30
44
+ cell_phone 0.7400 0.6491 0.6916 57
45
+ cello 0.8372 0.9114 0.8727 79
46
+ chair 0.8986 0.8378 0.8671 74
47
+ chandelier 0.9617 0.9263 0.9437 190
48
+ coffee_cup 0.8811 0.9389 0.9091 229
49
+ compass 0.9799 0.9012 0.9389 162
50
+ computer 0.7124 0.9045 0.7970 178
51
+ cow 0.9517 0.9718 0.9617 142
52
+ crab 0.8738 0.9000 0.8867 100
53
+ crocodile 0.9778 0.9167 0.9462 144
54
+ cruise_ship 0.8544 0.9072 0.8800 194
55
+ dog 0.8125 0.7761 0.7939 67
56
+ dolphin 0.7680 0.7500 0.7589 128
57
+ dragon 0.9512 0.9176 0.9341 85
58
+ drums 0.8919 0.9635 0.9263 137
59
+ duck 0.8774 0.8447 0.8608 161
60
+ dumbbell 0.9048 0.9500 0.9268 280
61
+ elephant 0.9038 0.8952 0.8995 105
62
+ eyeglasses 0.8636 0.8488 0.8562 291
63
+ feather 0.8564 0.9227 0.8883 181
64
+ fence 0.9211 0.8400 0.8787 125
65
+ fish 0.8963 0.8768 0.8864 138
66
+ flamingo 0.9636 0.9381 0.9507 226
67
+ flower 0.9146 0.9454 0.9298 238
68
+ foot 0.8780 0.8889 0.8834 81
69
+ fork 0.9032 0.9091 0.9061 154
70
+ frog 0.9420 0.9489 0.9455 137
71
+ giraffe 0.9643 0.9153 0.9391 118
72
+ goatee 0.8763 0.9422 0.9081 173
73
+ grapes 0.9114 0.8571 0.8834 84
74
+ guitar 0.9595 0.8554 0.9045 83
75
+ hammer 0.6111 0.7719 0.6822 114
76
+ helicopter 0.9444 0.9533 0.9488 107
77
+ helmet 0.7368 0.8550 0.7915 131
78
+ horse 0.9588 0.9819 0.9702 166
79
+ kangaroo 0.9125 0.8488 0.8795 86
80
+ lantern 0.8254 0.7536 0.7879 69
81
+ laptop 0.8108 0.5000 0.6186 60
82
+ leaf 0.7143 0.3333 0.4545 30
83
+ lion 0.9744 0.8085 0.8837 47
84
+ lipstick 0.7875 0.6632 0.7200 95
85
+ lobster 0.8963 0.9130 0.9046 161
86
+ microphone 0.7925 0.9231 0.8528 91
87
+ monkey 0.9623 0.9027 0.9315 113
88
+ mosquito 0.8636 0.8444 0.8539 45
89
+ mouse 0.9167 0.8333 0.8730 66
90
+ mug 0.8989 0.8163 0.8556 98
91
+ mushroom 0.9429 0.9429 0.9429 105
92
+ onion 0.9365 0.8429 0.8872 140
93
+ panda 1.0000 0.9726 0.9861 73
94
+ peanut 0.5900 0.7195 0.6484 82
95
+ pear 0.7692 0.7246 0.7463 69
96
+ peas 0.8000 0.7429 0.7704 70
97
+ pencil 0.6667 0.0909 0.1600 44
98
+ penguin 0.9717 0.9279 0.9493 111
99
+ pig 0.9551 0.8252 0.8854 103
100
+ pillow 0.6290 0.5571 0.5909 70
101
+ pineapple 0.9846 0.8889 0.9343 72
102
+ potato 0.6038 0.6531 0.6275 98
103
+ power_outlet 0.8636 0.4043 0.5507 47
104
+ purse 0.0000 0.0000 0.0000 27
105
+ rabbit 0.9341 0.8586 0.8947 99
106
+ raccoon 0.8836 0.9021 0.8927 143
107
+ rhinoceros 0.8750 0.9459 0.9091 74
108
+ rifle 0.7595 0.7500 0.7547 80
109
+ saxophone 0.9454 0.9886 0.9665 175
110
+ screwdriver 0.7521 0.6929 0.7213 127
111
+ sea_turtle 0.9677 0.9626 0.9651 187
112
+ see_saw 0.6679 0.8698 0.7556 215
113
+ sheep 0.9355 0.9158 0.9255 95
114
+ shoe 0.8969 0.8700 0.8832 100
115
+ skateboard 0.8632 0.8673 0.8652 211
116
+ snake 0.9302 0.9160 0.9231 131
117
+ speedboat 0.8187 0.8976 0.8563 166
118
+ spider 0.9043 0.9286 0.9163 112
119
+ squirrel 0.7945 0.8855 0.8375 131
120
+ strawberry 0.8687 0.9923 0.9264 260
121
+ streetlight 0.8178 0.9293 0.8700 198
122
+ string_bean 0.8525 0.8000 0.8254 65
123
+ submarine 0.8022 0.8902 0.8439 164
124
+ swan 0.8397 0.9003 0.8690 291
125
+ table 0.8564 0.9200 0.8871 175
126
+ teapot 0.8763 0.9189 0.8971 185
127
+ teddy-bear 0.9006 0.8953 0.8980 172
128
+ television 0.8509 0.8220 0.8362 118
129
+ the_Eiffel_Tower 0.9468 0.9082 0.9271 98
130
+ the_Great_Wall_of_China 0.9462 0.9462 0.9462 93
131
+ tiger 0.9417 0.9826 0.9617 230
132
+ toe 0.8250 0.6600 0.7333 50
133
+ train 0.9362 0.9778 0.9565 90
134
+ truck 0.9367 0.8916 0.9136 83
135
+ umbrella 0.9633 0.9545 0.9589 110
136
+ vase 0.7642 0.8393 0.8000 112
137
+ watermelon 0.9527 0.9527 0.9527 148
138
+ whale 0.7453 0.8144 0.7783 194
139
+ zebra 0.9275 0.9676 0.9471 185
140
+
141
+ accuracy 0.8691 14818
142
+ macro avg 0.8613 0.8251 0.8351 14818
143
+ weighted avg 0.8705 0.8691 0.8661 14818
144
+ ```
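One way to read the report: each f1-score is the harmonic mean of that row's precision and recall, the macro average treats every class equally, and the weighted average weights each class by its support. The sketch below checks this on three rows copied from the report. The helper names (`f1_score`, `macro_avg`, `weighted_avg`) are illustrative and not part of the model card, and the averages it computes cover only these three rows, not the full 126-class figures.

```python
# Illustrative helpers; not part of the original README.

def f1_score(precision, recall):
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def macro_avg(values):
    """Unweighted mean: every class counts the same."""
    return sum(values) / len(values)


def weighted_avg(values, supports):
    """Mean weighted by the number of samples (support) in each class."""
    total = sum(supports)
    return sum(v * s for v, s in zip(values, supports)) / total


# (precision, recall, support) copied from three rows of the report.
rows = {
    "aircraft_carrier": (0.8667, 0.4643, 56),
    "pencil": (0.6667, 0.0909, 44),
    "purse": (0.0000, 0.0000, 27),
}

f1s = {name: f1_score(p, r) for name, (p, r, _) in rows.items()}
supports = [s for _, _, s in rows.values()]

for name, f1 in f1s.items():
    print(f"{name}: f1 = {f1:.4f}")

# Averages over this three-row subset only.
print(f"macro avg (subset):    {macro_avg(list(f1s.values())):.4f}")
print(f"weighted avg (subset): {weighted_avg(list(f1s.values()), supports):.4f}")
```

The recomputed values for `aircraft_carrier` and `pencil` agree with the reported 0.6047 and 0.1600 to four decimals. The `purse` row shows why the guard for zero precision and recall is needed.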
145
+
146
+
147
+
148
+ The model categorizes images into the following 126 classes:
149
+ - **Class 0:** "aircraft_carrier"
150
+ - **Class 1:** "alarm_clock"
151
+ - **Class 2:** "ant"
152
+ - **Class 3:** "anvil"
153
+ - **Class 4:** "asparagus"
154
+ - **Class 5:** "axe"
155
+ - **Class 6:** "banana"
156
+ - **Class 7:** "basket"
157
+ - **Class 8:** "bathtub"
158
+ - **Class 9:** "bear"
159
+ - **Class 10:** "bee"
160
+ - **Class 11:** "bird"
161
+ - **Class 12:** "blackberry"
162
+ - **Class 13:** "blueberry"
163
+ - **Class 14:** "bottlecap"
164
+ - **Class 15:** "broccoli"
165
+ - **Class 16:** "bus"
166
+ - **Class 17:** "butterfly"
167
+ - **Class 18:** "cactus"
168
+ - **Class 19:** "cake"
169
+ - **Class 20:** "calculator"
170
+ - **Class 21:** "camel"
171
+ - **Class 22:** "camera"
172
+ - **Class 23:** "candle"
173
+ - **Class 24:** "cannon"
174
+ - **Class 25:** "canoe"
175
+ - **Class 26:** "carrot"
176
+ - **Class 27:** "castle"
177
+ - **Class 28:** "cat"
178
+ - **Class 29:** "ceiling_fan"
179
+ - **Class 30:** "cell_phone"
180
+ - **Class 31:** "cello"
181
+ - **Class 32:** "chair"
182
+ - **Class 33:** "chandelier"
183
+ - **Class 34:** "coffee_cup"
184
+ - **Class 35:** "compass"
185
+ - **Class 36:** "computer"
186
+ - **Class 37:** "cow"
187
+ - **Class 38:** "crab"
188
+ - **Class 39:** "crocodile"
189
+ - **Class 40:** "cruise_ship"
190
+ - **Class 41:** "dog"
191
+ - **Class 42:** "dolphin"
192
+ - **Class 43:** "dragon"
193
+ - **Class 44:** "drums"
194
+ - **Class 45:** "duck"
195
+ - **Class 46:** "dumbbell"
196
+ - **Class 47:** "elephant"
197
+ - **Class 48:** "eyeglasses"
198
+ - **Class 49:** "feather"
199
+ - **Class 50:** "fence"
200
+ - **Class 51:** "fish"
201
+ - **Class 52:** "flamingo"
202
+ - **Class 53:** "flower"
203
+ - **Class 54:** "foot"
204
+ - **Class 55:** "fork"
205
+ - **Class 56:** "frog"
206
+ - **Class 57:** "giraffe"
207
+ - **Class 58:** "goatee"
208
+ - **Class 59:** "grapes"
209
+ - **Class 60:** "guitar"
210
+ - **Class 61:** "hammer"
211
+ - **Class 62:** "helicopter"
212
+ - **Class 63:** "helmet"
213
+ - **Class 64:** "horse"
214
+ - **Class 65:** "kangaroo"
215
+ - **Class 66:** "lantern"
216
+ - **Class 67:** "laptop"
217
+ - **Class 68:** "leaf"
218
+ - **Class 69:** "lion"
219
+ - **Class 70:** "lipstick"
220
+ - **Class 71:** "lobster"
221
+ - **Class 72:** "microphone"
222
+ - **Class 73:** "monkey"
223
+ - **Class 74:** "mosquito"
224
+ - **Class 75:** "mouse"
225
+ - **Class 76:** "mug"
226
+ - **Class 77:** "mushroom"
227
+ - **Class 78:** "onion"
228
+ - **Class 79:** "panda"
229
+ - **Class 80:** "peanut"
230
+ - **Class 81:** "pear"
231
+ - **Class 82:** "peas"
232
+ - **Class 83:** "pencil"
233
+ - **Class 84:** "penguin"
234
+ - **Class 85:** "pig"
235
+ - **Class 86:** "pillow"
236
+ - **Class 87:** "pineapple"
237
+ - **Class 88:** "potato"
238
+ - **Class 89:** "power_outlet"
239
+ - **Class 90:** "purse"
240
+ - **Class 91:** "rabbit"
241
+ - **Class 92:** "raccoon"
242
+ - **Class 93:** "rhinoceros"
243
+ - **Class 94:** "rifle"
244
+ - **Class 95:** "saxophone"
245
+ - **Class 96:** "screwdriver"
246
+ - **Class 97:** "sea_turtle"
247
+ - **Class 98:** "see_saw"
248
+ - **Class 99:** "sheep"
249
+ - **Class 100:** "shoe"
250
+ - **Class 101:** "skateboard"
251
+ - **Class 102:** "snake"
252
+ - **Class 103:** "speedboat"
253
+ - **Class 104:** "spider"
254
+ - **Class 105:** "squirrel"
255
+ - **Class 106:** "strawberry"
256
+ - **Class 107:** "streetlight"
257
+ - **Class 108:** "string_bean"
258
+ - **Class 109:** "submarine"
259
+ - **Class 110:** "swan"
260
+ - **Class 111:** "table"
261
+ - **Class 112:** "teapot"
262
+ - **Class 113:** "teddy-bear"
263
+ - **Class 114:** "television"
264
+ - **Class 115:** "the_Eiffel_Tower"
265
+ - **Class 116:** "the_Great_Wall_of_China"
266
+ - **Class 117:** "tiger"
267
+ - **Class 118:** "toe"
268
+ - **Class 119:** "train"
269
+ - **Class 120:** "truck"
270
+ - **Class 121:** "umbrella"
271
+ - **Class 122:** "vase"
272
+ - **Class 123:** "watermelon"
273
+ - **Class 124:** "whale"
274
+ - **Class 125:** "zebra"
275
+
276
+ # **Run with Transformers🤗**
277
+
278
+ ```python
279
+ !pip install -q transformers torch pillow gradio
280
+ ```
281
+ ```python
+ import gradio as gr
+ from transformers import AutoImageProcessor, SiglipForImageClassification
+ from PIL import Image
+ import torch
+
+ # Load model and processor
+ model_name = "prithivMLmods/Clipart-126-DomainNet"
+ model = SiglipForImageClassification.from_pretrained(model_name)
+ processor = AutoImageProcessor.from_pretrained(model_name)
+
+ def clipart_classification(image):
+     """Predicts the clipart category for an input image."""
+     # Convert the input numpy array to a PIL Image and ensure it's in RGB format
+     image = Image.fromarray(image).convert("RGB")
+
+     # Process the image and prepare it for the model
+     inputs = processor(images=image, return_tensors="pt")
+
+     # Perform inference without gradient computation
+     with torch.no_grad():
+         outputs = model(**inputs)
+         logits = outputs.logits
+         # Apply softmax to obtain probabilities for each class
+         probs = torch.nn.functional.softmax(logits, dim=1).squeeze().tolist()
+
+     # Mapping from indices to clipart category labels
+     labels = {
+         "0": "aircraft_carrier", "1": "alarm_clock", "2": "ant", "3": "anvil", "4": "asparagus",
+         "5": "axe", "6": "banana", "7": "basket", "8": "bathtub", "9": "bear",
+         "10": "bee", "11": "bird", "12": "blackberry", "13": "blueberry", "14": "bottlecap",
+         "15": "broccoli", "16": "bus", "17": "butterfly", "18": "cactus", "19": "cake",
+         "20": "calculator", "21": "camel", "22": "camera", "23": "candle", "24": "cannon",
+         "25": "canoe", "26": "carrot", "27": "castle", "28": "cat", "29": "ceiling_fan",
+         "30": "cell_phone", "31": "cello", "32": "chair", "33": "chandelier", "34": "coffee_cup",
+         "35": "compass", "36": "computer", "37": "cow", "38": "crab", "39": "crocodile",
+         "40": "cruise_ship", "41": "dog", "42": "dolphin", "43": "dragon", "44": "drums",
+         "45": "duck", "46": "dumbbell", "47": "elephant", "48": "eyeglasses", "49": "feather",
+         "50": "fence", "51": "fish", "52": "flamingo", "53": "flower", "54": "foot",
+         "55": "fork", "56": "frog", "57": "giraffe", "58": "goatee", "59": "grapes",
+         "60": "guitar", "61": "hammer", "62": "helicopter", "63": "helmet", "64": "horse",
+         "65": "kangaroo", "66": "lantern", "67": "laptop", "68": "leaf", "69": "lion",
+         "70": "lipstick", "71": "lobster", "72": "microphone", "73": "monkey", "74": "mosquito",
+         "75": "mouse", "76": "mug", "77": "mushroom", "78": "onion", "79": "panda",
+         "80": "peanut", "81": "pear", "82": "peas", "83": "pencil", "84": "penguin",
+         "85": "pig", "86": "pillow", "87": "pineapple", "88": "potato", "89": "power_outlet",
+         "90": "purse", "91": "rabbit", "92": "raccoon", "93": "rhinoceros", "94": "rifle",
+         "95": "saxophone", "96": "screwdriver", "97": "sea_turtle", "98": "see_saw", "99": "sheep",
+         "100": "shoe", "101": "skateboard", "102": "snake", "103": "speedboat", "104": "spider",
+         "105": "squirrel", "106": "strawberry", "107": "streetlight", "108": "string_bean",
+         "109": "submarine", "110": "swan", "111": "table", "112": "teapot", "113": "teddy-bear",
+         "114": "television", "115": "the_Eiffel_Tower", "116": "the_Great_Wall_of_China",
+         "117": "tiger", "118": "toe", "119": "train", "120": "truck", "121": "umbrella",
+         "122": "vase", "123": "watermelon", "124": "whale", "125": "zebra"
+     }
+
+     # Create a dictionary mapping each label to its corresponding probability (rounded)
+     predictions = {labels[str(i)]: round(probs[i], 3) for i in range(len(probs))}
+     return predictions
+
+ # Create Gradio interface
+ iface = gr.Interface(
+     fn=clipart_classification,
+     inputs=gr.Image(type="numpy"),
+     outputs=gr.Label(label="Prediction Scores"),
+     title="Clipart-126-DomainNet Classification",
+     description="Upload a clipart image to classify it into one of 126 domain categories."
+ )
+
+ # Launch the app
+ if __name__ == "__main__":
+     iface.launch()
+ ```
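The function above returns a score for all 126 labels. When only the most likely categories matter, the dictionary can be trimmed afterwards. A minimal sketch, assuming a predictions dictionary shaped like the return value of `clipart_classification`; the `top_k` helper and the sample scores are hypothetical and only for illustration:

```python
def top_k(predictions, k=5):
    """Return the k highest-scoring (label, score) pairs, best first."""
    ranked = sorted(predictions.items(), key=lambda item: item[1], reverse=True)
    return ranked[:k]


# Made-up scores standing in for real model output.
sample = {"cat": 0.62, "dog": 0.21, "lion": 0.09, "tiger": 0.05, "bear": 0.03}

print(top_k(sample, k=3))
# [('cat', 0.62), ('dog', 0.21), ('lion', 0.09)]
```

In the Gradio app itself, `gr.Label` also accepts a `num_top_classes` argument that limits how many labels are displayed, which may be enough if the trimming is only needed for the UI.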
355
+ ---
356
+
357
+ # **Intended Use:**
358
+
359
+ The **Clipart-126-DomainNet** model is designed for clipart image classification. It assigns each clipart image to one of 126 categories, ranging from objects such as "aircraft_carrier" and "alarm_clock" to animals, foods, and landmarks. Potential use cases include:
360
+
361
+ - **Digital Art and Design:** Assisting designers in organizing and retrieving clipart assets.
362
+ - **Content Management:** Enhancing digital asset management systems with automatic clipart classification.
363
+ - **Creative Search Engines:** Enabling clipart-based search for design inspiration and resource curation.
364
+ - **Computer Vision Research:** Serving as a baseline for studies in clipart recognition and domain adaptation.
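For the asset-organization use cases, one possible pattern is to file each image under its top predicted label and set aside low-confidence results for manual review. The sketch below assumes per-file score dictionaries shaped like the output of `clipart_classification`; the `group_by_label` helper, the `needs_review` bucket, the 0.5 threshold, and the file names are all hypothetical choices, not something the model card prescribes.

```python
from collections import defaultdict


def group_by_label(scores_by_file, min_score=0.5):
    """Group file names by top label; uncertain files go to 'needs_review'."""
    groups = defaultdict(list)
    for filename, scores in scores_by_file.items():
        # Pick the label with the highest score for this file.
        label, score = max(scores.items(), key=lambda item: item[1])
        bucket = label if score >= min_score else "needs_review"
        groups[bucket].append(filename)
    return dict(groups)


# Made-up scores standing in for real model output.
scores_by_file = {
    "icon_01.png": {"cat": 0.91, "dog": 0.05},
    "icon_02.png": {"purse": 0.34, "basket": 0.33},
    "icon_03.png": {"cat": 0.77, "lion": 0.12},
}

print(group_by_label(scores_by_file))
```

A review bucket of this kind is worth considering because the classification report shows some classes with low recall, so the top label is not equally reliable across categories.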