Sketch-126-DomainNet
Commit 197059a (verified) by prithivMLmods · 1 parent: 46beeff

Update README.md

  1. README.md +224 -1
README.md CHANGED
@@ -2,10 +2,21 @@
  license: apache-2.0
  datasets:
  - Bruece/domainnet-126-by-class-sketch
  ---
 
- ![Sketch-126-DomainNet - visual selection.png](https://cdn-uploads.huggingface.co/production/uploads/65bb837dbfb878f46c77de4c/Rc6Q9-9_nSTV2mRicSqj1.png)
 
 
  ```py
  Classification Report:
@@ -143,3 +154,215 @@ the_Great_Wall_of_China 0.6389 0.8440 0.7273 109
  weighted avg 0.8404 0.8440 0.8352 19317
  ```
 
  license: apache-2.0
  datasets:
  - Bruece/domainnet-126-by-class-sketch
+ language:
+ - en
+ base_model:
+ - google/siglip2-base-patch16-224
+ pipeline_tag: image-classification
+ library_name: transformers
+ tags:
+ - Sketch-126-DomainNet
  ---
 
+ # **Sketch-126-DomainNet**
+
+ > **Sketch-126-DomainNet** is an image classification model fine-tuned from the vision-language encoder **google/siglip2-base-patch16-224** for single-label classification. It uses the **SiglipForImageClassification** architecture to assign each sketch to one of 126 categories from the sketch domain of DomainNet-126.
 
+ ![Sketch-126-DomainNet - visual selection.png](https://cdn-uploads.huggingface.co/production/uploads/65bb837dbfb878f46c77de4c/Rc6Q9-9_nSTV2mRicSqj1.png)
 
  ```py
  Classification Report:
 
  weighted avg 0.8404 0.8440 0.8352 19317
  ```
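
For readers checking the report, each per-class F1 score is the harmonic mean of that class's precision and recall. The sketch below recomputes it for the `the_Great_Wall_of_China` row quoted in the hunk header (precision 0.6389, recall 0.8440, support 109). The helper name `f1_score` is illustrative and not part of the README.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values taken from the the_Great_Wall_of_China row of the report.
great_wall_f1 = f1_score(0.6389, 0.8440)
print(round(great_wall_f1, 4))  # 0.7273, matching the F1 column of that row
```
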
 
+ The model categorizes images into the following 126 classes:
+ - **Class 0:** "aircraft_carrier"
+ - **Class 1:** "alarm_clock"
+ - **Class 2:** "ant"
+ - **Class 3:** "anvil"
+ - **Class 4:** "asparagus"
+ - **Class 5:** "axe"
+ - **Class 6:** "banana"
+ - **Class 7:** "basket"
+ - **Class 8:** "bathtub"
+ - **Class 9:** "bear"
+ - **Class 10:** "bee"
+ - **Class 11:** "bird"
+ - **Class 12:** "blackberry"
+ - **Class 13:** "blueberry"
+ - **Class 14:** "bottlecap"
+ - **Class 15:** "broccoli"
+ - **Class 16:** "bus"
+ - **Class 17:** "butterfly"
+ - **Class 18:** "cactus"
+ - **Class 19:** "cake"
+ - **Class 20:** "calculator"
+ - **Class 21:** "camel"
+ - **Class 22:** "camera"
+ - **Class 23:** "candle"
+ - **Class 24:** "cannon"
+ - **Class 25:** "canoe"
+ - **Class 26:** "carrot"
+ - **Class 27:** "castle"
+ - **Class 28:** "cat"
+ - **Class 29:** "ceiling_fan"
+ - **Class 30:** "cell_phone"
+ - **Class 31:** "cello"
+ - **Class 32:** "chair"
+ - **Class 33:** "chandelier"
+ - **Class 34:** "coffee_cup"
+ - **Class 35:** "compass"
+ - **Class 36:** "computer"
+ - **Class 37:** "cow"
+ - **Class 38:** "crab"
+ - **Class 39:** "crocodile"
+ - **Class 40:** "cruise_ship"
+ - **Class 41:** "dog"
+ - **Class 42:** "dolphin"
+ - **Class 43:** "dragon"
+ - **Class 44:** "drums"
+ - **Class 45:** "duck"
+ - **Class 46:** "dumbbell"
+ - **Class 47:** "elephant"
+ - **Class 48:** "eyeglasses"
+ - **Class 49:** "feather"
+ - **Class 50:** "fence"
+ - **Class 51:** "fish"
+ - **Class 52:** "flamingo"
+ - **Class 53:** "flower"
+ - **Class 54:** "foot"
+ - **Class 55:** "fork"
+ - **Class 56:** "frog"
+ - **Class 57:** "giraffe"
+ - **Class 58:** "goatee"
+ - **Class 59:** "grapes"
+ - **Class 60:** "guitar"
+ - **Class 61:** "hammer"
+ - **Class 62:** "helicopter"
+ - **Class 63:** "helmet"
+ - **Class 64:** "horse"
+ - **Class 65:** "kangaroo"
+ - **Class 66:** "lantern"
+ - **Class 67:** "laptop"
+ - **Class 68:** "leaf"
+ - **Class 69:** "lion"
+ - **Class 70:** "lipstick"
+ - **Class 71:** "lobster"
+ - **Class 72:** "microphone"
+ - **Class 73:** "monkey"
+ - **Class 74:** "mosquito"
+ - **Class 75:** "mouse"
+ - **Class 76:** "mug"
+ - **Class 77:** "mushroom"
+ - **Class 78:** "onion"
+ - **Class 79:** "panda"
+ - **Class 80:** "peanut"
+ - **Class 81:** "pear"
+ - **Class 82:** "peas"
+ - **Class 83:** "pencil"
+ - **Class 84:** "penguin"
+ - **Class 85:** "pig"
+ - **Class 86:** "pillow"
+ - **Class 87:** "pineapple"
+ - **Class 88:** "potato"
+ - **Class 89:** "power_outlet"
+ - **Class 90:** "purse"
+ - **Class 91:** "rabbit"
+ - **Class 92:** "raccoon"
+ - **Class 93:** "rhinoceros"
+ - **Class 94:** "rifle"
+ - **Class 95:** "saxophone"
+ - **Class 96:** "screwdriver"
+ - **Class 97:** "sea_turtle"
+ - **Class 98:** "see_saw"
+ - **Class 99:** "sheep"
+ - **Class 100:** "shoe"
+ - **Class 101:** "skateboard"
+ - **Class 102:** "snake"
+ - **Class 103:** "speedboat"
+ - **Class 104:** "spider"
+ - **Class 105:** "squirrel"
+ - **Class 106:** "strawberry"
+ - **Class 107:** "streetlight"
+ - **Class 108:** "string_bean"
+ - **Class 109:** "submarine"
+ - **Class 110:** "swan"
+ - **Class 111:** "table"
+ - **Class 112:** "teapot"
+ - **Class 113:** "teddy-bear"
+ - **Class 114:** "television"
+ - **Class 115:** "the_Eiffel_Tower"
+ - **Class 116:** "the_Great_Wall_of_China"
+ - **Class 117:** "tiger"
+ - **Class 118:** "toe"
+ - **Class 119:** "train"
+ - **Class 120:** "truck"
+ - **Class 121:** "umbrella"
+ - **Class 122:** "vase"
+ - **Class 123:** "watermelon"
+ - **Class 124:** "whale"
+ - **Class 125:** "zebra"
+
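
Index-to-name tables like the one above are commonly kept as a pair of `id2label` / `label2id` dictionaries. A minimal sketch, assuming the class names are held in a list in index order; only the first five classes are shown, and `build_label_maps` is a hypothetical helper, not something the README defines.

```python
def build_label_maps(class_names):
    """Return (id2label, label2id) for an ordered list of class names."""
    id2label = {index: name for index, name in enumerate(class_names)}
    label2id = {name: index for index, name in id2label.items()}
    return id2label, label2id


# First five classes from the list above; the full list has 126 entries.
first_classes = ["aircraft_carrier", "alarm_clock", "ant", "anvil", "asparagus"]
id2label, label2id = build_label_maps(first_classes)
print(id2label[3], label2id["ant"])  # anvil 2
```
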
+ # **Run with Transformers🤗**
+
+ ```python
+ !pip install -q transformers torch pillow gradio
+ ```
+
+ ```python
292
+ import gradio as gr
293
+ from transformers import AutoImageProcessor
294
+ from transformers import SiglipForImageClassification
295
+ from transformers.image_utils import load_image
296
+ from PIL import Image
297
+ import torch
298
+
299
+ # Load model and processor
300
+ model_name = "prithivMLmods/Sketch-126-DomainNet"
301
+ model = SiglipForImageClassification.from_pretrained(model_name)
302
+ processor = AutoImageProcessor.from_pretrained(model_name)
303
+
304
+ def sketch_classification(image):
305
+ \"\"\"Predicts the sketch category for an input image.\"\"\n image = Image.fromarray(image).convert(\"RGB\")
306
+ inputs = processor(images=image, return_tensors=\"pt\")
307
+
308
+ with torch.no_grad():
309
+ outputs = model(**inputs)
310
+ logits = outputs.logits
311
+ probs = torch.nn.functional.softmax(logits, dim=1).squeeze().tolist()
312
+
313
+ labels = {
314
+ "0": "aircraft_carrier", "1": "alarm_clock", "2": "ant", "3": "anvil", "4": "asparagus",
315
+ "5": "axe", "6": "banana", "7": "basket", "8": "bathtub", "9": "bear",
316
+ "10": "bee", "11": "bird", "12": "blackberry", "13": "blueberry", "14": "bottlecap",
317
+ "15": "broccoli", "16": "bus", "17": "butterfly", "18": "cactus", "19": "cake",
318
+ "20": "calculator", "21": "camel", "22": "camera", "23": "candle", "24": "cannon",
319
+ "25": "canoe", "26": "carrot", "27": "castle", "28": "cat", "29": "ceiling_fan",
320
+ "30": "cell_phone", "31": "cello", "32": "chair", "33": "chandelier", "34": "coffee_cup",
321
+ "35": "compass", "36": "computer", "37": "cow", "38": "crab", "39": "crocodile",
322
+ "40": "cruise_ship", "41": "dog", "42": "dolphin", "43": "dragon", "44": "drums",
323
+ "45": "duck", "46": "dumbbell", "47": "elephant", "48": "eyeglasses", "49": "feather",
324
+ "50": "fence", "51": "fish", "52": "flamingo", "53": "flower", "54": "foot",
325
+ "55": "fork", "56": "frog", "57": "giraffe", "58": "goatee", "59": "grapes",
326
+ "60": "guitar", "61": "hammer", "62": "helicopter", "63": "helmet", "64": "horse",
327
+ "65": "kangaroo", "66": "lantern", "67": "laptop", "68": "leaf", "69": "lion",
328
+ "70": "lipstick", "71": "lobster", "72": "microphone", "73": "monkey", "74": "mosquito",
329
+ "75": "mouse", "76": "mug", "77": "mushroom", "78": "onion", "79": "panda",
330
+ "80": "peanut", "81": "pear", "82": "peas", "83": "pencil", "84": "penguin",
331
+ "85": "pig", "86": "pillow", "87": "pineapple", "88": "potato", "89": "power_outlet",
332
+ "90": "purse", "91": "rabbit", "92": "raccoon", "93": "rhinoceros", "94": "rifle",
333
+ "95": "saxophone", "96": "screwdriver", "97": "sea_turtle", "98": "see_saw", "99": "sheep",
334
+ "100": "shoe", "101": "skateboard", "102": "snake", "103": "speedboat", "104": "spider",
335
+ "105": "squirrel", "106": "strawberry", "107": "streetlight", "108": "string_bean",
336
+ "109": "submarine", "110": "swan", "111": "table", "112": "teapot", "113": "teddy-bear",
337
+ "114": "television", "115": "the_Eiffel_Tower", "116": "the_Great_Wall_of_China",
338
+ "117": "tiger", "118": "toe", "119": "train", "120": "truck", "121": "umbrella",
339
+ "122": "vase", "123": "watermelon", "124": "whale", "125": "zebra"
340
+ }
341
+
342
+ predictions = {labels[str(i)]: round(probs[i], 3) for i in range(len(probs))}
343
+ return predictions
344
+
345
+ # Create Gradio interface
346
+ iface = gr.Interface(
347
+ fn=sketch_classification,
348
+ inputs=gr.Image(type=\"numpy\"),
349
+ outputs=gr.Label(label=\"Prediction Scores\"),
350
+ title=\"Sketch-126-DomainNet Classification\",
351
+ description=\"Upload a sketch to classify it into one of 126 categories.\"
352
+ )
353
+
354
+ # Launch the app
355
+ if __name__ == \"__main__\":
356
+ iface.launch()
357
+ ```
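
The function above returns a score for every one of the 126 classes. When only the strongest few predictions matter, the logits can be reduced to a top-k list. The sketch below does this in plain Python so that it runs without downloading the model; `top_k_predictions` and the toy logits are illustrative assumptions, not part of the README.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    return [value / total for value in exps]


def top_k_predictions(logits, labels, k=3):
    """Return the k most probable (label, probability) pairs, best first."""
    probs = softmax(logits)
    ranked = sorted(zip(labels, probs), key=lambda pair: pair[1], reverse=True)
    return [(label, round(prob, 3)) for label, prob in ranked[:k]]


# Toy logits for four of the 126 classes (made-up values, not model output).
toy_labels = ["ant", "bee", "bird", "cat"]
toy_logits = [0.5, 2.0, 1.0, -1.0]
print(top_k_predictions(toy_logits, toy_labels, k=2))
```

With the real model, `logits` would be `outputs.logits.squeeze().tolist()` and `labels` the 126 class names in index order.
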
+
+ ---
+
+ # **Intended Use:**
+
+ The **Sketch-126-DomainNet** model is designed for sketch image classification. It assigns sketches to one of 126 classes, ranging from objects such as "aircraft_carrier" or "alarm_clock" to animals, plants, and everyday items. Potential use cases include:
+
+ - **Art and Design Applications:** Assisting artists and designers in organizing and retrieving sketches based on content.
+ - **Creative Search Engines:** Enabling sketch-based search for design inspiration.
+ - **Educational Tools:** Helping students and educators in art and design fields categorize and retrieve visual resources.
+ - **Computer Vision Research:** Providing a baseline model for sketch recognition and domain adaptation experiments.