---
license: apache-2.0
datasets:
- Bruece/domainnet-126-by-class-sketch
language:
- en
base_model:
- google/siglip2-base-patch16-224
pipeline_tag: image-classification
library_name: transformers
tags:
- Sketch-126-DomainNet
---

![fdhsdftghd.png](https://cdn-uploads.huggingface.co/production/uploads/65bb837dbfb878f46c77de4c/iS8BrTcPZ38592IP_NW3z.png)

# **Sketch-126-DomainNet**

> **Sketch-126-DomainNet** is an image classification model fine-tuned from the vision-language encoder **google/siglip2-base-patch16-224** for single-label classification. It classifies sketches into 126 DomainNet categories using the **SiglipForImageClassification** architecture.

![Sketch-126-DomainNet - visual selection.png](https://cdn-uploads.huggingface.co/production/uploads/65bb837dbfb878f46c77de4c/Rc6Q9-9_nSTV2mRicSqj1.png)

*Moment Matching for Multi-Source Domain Adaptation*: https://arxiv.org/pdf/1812.01754

*SigLIP 2: Multilingual Vision-Language Encoders with Improved Semantic Understanding, Localization, and Dense Features*: https://arxiv.org/pdf/2502.14786

```text
Classification Report:
                         precision    recall  f1-score   support

       aircraft_carrier     1.0000    0.2200    0.3607        50
            alarm_clock     0.9873    0.9568    0.9718       162
                    ant     0.9432    0.9326    0.9379        89
                  anvil     0.2727    0.0423    0.0732        71
              asparagus     0.9673    0.8916    0.9279       166
                    axe     0.8034    0.8773    0.8387       163
                 banana     0.9744    0.9383    0.9560       162
                 basket     0.7160    0.7682    0.7412       151
                bathtub     0.8073    0.9281    0.8635       167
                   bear     0.8636    0.6690    0.7540       142
                    bee     0.9196    0.8957    0.9075       115
                   bird     0.9094    0.9429    0.9259       245
             blackberry     1.0000    0.1250    0.2222        48
              blueberry     0.6744    0.8529    0.7532       102
              bottlecap     0.7468    0.5315    0.6211       111
               broccoli     0.7727    0.9444    0.8500       144
                    bus     0.9302    0.8989    0.9143       178
              butterfly     0.9594    0.9497    0.9545       199
                 cactus     1.0000    0.6735    0.8049        49
                   cake     0.0000    0.0000    0.0000        54
             calculator     0.9298    0.9636    0.9464        55
                  camel     0.9208    0.8942    0.9073       104
                 camera     0.9200    0.7931    0.8519        87
                 candle     0.9556    0.6935    0.8037        62
                 cannon     0.7500    0.2027    0.3191        74
                  canoe     0.8000    0.5825    0.6742       103
                 carrot     0.0000    0.0000    0.0000        27
                 castle     0.9583    0.5111    0.6667        45
                    cat     0.8961    0.6635    0.7624       104
            ceiling_fan     0.0000    0.0000    0.0000        20
             cell_phone     0.0000    0.0000    0.0000        18
                  cello     0.9600    0.4706    0.6316        51
                  chair     0.8043    0.4805    0.6016        77
             chandelier     0.0000    0.0000    0.0000        27
             coffee_cup     0.0000    0.0000    0.0000        26
                compass     0.0000    0.0000    0.0000        10
               computer     0.2500    0.0435    0.0741        23
                    cow     0.0000    0.0000    0.0000        14
                   crab     0.9123    0.8525    0.8814       122
              crocodile     0.9280    0.8992    0.9134       129
            cruise_ship     0.7467    0.9032    0.8175       124
                    dog     0.8533    0.8911    0.8718       248
                dolphin     0.9091    0.8824    0.8955        68
                 dragon     0.7914    0.8269    0.8088       156
                  drums     0.9259    0.8772    0.9009       171
                   duck     0.8409    0.8409    0.8409       220
               dumbbell     0.9507    0.9184    0.9343       147
               elephant     0.9630    0.9765    0.9697       213
             eyeglasses     0.8155    0.7919    0.8035       173
                feather     0.9344    0.9344    0.9344       244
                  fence     0.8796    0.8482    0.8636       112
                   fish     0.9527    0.9495    0.9511       297
               flamingo     0.9818    0.9474    0.9643       114
                 flower     0.8267    0.9219    0.8717       269
                   foot     0.7743    0.8578    0.8140       204
                   fork     0.9366    0.9433    0.9399       141
                   frog     0.9620    0.9383    0.9500       162
                giraffe     0.9655    0.9396    0.9524       149
                 goatee     0.7914    0.8897    0.8377       145
                 grapes     0.9132    0.9609    0.9364       230
                 guitar     0.8462    0.9862    0.9108       145
                 hammer     0.8333    0.4386    0.5747        57
             helicopter     0.9441    0.9620    0.9530       158
                 helmet     0.8509    0.8204    0.8354       167
                  horse     0.9091    0.9877    0.9467        81
               kangaroo     0.9592    0.9691    0.9641        97
                lantern     0.0000    0.0000    0.0000        30
                 laptop     0.8273    0.9200    0.8712       250
                   leaf     0.8449    0.8870    0.8655       301
                   lion     0.9697    0.9734    0.9715       263
               lipstick     0.9634    0.8977    0.9294        88
                lobster     0.9265    0.9130    0.9197       138
             microphone     0.8917    0.8770    0.8843       122
                 monkey     0.9297    0.8947    0.9119       133
               mosquito     0.9052    0.9211    0.9130       114
                  mouse     0.8632    0.8039    0.8325       102
                    mug     0.6928    0.7737    0.7310       137
               mushroom     0.8174    0.8861    0.8504       202
                  onion     0.9538    0.9841    0.9688       126
                  panda     0.9643    0.8710    0.9153        62
                 peanut     0.8302    0.8462    0.8381       104
                   pear     0.7966    0.9658    0.8731       146
                   peas     0.6667    0.8438    0.7448        64
                 pencil     0.0000    0.0000    0.0000        21
                penguin     0.9586    0.9701    0.9643       167
                    pig     0.8983    0.8785    0.8883       181
                 pillow     0.9570    0.9674    0.9622        92
              pineapple     0.9808    0.9714    0.9761       105
                 potato     0.9444    0.5231    0.6733        65
           power_outlet     0.5556    0.0676    0.1205        74
                  purse     0.9220    0.7182    0.8075       181
                 rabbit     0.9697    0.8767    0.9209        73
                raccoon     0.7850    0.9097    0.8428       277
             rhinoceros     0.9863    0.9863    0.9863       146
                  rifle     0.9143    0.9796    0.9458        98
              saxophone     0.9381    0.8618    0.8983       246
            screwdriver     0.7709    0.8706    0.8177       286
             sea_turtle     0.9698    0.9507    0.9602       203
                see_saw     0.3296    0.5738    0.4187       413
                  sheep     0.9254    0.9153    0.9203       366
                   shoe     0.9395    0.9688    0.9539       513
             skateboard     0.7365    0.7831    0.7591       332
                  snake     0.8005    0.8737    0.8355       372
              speedboat     0.8388    0.8833    0.8605       377
                 spider     0.7954    0.8696    0.8309       514
               squirrel     0.8511    0.8484    0.8498       310
             strawberry     0.8313    0.8471    0.8391       157
            streetlight     0.7944    0.8134    0.8038       209
            string_bean     0.7143    0.3000    0.4225        50
              submarine     0.5916    0.6975    0.6402       162
                   swan     0.8966    0.8387    0.8667       186
                  table     0.6705    0.7522    0.7090       230
                 teapot     0.8464    0.8968    0.8709       252
             teddy-bear     0.6818    0.8385    0.7521       161
             television     0.8974    0.7071    0.7910        99
       the_Eiffel_Tower     0.9860    0.9679    0.9769       218
the_Great_Wall_of_China     0.6389    0.8440    0.7273       109
                  tiger     0.9417    0.9604    0.9510       303
                    toe     0.0000    0.0000    0.0000        53
                  train     0.8650    0.9010    0.8827       192
                  truck     0.8136    0.9372    0.8710       191
               umbrella     0.8650    0.8913    0.8779       230
                   vase     0.8082    0.8082    0.8082       146
             watermelon     0.8947    0.8333    0.8629       102
                  whale     0.8910    0.8744    0.8826       215
                  zebra     0.9817    0.9727    0.9772       220

               accuracy                         0.8440     19317
              macro avg     0.7818    0.7419    0.7475     19317
           weighted avg     0.8404    0.8440    0.8352     19317
```
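
Each F1 value in the report is the harmonic mean of that row's precision and recall. As a quick sanity check, the sketch below recomputes three rows from the table. The helper name `f1_from_pr` is illustrative and not part of any library.

```python
def f1_from_pr(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall) pairs copied from the classification report above
rows = {
    "aircraft_carrier": (1.0000, 0.2200),
    "anvil": (0.2727, 0.0423),
    "cake": (0.0000, 0.0000),
}

for name, (p, r) in rows.items():
    print(f"{name}: {f1_from_pr(p, r):.4f}")
```

Rows with all-zero scores, such as `cake` and `carrot`, had no correct predictions in this evaluation. The macro average gives every class equal weight regardless of support, so these rows pull the macro F1 (0.7475) below the weighted F1 (0.8352).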

The model categorizes images into the following 126 classes:
- **Class 0:** "aircraft_carrier"
- **Class 1:** "alarm_clock"
- **Class 2:** "ant"
- **Class 3:** "anvil"
- **Class 4:** "asparagus"
- **Class 5:** "axe"
- **Class 6:** "banana"
- **Class 7:** "basket"
- **Class 8:** "bathtub"
- **Class 9:** "bear"
- **Class 10:** "bee"
- **Class 11:** "bird"
- **Class 12:** "blackberry"
- **Class 13:** "blueberry"
- **Class 14:** "bottlecap"
- **Class 15:** "broccoli"
- **Class 16:** "bus"
- **Class 17:** "butterfly"
- **Class 18:** "cactus"
- **Class 19:** "cake"
- **Class 20:** "calculator"
- **Class 21:** "camel"
- **Class 22:** "camera"
- **Class 23:** "candle"
- **Class 24:** "cannon"
- **Class 25:** "canoe"
- **Class 26:** "carrot"
- **Class 27:** "castle"
- **Class 28:** "cat"
- **Class 29:** "ceiling_fan"
- **Class 30:** "cell_phone"
- **Class 31:** "cello"
- **Class 32:** "chair"
- **Class 33:** "chandelier"
- **Class 34:** "coffee_cup"
- **Class 35:** "compass"
- **Class 36:** "computer"
- **Class 37:** "cow"
- **Class 38:** "crab"
- **Class 39:** "crocodile"
- **Class 40:** "cruise_ship"
- **Class 41:** "dog"
- **Class 42:** "dolphin"
- **Class 43:** "dragon"
- **Class 44:** "drums"
- **Class 45:** "duck"
- **Class 46:** "dumbbell"
- **Class 47:** "elephant"
- **Class 48:** "eyeglasses"
- **Class 49:** "feather"
- **Class 50:** "fence"
- **Class 51:** "fish"
- **Class 52:** "flamingo"
- **Class 53:** "flower"
- **Class 54:** "foot"
- **Class 55:** "fork"
- **Class 56:** "frog"
- **Class 57:** "giraffe"
- **Class 58:** "goatee"
- **Class 59:** "grapes"
- **Class 60:** "guitar"
- **Class 61:** "hammer"
- **Class 62:** "helicopter"
- **Class 63:** "helmet"
- **Class 64:** "horse"
- **Class 65:** "kangaroo"
- **Class 66:** "lantern"
- **Class 67:** "laptop"
- **Class 68:** "leaf"
- **Class 69:** "lion"
- **Class 70:** "lipstick"
- **Class 71:** "lobster"
- **Class 72:** "microphone"
- **Class 73:** "monkey"
- **Class 74:** "mosquito"
- **Class 75:** "mouse"
- **Class 76:** "mug"
- **Class 77:** "mushroom"
- **Class 78:** "onion"
- **Class 79:** "panda"
- **Class 80:** "peanut"
- **Class 81:** "pear"
- **Class 82:** "peas"
- **Class 83:** "pencil"
- **Class 84:** "penguin"
- **Class 85:** "pig"
- **Class 86:** "pillow"
- **Class 87:** "pineapple"
- **Class 88:** "potato"
- **Class 89:** "power_outlet"
- **Class 90:** "purse"
- **Class 91:** "rabbit"
- **Class 92:** "raccoon"
- **Class 93:** "rhinoceros"
- **Class 94:** "rifle"
- **Class 95:** "saxophone"
- **Class 96:** "screwdriver"
- **Class 97:** "sea_turtle"
- **Class 98:** "see_saw"
- **Class 99:** "sheep"
- **Class 100:** "shoe"
- **Class 101:** "skateboard"
- **Class 102:** "snake"
- **Class 103:** "speedboat"
- **Class 104:** "spider"
- **Class 105:** "squirrel"
- **Class 106:** "strawberry"
- **Class 107:** "streetlight"
- **Class 108:** "string_bean"
- **Class 109:** "submarine"
- **Class 110:** "swan"
- **Class 111:** "table"
- **Class 112:** "teapot"
- **Class 113:** "teddy-bear"
- **Class 114:** "television"
- **Class 115:** "the_Eiffel_Tower"
- **Class 116:** "the_Great_Wall_of_China"
- **Class 117:** "tiger"
- **Class 118:** "toe"
- **Class 119:** "train"
- **Class 120:** "truck"
- **Class 121:** "umbrella"
- **Class 122:** "vase"
- **Class 123:** "watermelon"
- **Class 124:** "whale"
- **Class 125:** "zebra"

# **Run with Transformers🤗**

```python
!pip install -q transformers torch pillow gradio
```

```python
import gradio as gr
from transformers import AutoImageProcessor, SiglipForImageClassification
from PIL import Image
import torch

# Load model and processor
model_name = "prithivMLmods/Sketch-126-DomainNet"
model = SiglipForImageClassification.from_pretrained(model_name)
processor = AutoImageProcessor.from_pretrained(model_name)

def sketch_classification(image):
    """Predicts the sketch category for an input image."""
    # Convert the input numpy array to a PIL Image and ensure it has 3 channels (RGB)
    image = Image.fromarray(image).convert("RGB")
    
    # Process the image and prepare it for the model
    inputs = processor(images=image, return_tensors="pt")
    
    # Perform inference without gradient calculation
    with torch.no_grad():
        outputs = model(**inputs)
        logits = outputs.logits
        # Convert logits to probabilities using softmax
        probs = torch.nn.functional.softmax(logits, dim=1).squeeze().tolist()
    
    # Mapping from indices to corresponding sketch category labels
    labels = {
        "0": "aircraft_carrier", "1": "alarm_clock", "2": "ant", "3": "anvil", "4": "asparagus",
        "5": "axe", "6": "banana", "7": "basket", "8": "bathtub", "9": "bear",
        "10": "bee", "11": "bird", "12": "blackberry", "13": "blueberry", "14": "bottlecap",
        "15": "broccoli", "16": "bus", "17": "butterfly", "18": "cactus", "19": "cake",
        "20": "calculator", "21": "camel", "22": "camera", "23": "candle", "24": "cannon",
        "25": "canoe", "26": "carrot", "27": "castle", "28": "cat", "29": "ceiling_fan",
        "30": "cell_phone", "31": "cello", "32": "chair", "33": "chandelier", "34": "coffee_cup",
        "35": "compass", "36": "computer", "37": "cow", "38": "crab", "39": "crocodile",
        "40": "cruise_ship", "41": "dog", "42": "dolphin", "43": "dragon", "44": "drums",
        "45": "duck", "46": "dumbbell", "47": "elephant", "48": "eyeglasses", "49": "feather",
        "50": "fence", "51": "fish", "52": "flamingo", "53": "flower", "54": "foot",
        "55": "fork", "56": "frog", "57": "giraffe", "58": "goatee", "59": "grapes",
        "60": "guitar", "61": "hammer", "62": "helicopter", "63": "helmet", "64": "horse",
        "65": "kangaroo", "66": "lantern", "67": "laptop", "68": "leaf", "69": "lion",
        "70": "lipstick", "71": "lobster", "72": "microphone", "73": "monkey", "74": "mosquito",
        "75": "mouse", "76": "mug", "77": "mushroom", "78": "onion", "79": "panda",
        "80": "peanut", "81": "pear", "82": "peas", "83": "pencil", "84": "penguin",
        "85": "pig", "86": "pillow", "87": "pineapple", "88": "potato", "89": "power_outlet",
        "90": "purse", "91": "rabbit", "92": "raccoon", "93": "rhinoceros", "94": "rifle",
        "95": "saxophone", "96": "screwdriver", "97": "sea_turtle", "98": "see_saw", "99": "sheep",
        "100": "shoe", "101": "skateboard", "102": "snake", "103": "speedboat", "104": "spider",
        "105": "squirrel", "106": "strawberry", "107": "streetlight", "108": "string_bean",
        "109": "submarine", "110": "swan", "111": "table", "112": "teapot", "113": "teddy-bear",
        "114": "television", "115": "the_Eiffel_Tower", "116": "the_Great_Wall_of_China",
        "117": "tiger", "118": "toe", "119": "train", "120": "truck", "121": "umbrella",
        "122": "vase", "123": "watermelon", "124": "whale", "125": "zebra"
    }
    
    # Create a dictionary mapping each label to its predicted probability (rounded)
    predictions = {labels[str(i)]: round(probs[i], 3) for i in range(len(probs))}
    return predictions

# Create Gradio interface
iface = gr.Interface(
    fn=sketch_classification,
    inputs=gr.Image(type="numpy"),
    outputs=gr.Label(label="Prediction Scores"),
    title="Sketch-126-DomainNet Classification",
    description="Upload a sketch to classify it into one of 126 categories."
)

# Launch the app
if __name__ == "__main__":
    iface.launch()
```
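
`sketch_classification` returns a score for all 126 labels, which is more than most callers outside the Gradio UI need. The sketch below keeps only the highest-scoring entries. The helper `top_k` and the example scores are hypothetical, for illustration only; they are not real model output.

```python
def top_k(predictions: dict[str, float], k: int = 5) -> list[tuple[str, float]]:
    """Return the k highest-scoring (label, probability) pairs, best first."""
    ranked = sorted(predictions.items(), key=lambda item: item[1], reverse=True)
    return ranked[:k]


# Hypothetical scores shaped like the dictionary returned by sketch_classification
example = {"zebra": 0.912, "horse": 0.051, "tiger": 0.020, "cow": 0.004}

print(top_k(example, k=2))
```

Inside the Gradio app itself, the `gr.Label` component accepts a `num_top_classes` argument that limits how many classes are displayed.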

---

# **Intended Use:**

The **Sketch-126-DomainNet** model is designed for sketch image classification. It categorizes sketches into 126 classes, ranging from objects such as "aircraft_carrier" or "alarm_clock" to animals, plants, and everyday items. Potential use cases include:

- **Art and Design Applications:** Assisting artists and designers in organizing and retrieving sketches based on content.
- **Creative Search Engines:** Enabling sketch-based search for design inspiration.
- **Educational Tools:** Helping students and educators in art and design fields with categorization and retrieval of visual resources.
- **Computer Vision Research:** Providing a baseline model for sketch recognition and domain adaptation experiments.
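
For the organizing and retrieval use cases above, one possible approach is to group sketch files by their top predicted label and set aside low-confidence predictions for manual review. The following is a minimal sketch under stated assumptions: the function `build_label_index`, the `"unsorted"` bucket, the 0.5 threshold, and the file names and scores are all hypothetical.

```python
from collections import defaultdict


def build_label_index(
    predictions_by_file: dict[str, dict[str, float]],
    min_confidence: float = 0.5,
) -> dict[str, list[str]]:
    """Group file names by top predicted label; low-confidence files go to 'unsorted'."""
    index: dict[str, list[str]] = defaultdict(list)
    for filename, scores in predictions_by_file.items():
        # Pick the label with the highest probability for this file
        label, score = max(scores.items(), key=lambda item: item[1])
        bucket = label if score >= min_confidence else "unsorted"
        index[bucket].append(filename)
    return dict(index)


# Hypothetical per-file scores (not real model output)
predictions_by_file = {
    "sketch_001.png": {"zebra": 0.91, "horse": 0.06},
    "sketch_002.png": {"teapot": 0.48, "mug": 0.45},
    "sketch_003.png": {"zebra": 0.77, "tiger": 0.15},
}

print(build_label_index(predictions_by_file))
```

Given the per-class results reported earlier, a threshold like this may matter most for classes with low recall, where the top prediction is less reliable.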