prithivMLmods committed on
Commit 4ef361e · verified · 1 Parent(s): 7ba5af8

Update README.md

Files changed (1)
  1. README.md +102 -0
README.md CHANGED
@@ -4,6 +4,14 @@ datasets:
4
  - frgfm/imagenette
5
  ---
6
 
7
  ```py
8
  Classification Report:
9
  precision recall f1-score support
@@ -25,3 +33,97 @@ english springer 0.9843 0.9822 0.9832 955
25
  ```
26
 
27
  ![download.png](https://cdn-uploads.huggingface.co/production/uploads/65bb837dbfb878f46c77de4c/74PN9tMCvZIfg_qegVOa9.png)
4
  - frgfm/imagenette
5
  ---
6
 
7
+ ![3.png](https://cdn-uploads.huggingface.co/production/uploads/65bb837dbfb878f46c77de4c/XKcsM33R3XKl5JBBfQHNM.png)
8
+
9
+ # IMAGENETTE
10
+
11
+ > IMAGENETTE is a vision-language encoder model fine-tuned from google/siglip2-base-patch16-224 for multi-class image classification. It is trained to classify images into 10 categories from the popular Imagenette dataset using the SiglipForImageClassification architecture.
12
+
13
+
14
+
15
  ```py
16
  Classification Report:
17
  precision recall f1-score support
 
33
  ```
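
The F1 column in the report above is the harmonic mean of precision and recall. As a quick sanity check, the sketch below recomputes it for the `english springer` row (precision 0.9843, recall 0.9822). The helper name `f1_score` is an illustrative choice, not part of the model card. Because the inputs are already rounded to four decimals, the result should agree with the reported value only to about that precision.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values copied from the `english springer` row of the report.
f1 = f1_score(0.9843, 0.9822)
print(f"{f1:.4f}")
```

The same function can be applied to any other row of the report to confirm that its columns are mutually consistent.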
34
 
35
  ![download.png](https://cdn-uploads.huggingface.co/production/uploads/65bb837dbfb878f46c77de4c/74PN9tMCvZIfg_qegVOa9.png)
36
+
37
+ ---
38
+
39
+ ## Label Space: 10 Classes
40
+
41
+ The model predicts one of the following image classes:
42
+
43
+ ```
44
+ 0: tench
45
+ 1: english springer
46
+ 2: cassette player
47
+ 3: chain saw
48
+ 4: church
49
+ 5: French horn
50
+ 6: garbage truck
51
+ 7: gas pump
52
+ 8: golf ball
53
+ 9: parachute
54
+ ```
55
+
56
+ ---
57
+
58
+ ## Install Dependencies
59
+
60
+ ```bash
61
+ pip install -q transformers torch pillow gradio hf_xet
62
+ ```
63
+
64
+ ---
65
+
66
+ ## Inference Code
67
+
68
+ ```python
69
+ import gradio as gr
70
+ from transformers import AutoImageProcessor, SiglipForImageClassification
71
+ from PIL import Image
72
+ import torch
73
+
74
+ # Load model and processor
75
+ model_name = "prithivMLmods/IMAGENETTE"
76
+ model = SiglipForImageClassification.from_pretrained(model_name)
77
+ processor = AutoImageProcessor.from_pretrained(model_name)
78
+
79
+ # Label mapping
80
+ id2label = {
80
+     "0": "tench",
81
+     "1": "english springer",
82
+     "2": "cassette player",
83
+     "3": "chain saw",
84
+     "4": "church",
85
+     "5": "French horn",
86
+     "6": "garbage truck",
87
+     "7": "gas pump",
88
+     "8": "golf ball",
89
+     "9": "parachute"
90
+ }
92
+
93
+ def classify_image(image):
94
+     image = Image.fromarray(image).convert("RGB")
95
+     inputs = processor(images=image, return_tensors="pt")
96
+
97
+     with torch.no_grad():
98
+         outputs = model(**inputs)
99
+         logits = outputs.logits
100
+         probs = torch.nn.functional.softmax(logits, dim=1).squeeze().tolist()
101
+
102
+     prediction = {
103
+         id2label[str(i)]: round(probs[i], 3) for i in range(len(probs))
104
+     }
105
+
106
+     return prediction
107
+
108
+ # Gradio Interface
109
+ iface = gr.Interface(
110
+     fn=classify_image,
110
+     inputs=gr.Image(type="numpy"),
111
+     outputs=gr.Label(num_top_classes=3, label="Image Classification"),
112
+     title="IMAGENETTE - SigLIP2 Classifier",
113
+     description="Upload an image to classify it into one of 10 categories from the Imagenette dataset."
115
+ )
116
+
117
+ if __name__ == "__main__":
118
+     iface.launch()
119
+ ```
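
The inference code turns the model's logits into per-class probabilities with a softmax, and the Gradio `Label` component then displays the three highest. For readers who want that post-processing step without Gradio or PyTorch, here is a minimal pure-Python sketch. The helper names `softmax` and `top_k_labels` and the example logits are made up for illustration; only the label order comes from the README.

```python
import math

# Label order copied from the README's label list.
ID2LABEL = {
    0: "tench", 1: "english springer", 2: "cassette player", 3: "chain saw",
    4: "church", 5: "French horn", 6: "garbage truck", 7: "gas pump",
    8: "golf ball", 9: "parachute",
}


def softmax(logits):
    """Numerically stable softmax over a flat list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_k_labels(logits, k=3):
    """Return the k most probable (label, probability) pairs, best first."""
    probs = softmax(logits)
    ranked = sorted(enumerate(probs), key=lambda pair: pair[1], reverse=True)
    return [(ID2LABEL[i], round(p, 3)) for i, p in ranked[:k]]


# Hypothetical logits for illustration; real values would come from
# outputs.logits.squeeze().tolist() in the inference code.
example_logits = [0.1, 4.0, 0.3, 0.2, 0.0, 1.5, 0.1, 0.2, 0.3, 2.5]
print(top_k_labels(example_logits))
```

Subtracting the largest logit before exponentiating leaves the result unchanged mathematically and avoids overflow for large logit values.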
120
+
121
+ ---
122
+
123
+ ## Intended Use
124
+
125
+ IMAGENETTE is designed for:
126
+
127
+ * Educational purposes and model benchmarking.
128
+ * Demonstrating the performance of SigLIP2 on a small but diverse classification task.
129
+ * Fine-tuning workflows on vision-language models.