Nav772 committed on
Commit
0bade36
·
verified ·
1 Parent(s): cb28fc3

Upload README.md with huggingface_hub

  1. README.md +71 -6
README.md CHANGED
@@ -10,6 +10,19 @@ tags:
  - vision
  - food-classification
  - vit
  ---

  # Vision Transformer (ViT) Fine-tuned on Food101 Subset
@@ -31,6 +44,62 @@ This model is a fine-tuned version of `google/vit-base-patch16-224` for food ima
  - tacos
  - ramen

  ## Training Data

  - **Dataset**: Food101 (subset)
@@ -46,18 +115,14 @@ This model is a fine-tuned version of `google/vit-base-patch16-224` for food ima
  - **Learning rate**: 3e-5
  - **Image size**: 224x224
  - **Mixed precision**: FP16
-
- ## Evaluation Results
-
- - **Accuracy**: 98.04%

  ## Usage
  ```python
  from transformers import pipeline

  classifier = pipeline("image-classification", model="Nav772/vit-food-classifier")
-
- # From local file
  result = classifier("path/to/food/image.jpg")
  print(result)
  ```
 
  - vision
  - food-classification
  - vit
+ model-index:
+ - name: vit-food-classifier
+   results:
+   - task:
+       type: image-classification
+     dataset:
+       name: food101
+       type: food101
+       split: validation
+     metrics:
+     - name: Accuracy
+       type: accuracy
+       value: 0.9804
  ---

  # Vision Transformer (ViT) Fine-tuned on Food101 Subset
 
  - tacos
  - ramen

+ ## Evaluation Results
+
+ | Metric | Value |
+ |--------|-------|
+ | **Accuracy** | 98.04% |
+
+ ## Training Logs
+
+ | Epoch | Training Loss | Validation Loss | Accuracy |
+ |-------|---------------|-----------------|----------|
+ | 1 | 0.3254 | 0.1076 | 97.20% |
+ | 2 | 0.1216 | 0.0904 | 97.68% |
+ | 3 | 0.0361 | 0.0770 | 97.88% |
+ | 4 | 0.0118 | 0.0764 | 98.00% |
+ | 5 | 0.0084 | 0.0767 | **98.04%** |
+
+ **Training Summary:**
+ - Total steps: 1,175
+ - Final training loss: 0.2446
+ - Training runtime: 2,705 seconds (~45 minutes)
+ - Throughput: 13.86 samples/second
+
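The figures in the training summary can be cross-checked against each other with a little arithmetic. The sketch below is not part of the README. It assumes 5 epochs (one per row of the Training Logs table) and uses only the numbers reported in the diff; the variable names are illustrative.

```python
# Cross-check the reported training summary (values copied from the README diff).
total_steps = 1175      # "Total steps"
runtime_s = 2705        # "Training runtime", in seconds
throughput = 13.86      # "Throughput", in samples/second
epochs = 5              # assumption: one row per epoch in the Training Logs table

steps_per_epoch = total_steps / epochs          # optimizer steps per epoch
samples_seen = throughput * runtime_s           # total samples processed
samples_per_epoch = samples_seen / epochs       # approximate training-set size
samples_per_step = samples_seen / total_steps   # approximate effective batch size
runtime_min = runtime_s / 60                    # runtime in minutes

print(f"steps per epoch:   {steps_per_epoch:.0f}")
print(f"samples per epoch: {samples_per_epoch:.0f}")
print(f"samples per step:  {samples_per_step:.1f}")
print(f"runtime (minutes): {runtime_min:.1f}")
```

Under these assumptions the run processed roughly 7,500 samples per epoch at about 32 samples per step. That would fit 750 training images for each of the 10 classes, although the diff itself does not show the batch size or the training-set size.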
+ ### Reproduce Evaluation
+ ```python
+ from datasets import load_dataset
+ from transformers import pipeline
+ from tqdm import tqdm
+
+ # Load model
+ classifier = pipeline("image-classification", model="Nav772/vit-food-classifier", device=0)
+
+ # Load same test split
+ dataset = load_dataset("food101", split="validation")
+
+ # Filter to same 10 classes
+ selected_classes = ["pizza", "sushi", "hamburger", "ice_cream", "steak",
+                     "baklava", "cheesecake", "pancakes", "tacos", "ramen"]
+ class_names = dataset.features['label'].names
+ selected_indices = [class_names.index(c) for c in selected_classes]
+
+ filtered = dataset.filter(lambda x: x['label'] in selected_indices)
+
+ # Evaluate
+ correct = 0
+ total = 0
+
+ for example in tqdm(filtered):
+     pred = classifier(example['image'])[0]['label']
+     true_label = class_names[example['label']]
+     if pred == true_label:
+         correct += 1
+     total += 1
+
+ print(f"Accuracy: {correct/total:.4f} ({correct}/{total})")
+ ```
+
  ## Training Data

  - **Dataset**: Food101 (subset)
 
  - **Learning rate**: 3e-5
  - **Image size**: 224x224
  - **Mixed precision**: FP16
+ - **Warmup ratio**: 0.1
+ - **Weight decay**: 0.01
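To make the warmup ratio concrete: with the 1,175 total steps reported in the training summary, a ratio of 0.1 works out to roughly 118 warmup steps. The sketch below assumes linear warmup followed by linear decay to zero, and assumes the warmup step count is rounded up. The diff does not say which scheduler was actually used, and `lr_at_step` is a hypothetical helper written only for illustration.

```python
import math

BASE_LR = 3e-5        # from the README hyperparameters
WARMUP_RATIO = 0.1    # from the README hyperparameters
TOTAL_STEPS = 1175    # from the training summary

# Assumption: warmup steps = ratio * total steps, rounded up.
WARMUP_STEPS = math.ceil(TOTAL_STEPS * WARMUP_RATIO)


def lr_at_step(step: int) -> float:
    """Hypothetical helper: linear warmup, then linear decay to zero."""
    if step < WARMUP_STEPS:
        return BASE_LR * step / max(1, WARMUP_STEPS)
    remaining = max(0, TOTAL_STEPS - step)
    return BASE_LR * remaining / max(1, TOTAL_STEPS - WARMUP_STEPS)


# Print the learning rate at the start, mid-warmup, peak, and end of training.
for step in (0, WARMUP_STEPS // 2, WARMUP_STEPS, TOTAL_STEPS):
    print(step, f"{lr_at_step(step):.2e}")
```

If the run used a different schedule (cosine or constant, for example), only the decay branch would change; the warmup step count follows from the ratio and the total step count alone.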
 
 
  ## Usage
  ```python
  from transformers import pipeline

  classifier = pipeline("image-classification", model="Nav772/vit-food-classifier")
  result = classifier("path/to/food/image.jpg")
  print(result)
  ```
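The `image-classification` pipeline returns a list of dictionaries, each with a `label` and a `score`. As a minimal sketch of how the printed `result` might be consumed, the hypothetical helper below picks the highest-scoring label and applies a confidence cutoff. The `0.5` threshold and the sample scores are made up for illustration and are not outputs of this model.

```python
from typing import Dict, List, Optional, Union

Prediction = Dict[str, Union[str, float]]


def top_prediction(results: List[Prediction], min_score: float = 0.5) -> Optional[str]:
    """Return the highest-scoring label, or None if it falls below min_score."""
    if not results:
        return None
    best = max(results, key=lambda r: r["score"])
    return best["label"] if best["score"] >= min_score else None


# Illustrative data in the pipeline's output shape (not real model output).
sample = [
    {"label": "ramen", "score": 0.91},
    {"label": "tacos", "score": 0.05},
    {"label": "sushi", "score": 0.04},
]
print(top_prediction(sample))
```

In practice `sample` would be replaced by the `result` from the usage snippet. Returning `None` for low-confidence predictions is one way to handle images outside the 10 trained classes, though the right threshold would need to be chosen empirically.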