nenzilea committed on
Commit 40b07a6 · verified
1 Parent(s): d06d16b

Update readme

Files changed (1)
  1. readme_.md +14 -6
readme_.md CHANGED
@@ -2,13 +2,13 @@

  This app compares 3 image classification approaches on car images:

- - Fine-tuned ViT model ([`nenzilea/car-classification`](https://huggingface.co/nenzilea/car-classification))
  - Zero-shot CLIP (`openai/clip-vit-large-patch14`)
  - OpenAI vision model (GPT-4o image classification)

  ## Dataset Used For Training

- - Hugging Face dataset: `tanganke/stanford_cars`
  - The Stanford Cars dataset contains 196 fine-grained classes (car make/model/year combinations). We group them into 9 brand-level classes for a cleaner, more visually meaningful classification task.
  - Number of classes: `9`
  - Classes: `BMW`, `Dodge`, `Ferrari`, `Ford`, `Jeep`, `Lamborghini`, `Porsche`, `Rolls-Royce`, `Toyota`
@@ -24,7 +24,7 @@ This app compares 3 image classification approaches on car images:

  ## Trained Model

- - Hugging Face model link: [https://huggingface.co/nenzilea/car-classification](https://huggingface.co/nenzilea/car-classification)
  - Base model: `google/vit-base-patch16-224`
  - Only the final classification head was fine-tuned (all other layers frozen).
  - Trainable parameters: ~4,614 out of ~85.8M total.
@@ -41,7 +41,7 @@ This app compares 3 image classification approaches on car images:

  ## Hugging Face Space

- - App link: [https://huggingface.co/spaces/nenzilea/car-classification](https://huggingface.co/spaces/nenzilea/car-classification)

  ## Example Image Results

@@ -51,5 +51,13 @@ The table below reports the true class and Top-3 predictions for ViT, CLIP, and
  |---|---|---|---|---|
  | `Dodge.jpg` | `Dodge` | BMW (0.3564), Dodge (0.2218), Rolls-Royce (0.1807) | Dodge (0.9432), Jeep (0.0393), Lamborghini (0.0078) | Dodge (1.00) |
  | `Ferrari.jpg` | `Ferrari` | Ferrari (0.6007), Lamborghini (0.2946), Ford (0.0296) | Ferrari (0.9958), Lamborghini (0.0032), Ford (0.0004) | Ferrari (1.00) |
- | `BMW.jpg` | `BMW` | BMW (0.2737), Porsche (0.1800), Dodge (0.1630) | BMW (0.9969), Porsche (0.0014), Ferrari (0.0007) | BMW (1.00) |
- | `Porsche.jpg` | `Porsche` | BMW (0.5858), Dodge (0.2040), Toyota (0.0667) | Porsche (0.9887), Lamborghini (0.0047), Dodge (0.0022) | Porsche (1.00) |
 

  This app compares 3 image classification approaches on car images:

+ - Fine-tuned ViT model (`nenzilea/car-classification`)
  - Zero-shot CLIP (`openai/clip-vit-large-patch14`)
  - OpenAI vision model (GPT-4o image classification)

  ## Dataset Used For Training

+ - Hugging Face dataset: https://huggingface.co/datasets/tanganke/stanford_cars
  - The Stanford Cars dataset contains 196 fine-grained classes (car make/model/year combinations). We group them into 9 brand-level classes for a cleaner, more visually meaningful classification task.
  - Number of classes: `9`
  - Classes: `BMW`, `Dodge`, `Ferrari`, `Ford`, `Jeep`, `Lamborghini`, `Porsche`, `Rolls-Royce`, `Toyota`
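
The grouping of the 196 fine-grained Stanford Cars labels into 9 brand-level classes can be sketched as a match on the first word of the class name. This is a minimal illustration, not the code used to build the training set: the helper name `to_brand`, the sample labels, and the choice to return `None` for brands outside the nine are all assumptions.

```python
from typing import Optional

# The 9 brand-level classes listed in the README.
BRANDS = [
    "BMW", "Dodge", "Ferrari", "Ford", "Jeep",
    "Lamborghini", "Porsche", "Rolls-Royce", "Toyota",
]


def to_brand(fine_label: str) -> Optional[str]:
    """Map a make/model/year label to its brand (hypothetical helper).

    Returns None when the first word of the label is not one of the 9
    brands; how such classes were actually handled is not stated in the
    README, so dropping them is an assumption.
    """
    parts = fine_label.split()
    if not parts:
        return None
    make = parts[0]
    return make if make in BRANDS else None


if __name__ == "__main__":
    # Sample labels in the Stanford Cars "make model type year" style.
    for label in [
        "BMW M3 Coupe 2012",
        "Rolls-Royce Phantom Sedan 2012",
        "Audi TT RS Coupe 2012",
    ]:
        print(label, "->", to_brand(label))
```

Matching on the first word works here because every one of the nine brand names is a single token; a brand spelled with a space would need a prefix match instead.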
 

  ## Trained Model

+ - Hugging Face model link: https://huggingface.co/nenzilea/car-classification
  - Base model: `google/vit-base-patch16-224`
  - Only the final classification head was fine-tuned (all other layers frozen).
  - Trainable parameters: ~4,614 out of ~85.8M total.
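
As a sanity check on the trainable-parameter figure, the size of a single linear classification head can be computed by hand. The sketch assumes the head is one linear layer with bias on top of ViT-Base, whose hidden size is 768; `linear_head_params` is a hypothetical helper name, not part of any library.

```python
def linear_head_params(hidden_size: int, num_labels: int) -> int:
    """Parameter count of a linear head: one weight per (feature, label)
    pair plus one bias per label."""
    return hidden_size * num_labels + num_labels


# Hidden size of google/vit-base-patch16-224.
VIT_BASE_HIDDEN = 768

if __name__ == "__main__":
    for num_labels in (6, 9):
        count = linear_head_params(VIT_BASE_HIDDEN, num_labels)
        print(f"{num_labels} labels -> {count} trainable parameters")
```

Under these assumptions a 9-label head has 6,921 parameters, while 4,614 is exactly the size of a 6-label head, so the reported count may come from a run with a different number of labels.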
 

  ## Hugging Face Space

+ - App link: https://huggingface.co/spaces/nenzilea/car-classification

  ## Example Image Results

 
  |---|---|---|---|---|
  | `Dodge.jpg` | `Dodge` | BMW (0.3564), Dodge (0.2218), Rolls-Royce (0.1807) | Dodge (0.9432), Jeep (0.0393), Lamborghini (0.0078) | Dodge (1.00) |
  | `Ferrari.jpg` | `Ferrari` | Ferrari (0.6007), Lamborghini (0.2946), Ford (0.0296) | Ferrari (0.9958), Lamborghini (0.0032), Ford (0.0004) | Ferrari (1.00) |
+ | `BMW.jpg` | `BMW` | BMW (0.2737), Porsche (0.1800), Dodge (0.1630) | BMW (0.9969), Porsche (0.0014), Ferrari (0.0007) | BMW (0.95), Porsche (0.001), Dodge (0.001), Ferrari (0.001), Ford (0.001) |
+ | `Porsche.jpg` | `Porsche` | BMW (0.5858), Dodge (0.2040), Toyota (0.0667) | Porsche (0.9887), Lamborghini (0.0047), Dodge (0.0022) | Porsche (1.00) |
+
+ ## Model Comparison Summary
+
+ | Model | Approach | Strengths | Weaknesses |
+ |---|---|---|---|
+ | **Custom ViT** | Supervised fine-tuning on 9 car brands | High accuracy on known brands | Only classifies the 9 trained brands |
+ | **CLIP** | Zero-shot with brand name as text prompt | No training needed, flexible labels | Lower accuracy; may confuse visually similar brands |
+ | **OpenAI GPT-4o** | LLM vision with natural language prompt | Strong reasoning, handles unusual angles | API cost, latency, black-box |
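
The Top-3 cells in the example results table, such as `BMW (0.2737), Porsche (0.1800), Dodge (0.1630)`, can be produced from raw per-class scores roughly as follows. This is a sketch under assumptions: the helper names `softmax`, `top_k`, and `format_top_k` are hypothetical, and the app's actual post-processing is not shown in the README.

```python
import math
from typing import Dict, List, Tuple


def softmax(logits: Dict[str, float]) -> Dict[str, float]:
    """Turn raw class scores into probabilities that sum to 1."""
    peak = max(logits.values())  # subtract the max for numerical stability
    exps = {label: math.exp(score - peak) for label, score in logits.items()}
    total = sum(exps.values())
    return {label: value / total for label, value in exps.items()}


def top_k(probs: Dict[str, float], k: int = 3) -> List[Tuple[str, float]]:
    """Return the k most probable (label, probability) pairs, best first."""
    return sorted(probs.items(), key=lambda item: item[1], reverse=True)[:k]


def format_top_k(pairs: List[Tuple[str, float]]) -> str:
    """Render pairs in the table's 'Label (0.1234)' style."""
    return ", ".join(f"{label} ({prob:.4f})" for label, prob in pairs)


if __name__ == "__main__":
    # Illustrative logits only; these values are invented for the example.
    example_logits = {"BMW": 2.0, "Porsche": 1.5, "Dodge": 1.2, "Ford": 0.1}
    print(format_top_k(top_k(softmax(example_logits))))
```

Sorting the full probability dictionary is enough for 9 classes; a heap-based selection would only matter for much larger label sets.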