---
title: Computer Vision Classification Model Comparison
emoji: "\U0001F4CA"
colorFrom: purple
colorTo: gray
sdk: gradio
sdk_version: 6.11.0
app_file: app.py
pinned: false
short_description: 'Block 2 '
---
# CIFAR-10 Image Classification — Model Comparison
This app compares three image classification approaches on CIFAR-10 images:
- Fine-tuned ViT model ([`adisaljusi/cifar10-vit`](https://huggingface.co/adisaljusi/cifar10-vit))
- Zero-shot CLIP (`openai/clip-vit-large-patch14`)
- OpenAI vision model (`gpt-4.1-mini`)
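
For the zero-shot approach, CLIP compares the image embedding with the embedding of one text prompt per label and applies a softmax over the resulting similarity logits. The sketch below illustrates only that final step in plain Python. The prompt template and the logit values are hypothetical and are not taken from the app.

```python
import math

CLASSES = ["airplane", "automobile", "bird", "cat", "deer",
           "dog", "frog", "horse", "ship", "truck"]

# Hypothetical prompt template; the app may phrase its prompts differently.
prompts = [f"a photo of a {name}" for name in CLASSES]


def softmax(logits):
    """Turn similarity logits into probabilities that sum to 1."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Made-up image-text similarity logits, one per prompt (illustration only).
example_logits = [18.2, 21.5, 17.9, 19.0, 16.4, 18.8, 15.7, 17.1, 20.3, 19.6]

probs = softmax(example_logits)
top_index = max(range(len(probs)), key=probs.__getitem__)
top_label, top_score = CLASSES[top_index], probs[top_index]
print(top_label, round(top_score, 3))
```

The top-1 score reported for CLIP is the largest of these probabilities, so it depends on how the competing prompts score as well as on the winning one.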
## Dataset Used For Training
- Hugging Face dataset loader: `load_dataset("uoft-cs/cifar10")`
- Dataset reference: https://huggingface.co/datasets/uoft-cs/cifar10
- Number of classes: `10` (airplane, automobile, bird, cat, deer, dog, frog, horse, ship, truck)
- Training subset: 8,000 images (from 50,000 total)
- Test subset: 2,000 images (from 10,000 total)
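
The README does not say how the subsets were drawn. One possible approach, sketched here as an assumption, is to sample a fixed number of indices with a seeded random generator so the split is reproducible. The helper name `subset_indices` and the seed value are hypothetical.

```python
import random


def subset_indices(total, size, seed=42):
    """Pick `size` distinct indices out of `total`, reproducibly."""
    rng = random.Random(seed)
    return sorted(rng.sample(range(total), size))


train_idx = subset_indices(50_000, 8_000)
test_idx = subset_indices(10_000, 2_000)

# With the `datasets` library installed, the indices could be applied as:
#   from datasets import load_dataset
#   ds = load_dataset("uoft-cs/cifar10")
#   train_subset = ds["train"].select(train_idx)
#   test_subset = ds["test"].select(test_idx)
```

Random sampling does not guarantee equal class counts; a stratified draw would be needed if balanced classes matter.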
## Preprocessing
The following steps are applied using `AutoImageProcessor` from `google/vit-base-patch16-224`:
- Resize from 32x32 to 224x224 (ViT input size)
- Normalize pixel values with mean=0.5, std=0.5 per channel
- Convert all images to RGB
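
To make the numbers above concrete, here is a small plain-Python sketch of what the normalization does to a single pixel value. It assumes the processor first rescales 8-bit values to the range 0 to 1 (the usual default for ViT image processors) before subtracting the mean and dividing by the standard deviation.

```python
MEAN, STD = 0.5, 0.5        # per-channel values from the list above
SRC_SIZE, DST_SIZE = 32, 224
PATCH_SIZE = 16             # from the model name "patch16"


def normalize(pixel):
    """Map an 8-bit pixel value (0..255) to the range -1..1."""
    scaled = pixel / 255.0  # assumed rescale step
    return (scaled - MEAN) / STD


upscale_factor = DST_SIZE // SRC_SIZE          # each side grows 7x
patches_per_side = DST_SIZE // PATCH_SIZE      # 14 patches along each side
num_patches = patches_per_side ** 2            # 196 patches per image

print(normalize(0), normalize(255))            # -1.0 1.0
```

Every CIFAR-10 pixel is stretched across a 7x7 area, so the resized input carries no more detail than the 32x32 original.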
## Trained Model
- Hugging Face model link: [https://huggingface.co/adisaljusi/cifar10-vit](https://huggingface.co/adisaljusi/cifar10-vit)
- Base model: [google/vit-base-patch16-224](https://huggingface.co/google/vit-base-patch16-224)
- Transfer learning: all layers frozen except the classification head (7,690 of 85.8M parameters trainable)
- Training config: 4 epochs, batch size 32, learning rate 2e-4, warmup ratio 0.1, weight decay 0.01, AdamW optimizer
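
The figures above can be checked with a little arithmetic. The sketch below reproduces the trainable parameter count of the classification head and derives the step counts implied by the training config. It assumes a hidden size of 768 (ViT-Base), a single device without gradient accumulation, and a linear warmup followed by linear decay; the scheduler shape is an assumption, since the README only states the warmup ratio.

```python
import math

HIDDEN_SIZE = 768   # assumed hidden width of ViT-Base
NUM_CLASSES = 10


def head_parameters(hidden, classes):
    """Weights plus biases of a single linear classification layer."""
    return hidden * classes + classes


trainable = head_parameters(HIDDEN_SIZE, NUM_CLASSES)   # 7690

TRAIN_IMAGES = 8_000
BATCH_SIZE = 32
EPOCHS = 4
WARMUP_RATIO = 0.1
PEAK_LR = 2e-4

steps_per_epoch = math.ceil(TRAIN_IMAGES / BATCH_SIZE)  # 250
total_steps = steps_per_epoch * EPOCHS                  # 1000
warmup_steps = math.ceil(total_steps * WARMUP_RATIO)    # 100


def learning_rate(step):
    """Linear warmup to PEAK_LR, then linear decay to zero (assumed shape)."""
    if step < warmup_steps:
        return PEAK_LR * step / warmup_steps
    remaining = max(0, total_steps - step)
    return PEAK_LR * remaining / (total_steps - warmup_steps)
```

Under these assumptions the learning rate peaks at step 100, which is less than half of the first epoch.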
## Training Performance
| Training Loss | Epoch | Validation Loss | Accuracy |
|--------------:|------:|----------------:|---------:|
| 0.2316 | 1 | 0.2161 | 94.95% |
| 0.1551 | 2 | 0.1516 | 95.65% |
| 0.1230 | 3 | 0.1390 | 95.80% |
| 0.1097 | 4 | 0.1363 | 95.95% |
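
The accuracy column is top-1 accuracy: the share of images whose highest-scoring class matches the true label. The sketch below shows that computation and, assuming validation ran on the 2,000-image test subset, converts the reported percentages back into counts of correct predictions.

```python
EVAL_IMAGES = 2_000  # assumption: the test subset was used for validation


def accuracy(predictions, labels):
    """Fraction of predictions that equal the true label."""
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


# Accuracy per epoch, copied from the table above (in percent).
reported = {1: 94.95, 2: 95.65, 3: 95.80, 4: 95.95}

implied_correct = {
    epoch: round(pct / 100 * EVAL_IMAGES) for epoch, pct in reported.items()
}
print(implied_correct)  # {1: 1899, 2: 1913, 3: 1916, 4: 1919}
```

Under that assumption, the gain from epoch 1 to epoch 4 corresponds to 20 additional correctly classified images.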
## Example Image Results
| Image | True Class | ViT Top-1 (score) | CLIP Top-1 (score) | OpenAI LLM (label, confidence) |
|---|---|---|---|---|
| `airplane.jpg` | `airplane` | `airplane` (0.675) | `airplane` (0.900) | `bird` (0.75) |
| `automobile.jpg` | `automobile` | `automobile` (0.656) | `automobile` (0.952) | `automobile` (0.85) |
| `cat.jpg` | `cat` | `cat` (0.954) | `cat` (0.536) | `cat` (0.85) |
| `dog.jpg` | `dog` | `dog` (0.988) | `dog` (0.936) | `dog` (0.85) |
| `horse.jpg` | `horse` | `horse` (0.998) | `horse` (0.990) | `horse` (0.95) |
| `ship.jpg` | `ship` | `ship` (0.989) | `ship` (0.996) | `ship` (0.95) |
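
A quick way to summarize the table is to count how often each model's top-1 label matches the true class. The sketch below does that with the six rows copied from the table; the variable names are illustrative only.

```python
# (true class, ViT top-1, CLIP top-1, OpenAI label), copied from the table above.
rows = [
    ("airplane", "airplane", "airplane", "bird"),
    ("automobile", "automobile", "automobile", "automobile"),
    ("cat", "cat", "cat", "cat"),
    ("dog", "dog", "dog", "dog"),
    ("horse", "horse", "horse", "horse"),
    ("ship", "ship", "ship", "ship"),
]

MODELS = ["ViT", "CLIP", "OpenAI"]

# Count matches against the true class for each model column.
correct = {
    name: sum(row[0] == row[i + 1] for row in rows)
    for i, name in enumerate(MODELS)
}
print(correct)  # {'ViT': 6, 'CLIP': 6, 'OpenAI': 5}
```

Six images are far too few to rank the models; the tally only restates the examples shown.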
## Links
- Model: [https://huggingface.co/adisaljusi/cifar10-vit](https://huggingface.co/adisaljusi/cifar10-vit)
- App: [https://huggingface.co/spaces/adisaljusi/computer-vision-classification-model-comparison](https://huggingface.co/spaces/adisaljusi/computer-vision-classification-model-comparison)