title: Computer Vision Classification Model Comparison
emoji: 📊
colorFrom: purple
colorTo: gray
sdk: gradio
sdk_version: 6.11.0
app_file: app.py
pinned: false
short_description: 'Block 2 '

CIFAR-10 Image Classification — Model Comparison

This app compares three image classification approaches on CIFAR-10 images:

Fine-tuned ViT model (adisaljusi/cifar10-vit)
Zero-shot CLIP (openai/clip-vit-large-patch14)
OpenAI vision model (gpt-4.1-mini)
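
As an illustration of the zero-shot approach, the sketch below scores an image against the ten CIFAR-10 class names using the `transformers` zero-shot image classification pipeline. This is not taken from the app's `app.py`; the function names `classify_with_clip` and `top1` are hypothetical, and the pipeline's default prompt template is assumed.

```python
# The ten CIFAR-10 class names, used as candidate labels for CLIP.
CIFAR10_LABELS = [
    "airplane", "automobile", "bird", "cat", "deer",
    "dog", "frog", "horse", "ship", "truck",
]


def classify_with_clip(image_path, labels=CIFAR10_LABELS):
    """Return zero-shot predictions as a list of {"label", "score"} dicts.

    Hypothetical helper: the app may call CLIP differently.
    """
    # Imported lazily so this module loads even without transformers installed.
    from transformers import pipeline

    classifier = pipeline(
        "zero-shot-image-classification",
        model="openai/clip-vit-large-patch14",
    )
    # The pipeline compares the image with one text prompt per label.
    return classifier(image_path, candidate_labels=list(labels))


def top1(predictions):
    """Pick the highest-scoring prediction from a list of label/score dicts."""
    return max(predictions, key=lambda p: p["score"])
```

Calling `top1(classify_with_clip("cat.jpg"))` would return a single label/score pair; no training on CIFAR-10 is involved, which is what makes this approach zero-shot.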

Dataset Used For Training

Hugging Face dataset loader: load_dataset("uoft-cs/cifar10")
Dataset reference: https://huggingface.co/datasets/uoft-cs/cifar10
Number of classes: 10 (airplane, automobile, bird, cat, deer, dog, frog, horse, ship, truck)
Training subset: 8,000 images (from 50,000 total)
Test subset: 2,000 images (from 10,000 total)
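
A minimal sketch of how subsets of this size could be drawn with the `datasets` library. The README does not say how the 8,000 and 2,000 images were selected, so the shuffle, the seed, and the function name `load_cifar10_subsets` are assumptions.

```python
TRAIN_SUBSET = 8_000  # of 50,000 training images
TEST_SUBSET = 2_000   # of 10,000 test images


def load_cifar10_subsets(seed=42):
    """Load CIFAR-10 and return (train_subset, test_subset).

    Assumption: a seeded random sample. The original selection method
    is not documented.
    """
    # Imported lazily so this module loads even without datasets installed.
    from datasets import load_dataset

    dataset = load_dataset("uoft-cs/cifar10")
    train = dataset["train"].shuffle(seed=seed).select(range(TRAIN_SUBSET))
    test = dataset["test"].shuffle(seed=seed).select(range(TEST_SUBSET))
    return train, test
```

Shuffling before `select` avoids taking only the first rows of each split, which matters if the split happens to be ordered by class.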

Preprocessing

Resize from 32x32 to 224x224 (ViT input size)
Normalize pixel values with mean=0.5, std=0.5 per channel
Convert all images to RGB

These steps are applied with the AutoImageProcessor loaded from google/vit-base-patch16-224.
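
The arithmetic behind these steps can be checked by hand. The sketch below assumes the processor first rescales 8-bit pixel values to [0, 1] and then normalizes, so the result lies in [-1, 1]; the patch size of 16 is read from the model name. The helper names are hypothetical.

```python
def normalize_pixel(value, mean=0.5, std=0.5):
    """Map an 8-bit pixel value (0..255) to the normalized range [-1, 1]."""
    rescaled = value / 255.0          # assumed rescale step: 0..255 -> 0..1
    return (rescaled - mean) / std    # normalize with mean=0.5, std=0.5


def patches_per_image(image_size=224, patch_size=16):
    """Number of non-overlapping square patches ViT cuts from one image."""
    per_side = image_size // patch_size
    return per_side * per_side


print(normalize_pixel(0), normalize_pixel(255))  # -1.0 1.0
print(patches_per_image())                       # 196
```

Upscaling from 32x32 to 224x224 adds no information to the image; it only matches the input resolution the pretrained ViT expects.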

Trained Model

Hugging Face model link: https://huggingface.co/adisaljusi/cifar10-vit
Base model: google/vit-base-patch16-224
Transfer learning: all layers frozen except the classification head (7,690 of 85.8M parameters trainable)
Training config: 4 epochs, batch size 32, learning rate 2e-4, warmup ratio 0.1, weight decay 0.01, AdamW optimizer
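
The trainable parameter count and the length of the warmup phase follow from the numbers above. The sketch below reproduces them and shows one way to freeze everything except the head. It assumes a hidden size of 768 for ViT-Base, a single device with no gradient accumulation, and the `vit` backbone attribute of `ViTForImageClassification`; the function names are hypothetical and the original training script may differ.

```python
import math

HIDDEN_SIZE = 768  # assumed width of the ViT-Base embedding fed to the head
NUM_CLASSES = 10


def head_parameter_count(hidden_size=HIDDEN_SIZE, num_classes=NUM_CLASSES):
    """Weights plus biases of a single linear classification layer."""
    return hidden_size * num_classes + num_classes


def training_steps(num_examples=8_000, batch_size=32, epochs=4, warmup_ratio=0.1):
    """Return (total optimizer steps, approximate warmup steps)."""
    steps_per_epoch = math.ceil(num_examples / batch_size)
    total = steps_per_epoch * epochs
    return total, round(total * warmup_ratio)


def build_frozen_vit(num_classes=NUM_CLASSES):
    """Load the base model with a fresh head and freeze the backbone."""
    # Imported lazily so this module loads even without transformers installed.
    from transformers import ViTForImageClassification

    model = ViTForImageClassification.from_pretrained(
        "google/vit-base-patch16-224",
        num_labels=num_classes,
        ignore_mismatched_sizes=True,  # swap the 1000-class ImageNet head
    )
    for param in model.vit.parameters():
        param.requires_grad = False  # only model.classifier stays trainable
    return model


print(head_parameter_count())  # 7690
print(training_steps())        # (1000, 100)
```

With only 7,690 trainable parameters, each epoch updates a single linear layer on top of fixed features, which is why a relatively high learning rate such as 2e-4 is workable here.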

Training Performance

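| Epoch | Training Loss | Validation Loss | Accuracy |
|-------|---------------|-----------------|----------|
| 1 | 0.2316 | 0.2161 | 94.95% |
| 2 | 0.1551 | 0.1516 | 95.65% |
| 3 | 0.1230 | 0.1390 | 95.80% |
| 4 | 0.1097 | 0.1363 | 95.95% |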

Example Image Results

| Image | True Class | ViT Top-1 (score) | CLIP Top-1 (score) | OpenAI LLM (label, confidence) |
|-------|------------|-------------------|--------------------|--------------------------------|
| `airplane.jpg` | `airplane` | `airplane` (0.675) | `airplane` (0.900) | `bird` (0.75) |
| `automobile.jpg` | `automobile` | `automobile` (0.656) | `automobile` (0.952) | `automobile` (0.85) |
| `cat.jpg` | `cat` | `cat` (0.954) | `cat` (0.536) | `cat` (0.85) |
| `dog.jpg` | `dog` | `dog` (0.988) | `dog` (0.936) | `dog` (0.85) |
| `horse.jpg` | `horse` | `horse` (0.998) | `horse` (0.990) | `horse` (0.95) |
| `ship.jpg` | `ship` | `ship` (0.989) | `ship` (0.996) | `ship` (0.95) |
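
The six example rows can be tallied per model as follows. The labels are copied from the table above; the helper name `accuracy_by_model` is hypothetical, and six images are far too few to rank the models.

```python
# (true label, ViT top-1, CLIP top-1, OpenAI label), one tuple per example image
RESULTS = [
    ("airplane", "airplane", "airplane", "bird"),
    ("automobile", "automobile", "automobile", "automobile"),
    ("cat", "cat", "cat", "cat"),
    ("dog", "dog", "dog", "dog"),
    ("horse", "horse", "horse", "horse"),
    ("ship", "ship", "ship", "ship"),
]

MODEL_NAMES = ("ViT", "CLIP", "OpenAI")


def accuracy_by_model(rows):
    """Fraction of rows where each model's label matches the true label."""
    correct = {name: 0 for name in MODEL_NAMES}
    for true_label, *predictions in rows:
        for name, predicted in zip(MODEL_NAMES, predictions):
            correct[name] += predicted == true_label
    return {name: correct[name] / len(rows) for name in MODEL_NAMES}


for name, accuracy in accuracy_by_model(RESULTS).items():
    print(f"{name}: {accuracy:.3f}")
```

On these examples ViT and CLIP are correct on all six images, while the OpenAI model is correct on five, having labeled `airplane.jpg` as `bird`.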