adisaljusi

Revise README for clarity and detailed model comparison

66fbd92 about 1 month ago

	---
	title: Computer Vision Classification Model Comparison
	emoji: "\U0001F4CA"
	colorFrom: purple
	colorTo: gray
	sdk: gradio
	sdk_version: 6.11.0
	app_file: app.py
	pinned: false
	short_description: 'Block 2 '
	---

	# CIFAR-10 Image Classification — Model Comparison

	This app compares three image classification approaches on CIFAR-10 images:

	- Fine-tuned ViT model ([`adisaljusi/cifar10-vit`](https://huggingface.co/adisaljusi/cifar10-vit))
	- Zero-shot CLIP (`openai/clip-vit-large-patch14`)
	- OpenAI vision model (`gpt-4.1-mini`)
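The first two approaches can be sketched with the `transformers` pipeline API. This is a minimal sketch and not necessarily how `app.py` is written: the helper names `build_classifiers` and `top1` are hypothetical, and the OpenAI call is left out because its prompt is not described here.

```python
# CIFAR-10 class names, used as candidate labels for zero-shot CLIP.
CIFAR10_LABELS = [
    "airplane", "automobile", "bird", "cat", "deer",
    "dog", "frog", "horse", "ship", "truck",
]


def build_classifiers():
    """Create the ViT and CLIP pipelines (hypothetical helper).

    The import is deferred so this module loads even when
    transformers is not installed.
    """
    from transformers import pipeline

    vit = pipeline("image-classification", model="adisaljusi/cifar10-vit")
    clip = pipeline(
        "zero-shot-image-classification",
        model="openai/clip-vit-large-patch14",
    )
    return vit, clip


def top1(predictions):
    """Return (label, score) of the highest-scoring prediction.

    Both pipelines return a list of {"label": ..., "score": ...} dicts.
    """
    best = max(predictions, key=lambda p: p["score"])
    return best["label"], best["score"]
```

With the models downloaded, `top1(vit("cat.jpg"))` and `top1(clip("cat.jpg", candidate_labels=CIFAR10_LABELS))` would give one (label, score) pair per model.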

	## Dataset Used For Training

	- Hugging Face dataset loader: `load_dataset("uoft-cs/cifar10")`
	- Dataset reference: https://huggingface.co/datasets/uoft-cs/cifar10
	- Number of classes: `10` (airplane, automobile, bird, cat, deer, dog, frog, horse, ship, truck)
	- Training subset: 8,000 images (from 50,000 total)
	- Test subset: 2,000 images (from 10,000 total)
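The subsets above could be drawn as in the sketch below. How the subsets were actually selected is not stated here, so the seeded shuffle, the seed value, and the helper names are assumptions.

```python
TRAIN_SUBSET = 8_000  # of 50,000 training images
TEST_SUBSET = 2_000   # of 10,000 test images


def load_cifar10_subsets(seed=42):
    """Load CIFAR-10 and keep a seeded random subset of each split.

    Assumption: a seeded shuffle followed by select(); the original
    selection method is not documented. The import is deferred so the
    module loads without the datasets library installed.
    """
    from datasets import load_dataset

    ds = load_dataset("uoft-cs/cifar10")
    train = ds["train"].shuffle(seed=seed).select(range(TRAIN_SUBSET))
    test = ds["test"].shuffle(seed=seed).select(range(TEST_SUBSET))
    return train, test


def subset_fraction(subset_size, total_size):
    """Fraction of a split that the subset covers."""
    return subset_size / total_size
```

The training subset covers 16% of the training split and the test subset covers 20% of the test split.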

	## Preprocessing

	- Resize from 32x32 to 224x224 (ViT input size)
	- Normalize pixel values with mean=0.5, std=0.5 per channel
	- Convert all images to RGB

	These steps are applied with the `AutoImageProcessor` loaded from `google/vit-base-patch16-224`.
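To illustrate the normalization step: with mean 0.5 and std 0.5, a pixel that has first been rescaled to [0, 1] ends up in [-1, 1]. The sketch below reproduces that arithmetic for a single 8-bit value; the rescale by 255 is assumed to happen before normalization, and `normalize_pixel` is a hypothetical helper, not part of the processor's API.

```python
def normalize_pixel(value, mean=0.5, std=0.5):
    """Map an 8-bit pixel value (0-255) to the normalized range.

    Assumes the value is rescaled to [0, 1] before (x - mean) / std
    is applied.
    """
    scaled = value / 255.0
    return (scaled - mean) / std
```

A black pixel (0) maps to -1.0, a white pixel (255) maps to 1.0, and mid-gray lands at 0.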

	## Trained Model

	- Hugging Face model link: [https://huggingface.co/adisaljusi/cifar10-vit](https://huggingface.co/adisaljusi/cifar10-vit)
	- Base model: [google/vit-base-patch16-224](https://huggingface.co/google/vit-base-patch16-224)
	- Transfer learning: all layers frozen except the classification head (7,690 of 85.8M parameters trainable)
	- Training config: 4 epochs, batch size 32, learning rate 2e-4, warmup ratio 0.1, weight decay 0.01, AdamW optimizer
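The numbers above can be checked with a little arithmetic. The sketch below assumes the ViT-Base hidden size of 768, a single linear classification head with bias, one device, and no gradient accumulation; the helper names are hypothetical.

```python
import math

HIDDEN_SIZE = 768  # ViT-Base hidden size (assumed from the base model)
NUM_CLASSES = 10   # CIFAR-10


def linear_head_params(in_features, out_features):
    """Parameter count of a linear layer: weights plus biases."""
    return in_features * out_features + out_features


def schedule_steps(num_examples, batch_size, epochs, warmup_ratio):
    """Optimizer steps per epoch, in total, and spent in warmup.

    Assumes one device and no gradient accumulation.
    """
    steps_per_epoch = math.ceil(num_examples / batch_size)
    total_steps = steps_per_epoch * epochs
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    return steps_per_epoch, total_steps, warmup_steps
```

`linear_head_params(HIDDEN_SIZE, NUM_CLASSES)` gives the 7,690 trainable parameters, and `schedule_steps(8000, 32, 4, 0.1)` gives 250 steps per epoch, 1,000 steps in total, and 100 warmup steps under these assumptions.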

	## Training Performance

	| Training Loss | Epoch | Validation Loss | Accuracy |
	|--------------:|------:|----------------:|---------:|
	| 0.2316 | 1 | 0.2161 | 94.95% |
	| 0.1551 | 2 | 0.1516 | 95.65% |
	| 0.1230 | 3 | 0.1390 | 95.80% |
	| 0.1097 | 4 | 0.1363 | 95.95% |
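To put the accuracy column in absolute terms, the sketch below converts each percentage into a count of correct predictions. It assumes the validation set is the 2,000-image test subset, which is not stated explicitly; `correct_count` is a hypothetical helper.

```python
EVAL_SIZE = 2_000  # assumption: evaluation on the 2,000-image test subset

# Accuracy (%) per epoch, copied from the table above.
ACCURACY_BY_EPOCH = {1: 94.95, 2: 95.65, 3: 95.80, 4: 95.95}


def correct_count(accuracy_percent, n=EVAL_SIZE):
    """Number of correct predictions implied by an accuracy percentage."""
    return round(accuracy_percent / 100 * n)


correct_by_epoch = {
    epoch: correct_count(acc) for epoch, acc in ACCURACY_BY_EPOCH.items()
}
```

Under that assumption, the gain from epoch 1 to epoch 4 amounts to 20 additional correctly classified images (1,899 to 1,919).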

	## Example Image Results

	| Image | True Class | ViT Top-1 (score) | CLIP Top-1 (score) | OpenAI LLM (label, confidence) |
	|---|---|---|---|---|
	| `airplane.jpg` | `airplane` | `airplane` (0.675) | `airplane` (0.900) | `bird` (0.75) |
	| `automobile.jpg` | `automobile` | `automobile` (0.656) | `automobile` (0.952) | `automobile` (0.85) |
	| `cat.jpg` | `cat` | `cat` (0.954) | `cat` (0.536) | `cat` (0.85) |
	| `dog.jpg` | `dog` | `dog` (0.988) | `dog` (0.936) | `dog` (0.85) |
	| `horse.jpg` | `horse` | `horse` (0.998) | `horse` (0.990) | `horse` (0.95) |
	| `ship.jpg` | `ship` | `ship` (0.989) | `ship` (0.996) | `ship` (0.95) |
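The table can be summarized by counting how often each model's top-1 label matches the true class. The sketch below uses only the labels from the table; `correct_per_model` is a hypothetical helper, and six images are far too few to rank the models.

```python
MODELS = ("ViT", "CLIP", "OpenAI")

# (true class, ViT top-1, CLIP top-1, OpenAI label), copied from the table.
RESULTS = [
    ("airplane", "airplane", "airplane", "bird"),
    ("automobile", "automobile", "automobile", "automobile"),
    ("cat", "cat", "cat", "cat"),
    ("dog", "dog", "dog", "dog"),
    ("horse", "horse", "horse", "horse"),
    ("ship", "ship", "ship", "ship"),
]


def correct_per_model(rows):
    """Count, per model, how many predictions match the true class."""
    counts = dict.fromkeys(MODELS, 0)
    for true_class, *predictions in rows:
        for model, predicted in zip(MODELS, predictions):
            if predicted == true_class:
                counts[model] += 1
    return counts


summary = correct_per_model(RESULTS)
```

ViT and CLIP classify all six examples correctly, while the OpenAI model labels `airplane.jpg` as `bird` and gets five of six.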

	## Links

	- Model: [https://huggingface.co/adisaljusi/cifar10-vit](https://huggingface.co/adisaljusi/cifar10-vit)
	- App: [https://huggingface.co/spaces/adisaljusi/computer-vision-classification-model-comparison](https://huggingface.co/spaces/adisaljusi/computer-vision-classification-model-comparison)