
title: Computer Vision Classification Model Comparison
emoji: 📊
colorFrom: purple
colorTo: gray
sdk: gradio
sdk_version: 6.11.0
app_file: app.py
pinned: false
short_description: 'Block 2 '

CIFAR-10 Image Classification — Model Comparison

This app compares three image classification approaches on CIFAR-10 images:

  • Fine-tuned ViT model (adisaljusi/cifar10-vit)
  • Zero-shot CLIP (openai/clip-vit-large-patch14)
  • OpenAI vision model (gpt-4.1-mini)
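The two Hugging Face models in the list above can be queried through `transformers` pipelines. The following is a minimal sketch, not the app's actual code: it assumes the fine-tuned checkpoint loads with the standard `image-classification` pipeline and that the plain class names serve as CLIP candidate labels. The helper names `top1` and `classify_with_vit_and_clip` are hypothetical.

```python
CIFAR10_LABELS = ["airplane", "automobile", "bird", "cat", "deer",
                  "dog", "frog", "horse", "ship", "truck"]


def top1(predictions):
    """Return (label, score) of the highest-scoring prediction.

    `predictions` is a list of {"label": str, "score": float} dicts,
    the shape returned by transformers image classification pipelines.
    """
    best = max(predictions, key=lambda p: p["score"])
    return best["label"], round(best["score"], 3)


def classify_with_vit_and_clip(image_path):
    """Run the fine-tuned ViT and zero-shot CLIP on one image.

    Downloads model weights on first use. Assumption: the candidate
    labels below are the prompts; the app may phrase them differently.
    """
    from transformers import pipeline  # lazy import: heavy dependency

    vit = pipeline("image-classification", model="adisaljusi/cifar10-vit")
    clip = pipeline("zero-shot-image-classification",
                    model="openai/clip-vit-large-patch14")
    return {
        "vit": top1(vit(image_path)),
        "clip": top1(clip(image_path, candidate_labels=CIFAR10_LABELS)),
    }
```

Calling `classify_with_vit_and_clip("airplane.jpg")` would return one `(label, score)` pair per model. The OpenAI model is left out because it needs an API key and a prompt that this README does not specify.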

Dataset Used For Training

  • Hugging Face dataset loader: load_dataset("uoft-cs/cifar10")
  • Dataset reference: https://huggingface.co/datasets/uoft-cs/cifar10
  • Number of classes: 10 (airplane, automobile, bird, cat, deer, dog, frog, horse, ship, truck)
  • Training subset: 8,000 images (from 50,000 total)
  • Test subset: 2,000 images (from 10,000 total)
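The subset sizes above can be reproduced with a few lines of Python. This is a sketch under assumptions: the README does not say how the 8,000 and 2,000 images were chosen, so a seeded random sample is used here purely for illustration, and `subset_indices` and `load_cifar10_subsets` are hypothetical helper names.

```python
import random

TRAIN_TOTAL, TRAIN_KEEP = 50_000, 8_000
TEST_TOTAL, TEST_KEEP = 10_000, 2_000


def subset_indices(total, keep, seed=42):
    """Pick `keep` distinct row indices out of `total`, reproducibly."""
    return sorted(random.Random(seed).sample(range(total), keep))


def load_cifar10_subsets(seed=42):
    """Download CIFAR-10 and return (train_subset, test_subset).

    Assumption: random sampling with a fixed seed; the original
    selection method is not documented.
    """
    from datasets import load_dataset  # lazy import: needs `datasets`

    ds = load_dataset("uoft-cs/cifar10")
    train = ds["train"].select(subset_indices(TRAIN_TOTAL, TRAIN_KEEP, seed))
    test = ds["test"].select(subset_indices(TEST_TOTAL, TEST_KEEP, seed))
    return train, test
```

Fixing the seed keeps the subsets identical across runs, which matters when comparing checkpoints trained at different times.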

Preprocessing

  • Resize from 32x32 to 224x224 (ViT input size)
  • Normalize pixel values with mean=0.5, std=0.5 per channel
  • Convert all images to RGB

These steps are applied with the AutoImageProcessor loaded from google/vit-base-patch16-224.
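To make the arithmetic of the resize and normalization steps concrete, here is a small NumPy sketch. It is not what the image processor runs internally: it upsamples by plain pixel repetition (32 x 7 = 224), whereas the real processor's resampling filter may differ, and `preprocess` is a hypothetical helper name.

```python
import numpy as np


def preprocess(image_u8, scale=7, mean=0.5, std=0.5):
    """Turn a 32x32x3 uint8 RGB image into a 3x224x224 float32 array."""
    # Upsample 32 -> 224 by repeating each pixel 7 times along both axes
    # (nearest-neighbour; an illustrative simplification).
    resized = image_u8.repeat(scale, axis=0).repeat(scale, axis=1)
    # Rescale to [0, 1], then normalize each channel: (x - mean) / std.
    pixels = resized.astype(np.float32) / 255.0
    normalized = (pixels - mean) / std
    # Channels-first layout (C, H, W), the layout ViT models take as input.
    return normalized.transpose(2, 0, 1)
```

With mean=0.5 and std=0.5, a pixel value of 0 maps to -1 and a value of 255 maps to 1, so the model sees inputs in the range [-1, 1].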

Trained Model

Training Performance

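| Training Loss | Epoch | Validation Loss | Accuracy |
|---------------|-------|-----------------|----------|
| 0.2316        | 1     | 0.2161          | 94.95%   |
| 0.1551        | 2     | 0.1516          | 95.65%   |
| 0.1230        | 3     | 0.1390          | 95.80%   |
| 0.1097        | 4     | 0.1363          | 95.95%   |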
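The accuracy figures above can be turned into image counts. The sketch below assumes that validation ran on the 2,000-image test subset, which the README does not state explicitly; the fact that every percentage then corresponds to a whole number of images is consistent with that assumption.

```python
VAL_IMAGES = 2_000  # assumption: validation used the 2,000-image test subset

# Accuracy (%) per epoch, copied from the training performance table.
accuracy_by_epoch = {1: 94.95, 2: 95.65, 3: 95.80, 4: 95.95}

# Number of correctly classified validation images per epoch.
correct_by_epoch = {
    epoch: round(acc / 100 * VAL_IMAGES)
    for epoch, acc in accuracy_by_epoch.items()
}

# Number of misclassified validation images per epoch.
errors_by_epoch = {
    epoch: VAL_IMAGES - correct for epoch, correct in correct_by_epoch.items()
}

for epoch in accuracy_by_epoch:
    print(epoch, correct_by_epoch[epoch], errors_by_epoch[epoch])
```

Under that assumption, training from epoch 1 to epoch 4 fixes 20 additional images, and 14 of those gains arrive by epoch 2.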

Example Image Results

| Image          | True Class | ViT Top-1 (score)  | CLIP Top-1 (score) | OpenAI LLM (label, confidence) |
|----------------|------------|--------------------|--------------------|--------------------------------|
| airplane.jpg   | airplane   | airplane (0.675)   | airplane (0.900)   | bird (0.75)                    |
| automobile.jpg | automobile | automobile (0.656) | automobile (0.952) | automobile (0.85)              |
| cat.jpg        | cat        | cat (0.954)        | cat (0.536)        | cat (0.85)                     |
| dog.jpg        | dog        | dog (0.988)        | dog (0.936)        | dog (0.85)                     |
| horse.jpg      | horse      | horse (0.998)      | horse (0.990)      | horse (0.95)                   |
| ship.jpg       | ship       | ship (0.989)       | ship (0.996)       | ship (0.95)                    |
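The example results above can be tallied per model with a short script. The labels are copied from the table; the variable and function names are illustrative only, and six images are far too few to rank the models.

```python
# (true class, ViT top-1, CLIP top-1, OpenAI label), copied from the table.
results = [
    ("airplane", "airplane", "airplane", "bird"),
    ("automobile", "automobile", "automobile", "automobile"),
    ("cat", "cat", "cat", "cat"),
    ("dog", "dog", "dog", "dog"),
    ("horse", "horse", "horse", "horse"),
    ("ship", "ship", "ship", "ship"),
]
MODELS = ["vit", "clip", "openai"]


def correct_counts(rows):
    """Count, per model, how many predictions match the true class."""
    counts = {name: 0 for name in MODELS}
    for true_label, *predictions in rows:
        for name, predicted in zip(MODELS, predictions):
            counts[name] += predicted == true_label
    return counts


def mistakes(rows):
    """List (model, true class, predicted class) for every wrong prediction."""
    return [
        (name, true_label, predicted)
        for true_label, *predictions in rows
        for name, predicted in zip(MODELS, predictions)
        if predicted != true_label
    ]


print(correct_counts(results))
print(mistakes(results))
```

On these six images the ViT and CLIP models are each correct six times, while the OpenAI model is correct five times: it labels airplane.jpg as bird.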

Links