Big Cat Classification Comparison App
This project compares two image classification approaches on big cat images:
- Fine-tuned ViT model (custom trained)
- Zero-shot CLIP model (openai/clip-vit-base-patch32)
Dataset Used For Training
The dataset consists of images of five big cat species:
- cheetah
- leopard
- lion
- puma
- tiger
The images are organized using the imagefolder structure, where each class has its own folder.
The dataset was used to train a custom image classification model using transfer learning.
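As a rough illustration of the folder-per-class layout, the sketch below walks a directory and derives each image's label from its parent folder name. The function name `scan_imagefolder` and the accepted file suffixes are assumptions made for this example, not part of the project. In practice, the Hugging Face `datasets` library performs this step through `load_dataset("imagefolder", data_dir=...)`.

```python
from pathlib import Path

# Assumed set of image extensions; the real dataset may use others.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def scan_imagefolder(root):
    """Return (class_names, samples) for a folder-per-class layout.

    class_names is the sorted list of sub-folder names, and samples is a
    list of (image_path, class_name) pairs.
    """
    root = Path(root)
    class_names = sorted(p.name for p in root.iterdir() if p.is_dir())
    samples = []
    for name in class_names:
        for path in sorted((root / name).iterdir()):
            # Skip anything that does not look like an image file.
            if path.suffix.lower() in IMAGE_SUFFIXES:
                samples.append((path, name))
    return class_names, samples
```

Called on a directory with the sub-folders `cheetah`, `leopard`, `lion`, `puma` and `tiger`, this would return those five names as the class list.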
Preprocessing
The following preprocessing steps were applied:
- Images were loaded using the Hugging Face imagefolder format
- Images were converted to RGB
- Images were resized automatically using the ViT image processor
- Labels were mapped to numerical IDs for training
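The label-mapping step can be sketched as follows. The alphabetical ordering of IDs is an assumption; the IDs actually used during training may differ. The names `label2id` and `id2label` follow the convention used in Hugging Face model configs.

```python
# The five classes from the dataset description, in assumed alphabetical order.
CLASS_NAMES = ["cheetah", "leopard", "lion", "puma", "tiger"]

# Map each class name to an integer ID, and build the reverse lookup.
label2id = {name: idx for idx, name in enumerate(CLASS_NAMES)}
id2label = {idx: name for name, idx in label2id.items()}


def encode_labels(names):
    """Convert a list of class names into their numerical IDs."""
    return [label2id[name] for name in names]


print(encode_labels(["lion", "tiger", "cheetah"]))
```

Keeping both dictionaries lets the model's numerical outputs be turned back into readable class names at prediction time.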
Model and Evaluation
A Vision Transformer (ViT) model was fine-tuned on the custom dataset.
The model was evaluated using example images and compared with CLIP.
Accuracy
- Custom Model Accuracy: 1.00
- CLIP Accuracy: 1.00
Example Image Results
| Image | True Class | Custom Model (score) | CLIP (score) |
|---|---|---|---|
| Cheetah_032.jpg | cheetah | cheetah (0.53) | cheetah (0.83) |
| Leopard_001.jpg | leopard | leopard (0.51) | leopard (0.92) |
| Lion_003.jpg | lion | lion (0.54) | lion (0.99) |
| Puma_001.jpg | puma | puma (0.61) | puma (1.00) |
| Tiger_001.jpg | tiger | tiger (0.70) | tiger (0.99) |
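To make the accuracy figures reproducible from the table, the sketch below recomputes accuracy and the mean confidence score for both models. The rows are copied from the table; the helper names are chosen for this example only.

```python
# Rows copied from the table: (file, true class, custom prediction, CLIP prediction).
RESULTS = [
    ("Cheetah_032.jpg", "cheetah", ("cheetah", 0.53), ("cheetah", 0.83)),
    ("Leopard_001.jpg", "leopard", ("leopard", 0.51), ("leopard", 0.92)),
    ("Lion_003.jpg", "lion", ("lion", 0.54), ("lion", 0.99)),
    ("Puma_001.jpg", "puma", ("puma", 0.61), ("puma", 1.00)),
    ("Tiger_001.jpg", "tiger", ("tiger", 0.70), ("tiger", 0.99)),
]

CUSTOM, CLIP = 2, 3  # column indices of the two prediction tuples


def accuracy(rows, column):
    """Fraction of rows whose predicted label matches the true class."""
    correct = sum(1 for row in rows if row[column][0] == row[1])
    return correct / len(rows)


def mean_score(rows, column):
    """Average confidence score reported in the given prediction column."""
    return sum(row[column][1] for row in rows) / len(rows)


for name, column in (("Custom ViT", CUSTOM), ("CLIP", CLIP)):
    print(f"{name}: accuracy={accuracy(RESULTS, column):.2f}, "
          f"mean score={mean_score(RESULTS, column):.3f}")
```

On these five rows both models reach an accuracy of 1.00, with a mean score of about 0.578 for the custom model and about 0.946 for CLIP.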
Comparison Summary
Both the custom ViT model and CLIP achieved perfect accuracy (100%) on the test images.
The custom model's confidence scores are markedly lower than CLIP's (0.51 to 0.70 versus 0.83 to 1.00 on the example images), but it still predicts every class correctly.
CLIP provides very high confidence predictions and performs strongly even without task-specific training.
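A minimal sketch of how the two predictions could be obtained with the `transformers` pipelines is shown below. It assumes that the fine-tuned checkpoint `DKatheesrupan/aufgabe2` loads with the `image-classification` pipeline and that CLIP is prompted with the five class names as candidate labels. The helper names are hypothetical, and the app itself may be implemented differently.

```python
LABELS = ["cheetah", "leopard", "lion", "puma", "tiger"]


def top_prediction(results):
    """Return (label, score) of the highest-scoring entry.

    `results` is a list of {"label": ..., "score": ...} dicts, the format
    returned by both pipelines used below.
    """
    best = max(results, key=lambda r: r["score"])
    return best["label"], round(best["score"], 2)


def classify_both(image_path):
    """Run the fine-tuned ViT and zero-shot CLIP on one image."""
    # Imported lazily so the helper above works without transformers installed.
    from transformers import pipeline

    # Assumption: the model repository is compatible with this pipeline task.
    vit = pipeline("image-classification", model="DKatheesrupan/aufgabe2")
    clip = pipeline("zero-shot-image-classification",
                    model="openai/clip-vit-base-patch32")

    vit_results = vit(image_path)
    # CLIP receives the class names at inference time; no training is involved.
    clip_results = clip(image_path, candidate_labels=LABELS)
    return top_prediction(vit_results), top_prediction(clip_results)
```

The difference between the two approaches shows in the call signatures: the fine-tuned model has its classes fixed by training, while CLIP takes the candidate labels as an argument for each prediction.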
Summary
- Best task-specific model: Custom ViT model
- Best open-source baseline: CLIP
Links to Model and App
Hugging Face Model:
https://huggingface.co/DKatheesrupan/aufgabe2
Hugging Face Space (App):
https://huggingface.co/spaces/DKatheesrupan/Exercise2
Application
The application allows users to:
- upload an image
- test the custom model
- compare predictions with CLIP
- use example images directly
This enables a direct, side-by-side comparison between the fine-tuned model and the zero-shot model.
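The features listed above could be wired together in Gradio roughly as follows. This is a sketch under assumptions, not the Space's actual source: `vit` and `clip` stand for two already-loaded classifier callables that return lists of `{"label", "score"}` dicts, and the function names are hypothetical.

```python
def to_label_dict(results):
    """Convert [{"label", "score"}, ...] into the {label: score} dict gr.Label accepts."""
    return {r["label"]: float(r["score"]) for r in results}


def make_compare(vit, clip, labels):
    """Build the callback that runs both models on one uploaded image."""
    def compare(image):
        custom = to_label_dict(vit(image))
        zero_shot = to_label_dict(clip(image, candidate_labels=labels))
        return custom, zero_shot
    return compare


def build_demo(compare, examples=None):
    """Create the interface: one image input, one label output per model."""
    import gradio as gr  # imported lazily so the helpers above need no Gradio

    return gr.Interface(
        fn=compare,
        inputs=gr.Image(type="pil"),
        outputs=[gr.Label(label="Custom ViT"), gr.Label(label="CLIP (zero-shot)")],
        examples=examples,  # optional list of example image paths
        title="Big Cat Classification Comparison",
    )
```

Calling `build_demo(make_compare(vit, clip, labels)).launch()` would start the app. Separating `make_compare` from the interface keeps the comparison logic testable without a running UI.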