metadata
language: en
tags:
- image-classification
- vision-transformer
- transfer-learning
- motorcycles
datasets:
- imagefolder
metrics:
- accuracy
motorcycle-vit-model
A Vision Transformer (ViT) fine-tuned for motorcycle type classification into 4 categories: cruiser, sport, naked, roller.
Model Details
- Base Architecture: google/vit-base-patch16-224
- Fine-tuning: Transfer learning on a custom motorcycle dataset
- Framework: Hugging Face transformers + PyTorch
- Task: Image Classification (4 classes)
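As a rough illustration of the setup listed above, the base checkpoint can be loaded with a fresh 4-class head. This is a minimal sketch, not the author's training script: the label order and the helper name `build_model` are assumptions, since the card lists the classes but not their indices.

```python
# Label order is an assumption; the card does not state the index of each class.
LABELS = ["cruiser", "sport", "naked", "roller"]
ID2LABEL = dict(enumerate(LABELS))
LABEL2ID = {name: idx for idx, name in ID2LABEL.items()}


def build_model():
    """Load the base ViT checkpoint and attach a new 4-class classification head."""
    # Imported lazily so the label maps can be used without transformers installed.
    from transformers import ViTForImageClassification

    return ViTForImageClassification.from_pretrained(
        "google/vit-base-patch16-224",
        num_labels=len(LABELS),
        id2label=ID2LABEL,
        label2id=LABEL2ID,
        # The checkpoint ships a 1000-class ImageNet head, so the size mismatch
        # must be allowed for the head to be re-initialised with 4 outputs.
        ignore_mismatched_sizes=True,
    )
```

Calling `build_model()` downloads the base weights from the Hub on first use.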
Classes
| Label | Description |
|---|---|
| cruiser | Low seat height, forward foot pegs, relaxed riding position |
| sport | Full fairings, aggressive aerodynamic design |
| naked | Minimal fairings, exposed engine, upright seating |
| roller | Scooters with step-through frames and smaller wheels |
Training
- Dataset: ~34 images per class from the Kaggle Vietnamese Bike and Motorbike Dataset
- Split: 60% train / 20% validation / 20% test
- Preprocessing: Resize to 224x224, ImageNet normalization
- Optimizer: AdamW, lr=2e-5
- Batch size: 16
- Epochs: 5
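The settings above can be sketched in code as follows. This is an illustrative reconstruction under stated assumptions, not the original training script: the helper names are hypothetical, the split is applied per class with a fixed seed, and "ImageNet normalization" is assumed to mean the standard ImageNet channel mean and standard deviation.

```python
import random

# Hyperparameters as listed on the card.
LEARNING_RATE = 2e-5
BATCH_SIZE = 16
EPOCHS = 5
IMAGE_SIZE = 224

# Standard ImageNet channel statistics (assumed to be what the card refers to).
IMAGENET_MEAN = (0.485, 0.456, 0.406)
IMAGENET_STD = (0.229, 0.224, 0.225)


def split_indices(n_items, train_frac=0.6, val_frac=0.2, seed=0):
    """Shuffle indices 0..n_items-1 and cut them into train/val/test lists."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    n_train = int(n_items * train_frac)
    n_val = int(n_items * val_frac)
    # Whatever remains after train and validation goes to the test split.
    return (
        indices[:n_train],
        indices[n_train:n_train + n_val],
        indices[n_train + n_val:],
    )


def normalize_pixel(rgb):
    """Scale an 8-bit RGB pixel to [0, 1], then standardise each channel."""
    return tuple(
        (value / 255.0 - mean) / std
        for value, mean, std in zip(rgb, IMAGENET_MEAN, IMAGENET_STD)
    )


def make_optimizer(model):
    """Create AdamW with the learning rate from the card."""
    # Imported lazily so the helpers above work without PyTorch installed.
    import torch

    return torch.optim.AdamW(model.parameters(), lr=LEARNING_RATE)


# Splitting one class of 34 images.
train_idx, val_idx, test_idx = split_indices(34)
print(len(train_idx), len(val_idx), len(test_idx))  # 20 6 8
```

Applying `split_indices` separately to each class keeps the four categories balanced across the three splits, which matters with so few images per class.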
Performance
| Metric | Value |
|---|---|
| Validation Accuracy | 70.59% |
| Validation Loss | 1.116 |

| Class | Accuracy |
|---|---|
| Cruiser | 72% |
| Sport | 68% |
| Naked | 69% |
| Roller | 73% |
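
For reference, overall and per-class accuracy of the kind reported above can be computed from true and predicted labels as in this sketch. The function names are hypothetical and the labels in the example are made-up toy data, not the model's actual predictions.

```python
from collections import Counter


def overall_accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def per_class_accuracy(y_true, y_pred):
    """For each true class, the fraction of its samples predicted correctly."""
    total = Counter(y_true)
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: correct[label] / total[label] for label in total}


# Toy labels for illustration only.
y_true = ["cruiser", "cruiser", "sport", "sport", "naked", "roller"]
y_pred = ["cruiser", "naked", "sport", "sport", "naked", "sport"]

print(overall_accuracy(y_true, y_pred))
print(per_class_accuracy(y_true, y_pred))
```

Per-class accuracy computed this way is the recall of each class, so on a small evaluation split a single misclassified image can shift a class's score by several points.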
Usage
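A minimal inference sketch using the transformers `pipeline` API is shown here. The repository id is a placeholder: the card names the model "motorcycle-vit-model" but does not state its full Hub path, so `MODEL_ID` and both helper names are assumptions to be replaced with the real values.

```python
# Hypothetical repository id; replace with the model's actual Hub path.
MODEL_ID = "your-username/motorcycle-vit-model"


def classify(image_path, model_id=MODEL_ID):
    """Run image classification and return a list of {"label", "score"} dicts."""
    # Imported lazily so top_prediction can be used without transformers installed.
    from transformers import pipeline

    classifier = pipeline("image-classification", model=model_id)
    return classifier(image_path)


def top_prediction(results):
    """Pick the highest-scoring entry from the pipeline output."""
    best = max(results, key=lambda item: item["score"])
    return best["label"], best["score"]
```

With a valid model id, `top_prediction(classify("bike.jpg"))` would return the predicted class name and its score; `bike.jpg` is a stand-in for any local image path.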
Demo
Live demo available at: nbacchi/abgabe2_motorbikes