motorcycle-vit-model

A Vision Transformer (ViT) fine-tuned to classify motorcycle images into four types: cruiser, sport, naked, and roller (scooter).

Model Details

  • Base Architecture: google/vit-base-patch16-224
  • Fine-tuning: Transfer learning on custom motorcycle dataset
  • Framework: Hugging Face transformers + PyTorch
  • Task: Image Classification (4 classes)
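
The setup above can be sketched roughly as follows. This is a minimal sketch, not the author's training code: the label order is assumed to follow the order used in this card (the order stored in the checkpoint's config is not stated), and `build_model` is a hypothetical helper name. `ignore_mismatched_sizes=True` is passed because the base checkpoint ships with a 1000-class ImageNet head, which is replaced here by a freshly initialized 4-class head.

```python
BASE_CHECKPOINT = "google/vit-base-patch16-224"

# Assumed label order; the real order in the checkpoint's config may differ.
LABELS = ["cruiser", "sport", "naked", "roller"]
ID2LABEL = dict(enumerate(LABELS))
LABEL2ID = {name: idx for idx, name in ID2LABEL.items()}


def build_model():
    """Load the base ViT and swap its 1000-class head for a 4-class one."""
    # Imported lazily so the label mappings above work without transformers installed.
    from transformers import ViTForImageClassification

    return ViTForImageClassification.from_pretrained(
        BASE_CHECKPOINT,
        num_labels=len(LABELS),
        id2label=ID2LABEL,
        label2id=LABEL2ID,
        ignore_mismatched_sizes=True,  # classifier shape changes from 1000 to 4 outputs
    )
```

Calling `build_model()` downloads the base weights, so it needs network access plus the `transformers` and `torch` packages.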

Classes

| Label | Description |
|---|---|
| cruiser | Low seat height, forward foot pegs, relaxed riding position |
| sport | Full fairings, aggressive aerodynamic design |
| naked | Minimal fairings, exposed engine, upright seating |
| roller | Scooters with step-through frames and smaller wheels |

Training

  • Dataset: ~34 images/class from Kaggle Vietnamese Bike and Motorbike Dataset
  • Split: 60% train / 20% validation / 20% test
  • Preprocessing: Resize to 224x224, ImageNet normalization
  • Optimizer: AdamW, lr=2e-5
  • Batch size: 16
  • Epochs: 5
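
The training settings listed above could be wired together along these lines. This is a hedged sketch under several assumptions: the card does not say how the split was drawn, so `split_indices` is a hypothetical per-class shuffle-and-cut; the mean and standard deviation are the commonly used ImageNet channel statistics, which the card does not spell out; and `build_optimizer` is a hypothetical helper. Resizing to 224x224 is left out because it needs an image library.

```python
import random

# Hyperparameters as listed in this card.
LEARNING_RATE = 2e-5
BATCH_SIZE = 16
EPOCHS = 5
IMAGE_SIZE = 224

# Commonly used ImageNet channel statistics (RGB); assumed, not confirmed by the card.
IMAGENET_MEAN = (0.485, 0.456, 0.406)
IMAGENET_STD = (0.229, 0.224, 0.225)


def normalize_pixel(rgb):
    """Scale an 8-bit RGB pixel to [0, 1], then standardize each channel."""
    return tuple(
        (c / 255.0 - m) / s for c, m, s in zip(rgb, IMAGENET_MEAN, IMAGENET_STD)
    )


def split_indices(n_items, train_frac=0.6, val_frac=0.2, seed=0):
    """Shuffle indices 0..n_items-1 and cut them into train/val/test lists."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    n_train = int(n_items * train_frac)
    n_val = int(n_items * val_frac)
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]  # whatever remains goes to the test set
    return train, val, test


def build_optimizer(model):
    """Create AdamW over all model parameters with the card's learning rate."""
    import torch  # lazy import so the helpers above run without PyTorch

    return torch.optim.AdamW(model.parameters(), lr=LEARNING_RATE)


# Splitting one class of about 34 images.
train_idx, val_idx, test_idx = split_indices(34)
```

Splitting each class separately keeps the class balance similar across the three subsets, which matters with so few images per class.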

Performance

| Metric | Value |
|---|---|
| Validation Accuracy | 70.59% |
| Validation Loss | 1.116 |

| Class | Accuracy |
|---|---|
| Cruiser | 72% |
| Sport | 68% |
| Naked | 69% |
| Roller | 73% |
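
For readers who want to reproduce figures of this kind, overall and per-class accuracy can be computed from true and predicted labels as sketched below. The helper name `accuracy_report` is hypothetical, and the toy labels are made up purely for illustration; they are not this model's validation data.

```python
from collections import Counter


def accuracy_report(y_true, y_pred):
    """Return (overall accuracy, {label: per-class accuracy})."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and equally long")
    total = Counter(y_true)  # samples per true class
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    overall = sum(correct.values()) / len(y_true)
    per_class = {label: correct[label] / n for label, n in total.items()}
    return overall, per_class


# Made-up toy labels, for illustration only.
y_true = ["cruiser", "cruiser", "sport", "sport", "naked", "roller"]
y_pred = ["cruiser", "naked", "sport", "sport", "naked", "sport"]
overall, per_class = accuracy_report(y_true, y_pred)
print(f"overall: {overall:.2%}")
for label, acc in per_class.items():
    print(f"{label}: {acc:.0%}")
```

Per-class accuracy here is the share of each true class that was predicted correctly, so a class with only a handful of validation images moves in large steps.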

Usage
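
A minimal inference sketch with the Hugging Face `pipeline` API, assuming the repository `nbacchi/motorcycle-vit-model` ships its image processor config and label mapping alongside the weights. The helper names `classify` and `best_label` and the file name in the usage note are hypothetical.

```python
MODEL_ID = "nbacchi/motorcycle-vit-model"


def classify(image, top_k=4):
    """Return a list of {'label': ..., 'score': ...} dicts for one image.

    `image` may be a local path, a URL, or a PIL image.
    """
    # Lazy import; running this also requires torch and Pillow.
    from transformers import pipeline

    classifier = pipeline("image-classification", model=MODEL_ID)
    return classifier(image, top_k=top_k)


def best_label(predictions):
    """Pick the highest-scoring label from pipeline-style output."""
    return max(predictions, key=lambda p: p["score"])["label"]
```

For example, `best_label(classify("my_bike.jpg"))` would return one of the four class labels. When classifying many images, build the pipeline once and reuse it rather than calling `classify` in a loop.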

Demo

Live demo available as a Hugging Face Space: nbacchi/abgabe2_motorbikes

Model size: 85.8M params (Safetensors, tensor type F32)