---
language: en
tags:
- image-classification
- vision-transformer
- transfer-learning
- motorcycles
datasets:
- imagefolder
metrics:
- accuracy
---

# motorcycle-vit-model

A Vision Transformer (ViT) fine-tuned to classify motorcycle images into four types: **cruiser**, **sport**, **naked**, and **roller**.

## Model Details

- **Base Architecture**: `google/vit-base-patch16-224`
- **Fine-tuning**: Transfer learning on a custom motorcycle dataset
- **Framework**: Hugging Face `transformers` + PyTorch
- **Task**: Image Classification (4 classes)
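The setup listed above can be sketched as follows. This is a minimal sketch, not the original training script: `build_model` is a hypothetical helper, and the label order is an assumption (alphabetical, as folder-based datasets commonly assign ids); the actual mapping is stored in the model's `config.json`. Because the base checkpoint ships with a 1000-class ImageNet head, `ignore_mismatched_sizes=True` is passed so that a fresh 4-class head can be initialized.

```python
BASE_CHECKPOINT = "google/vit-base-patch16-224"

# Assumed label order (alphabetical); the real mapping lives in config.json.
LABELS = sorted(["cruiser", "sport", "naked", "roller"])
ID2LABEL = dict(enumerate(LABELS))
LABEL2ID = {label: idx for idx, label in ID2LABEL.items()}


def build_model():
    """Load the base ViT and swap its 1000-class head for a 4-class one."""
    # Imported lazily so the label mappings above work without transformers.
    from transformers import ViTForImageClassification

    return ViTForImageClassification.from_pretrained(
        BASE_CHECKPOINT,
        num_labels=len(LABELS),
        id2label=ID2LABEL,
        label2id=LABEL2ID,
        # The pretrained classifier has a different shape, so it is
        # discarded and a new head is randomly initialized.
        ignore_mismatched_sizes=True,
    )
```

Calling `build_model()` downloads the base weights on first use; the new classification head still has to be trained before its predictions are meaningful.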

## Classes

| Label | Description |
|---|---|
| cruiser | Low seat height, forward foot pegs, relaxed riding position |
| sport | Full fairings, aggressive aerodynamic design |
| naked | Minimal fairings, exposed engine, upright seating |
| roller | Scooter (German: *Roller*) with a step-through frame and smaller wheels |

## Training

- **Dataset**: approximately 34 images per class from the [Kaggle Vietnamese Bike and Motorbike Dataset](https://www.kaggle.com/datasets/nqa112/vietnamese-bike-and-motorbike)
- **Split**: 60% train / 20% validation / 20% test
- **Preprocessing**: Resize to 224×224 pixels, ImageNet normalization
- **Optimizer**: AdamW, lr=2e-5
- **Batch size**: 16
- **Epochs**: 5
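To make the scale of this setup concrete, the sketch below derives approximate split sizes and the number of optimizer steps from the values listed above. `TrainingConfig`, `split_sizes`, and `optimizer_steps` are hypothetical names introduced for illustration. The figure of 34 images per class is approximate, and both the floor rounding of the splits and the decision to keep the last partial batch are assumptions, so the real counts may differ slightly.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainingConfig:
    images_per_class: int = 34  # approximate figure from this card
    num_classes: int = 4
    train_pct: int = 60
    val_pct: int = 20  # the remaining 20% forms the test split
    batch_size: int = 16
    epochs: int = 5
    learning_rate: float = 2e-5  # used with AdamW


def split_sizes(cfg: TrainingConfig) -> tuple:
    """Return (train, validation, test) image counts, flooring the first two."""
    total = cfg.images_per_class * cfg.num_classes
    n_train = total * cfg.train_pct // 100
    n_val = total * cfg.val_pct // 100
    return n_train, n_val, total - n_train - n_val


def optimizer_steps(cfg: TrainingConfig) -> int:
    """Total optimizer steps, assuming the last partial batch is kept."""
    n_train, _, _ = split_sizes(cfg)
    return math.ceil(n_train / cfg.batch_size) * cfg.epochs


cfg = TrainingConfig()
print(split_sizes(cfg))      # (81, 27, 28)
print(optimizer_steps(cfg))  # 30
```

Under these assumptions each evaluation split holds fewer than 30 images, so a single misclassified image moves accuracy by several percentage points.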

## Performance

| Metric | Value |
|---|---|
| Validation Accuracy | 70.59% |
| Validation Loss | 1.116 |

| Class | Accuracy |
|---|---|
| Cruiser | 72% |
| Sport | 68% |
| Naked | 69% |
| Roller | 73% |
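
Overall and per-class accuracy are both ratios of correct predictions. The sketch below shows one way such figures can be computed from true and predicted labels; it is not the evaluation code used for this model. The helper names are hypothetical, and the toy labels are made up purely to illustrate the calculation. Per-class accuracy here means the fraction of images of a given true class that were predicted correctly (equivalently, per-class recall).

```python
from collections import Counter
from typing import Dict, Sequence


def accuracy(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equally long")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def per_class_accuracy(y_true: Sequence[str], y_pred: Sequence[str]) -> Dict[str, float]:
    """For each true class, the fraction of its images predicted correctly."""
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: hits[label] / totals[label] for label in totals}


# Toy labels for illustration only; these are NOT this model's predictions.
y_true = ["cruiser", "cruiser", "sport", "sport", "naked", "roller"]
y_pred = ["cruiser", "naked", "sport", "sport", "naked", "sport"]

print(f"{accuracy(y_true, y_pred):.2%}")   # 66.67% (4 of 6 correct)
print(per_class_accuracy(y_true, y_pred))
```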

## Usage
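
A minimal inference sketch using the `transformers` image-classification pipeline. The repository ID below is a placeholder, since this card does not state the model's Hub path; replace it with the actual one. `load_classifier`, `best_prediction`, and `classify` are hypothetical helper names.

```python
from typing import Dict, List

# Placeholder: replace with the actual Hub repository ID of this model.
MODEL_ID = "your-username/motorcycle-vit-model"


def load_classifier(model_id: str = MODEL_ID):
    """Build an image-classification pipeline (downloads weights on first use)."""
    # Imported lazily so the helpers below work without transformers installed.
    from transformers import pipeline

    return pipeline("image-classification", model=model_id)


def best_prediction(predictions: List[Dict]) -> Dict:
    """Return the highest-scoring entry from the pipeline output."""
    return max(predictions, key=lambda p: p["score"])


def classify(image, model_id: str = MODEL_ID) -> Dict:
    """Classify one image given as a local path, URL, or PIL image."""
    classifier = load_classifier(model_id)
    # The pipeline returns a list of {"label": ..., "score": ...} dicts.
    return best_prediction(classifier(image))
```

For example, `classify("motorcycle.jpg")` (a hypothetical local file) returns a dict with `label` and `score` keys for the most likely class. The pipeline applies the resizing and normalization stored with the model, so no manual preprocessing is needed.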


## Demo

A live demo is available at [nbacchi/abgabe2_motorbikes](https://huggingface.co/spaces/nbacchi/abgabe2_motorbikes).