roadwork_v3_focal
Vision Transformer (ViT) model for binary roadwork detection, trained for Natix Subnet 72.
Model Details
- Base Model: google/vit-base-patch16-224-in21k
- Architecture: ViT-Base (86M parameters)
- Input Size: 224×224
- Output Classes: 2 (None, Roadwork)
Performance
| Metric | Value |
|---|---|
| Accuracy | 0.9839 |
| MCC | 0.9443 |
| F1 Score | 0.9903 |
| AUC | 0.9950 |
| Subnet Reward | 0.8962 |
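
For readers unfamiliar with the metrics in the table, the sketch below shows how accuracy, F1, and MCC are conventionally derived from binary confusion-matrix counts. The function name `binary_metrics` and the counts in the usage line are hypothetical illustrations; they are not this model's confusion matrix, which the card does not publish. Subnet Reward is omitted because its formula is not given here.

```python
import math

def binary_metrics(tp, tn, fp, fn):
    """Compute accuracy, F1, and MCC from binary confusion-matrix counts.

    Roadwork is treated as the positive class (an assumption for illustration).
    """
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else 0.0

    # F1 is the harmonic mean of precision and recall for the positive class.
    f1_denominator = 2 * tp + fp + fn
    f1 = (2 * tp) / f1_denominator if f1_denominator else 0.0

    # MCC uses all four cells, so it stays informative on imbalanced data.
    mcc_denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / mcc_denominator if mcc_denominator else 0.0

    return {"accuracy": accuracy, "f1": f1, "mcc": mcc}

# Hypothetical counts, chosen only to exercise the function.
metrics = binary_metrics(tp=90, tn=5, fp=3, fn=2)
```

With imbalanced counts like these, MCC comes out well below accuracy and F1, which is why it is commonly reported alongside them.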
Training Details
- Training Samples: 34,968
- Validation Samples: 4,418
- Validator Augmentations: True
- Label Smoothing: 0.05
- Confidence Margin: 0.1
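
As a rough illustration of what a label smoothing value of 0.05 does to the training target, here is a minimal pure-Python sketch. It assumes the common uniform-mixing convention (the one PyTorch's `CrossEntropyLoss(label_smoothing=...)` uses); the card does not state which loss or convention the training run actually used, and the Confidence Margin setting is not modeled because the card does not define it. All function names are hypothetical.

```python
import math

def smoothed_targets(label, num_classes=2, smoothing=0.05):
    """Mix the one-hot target with a uniform distribution over all classes."""
    off_value = smoothing / num_classes
    return [
        off_value + (1.0 - smoothing) if i == label else off_value
        for i in range(num_classes)
    ]

def log_softmax(logits):
    """Numerically stable log-softmax over a list of logits."""
    peak = max(logits)
    log_sum_exp = peak + math.log(sum(math.exp(x - peak) for x in logits))
    return [x - log_sum_exp for x in logits]

def smoothed_cross_entropy(logits, label, smoothing=0.05):
    """Cross-entropy between the smoothed target and the predicted distribution."""
    targets = smoothed_targets(label, len(logits), smoothing)
    log_probs = log_softmax(logits)
    return -sum(t * lp for t, lp in zip(targets, log_probs))

# Assumed convention: with two classes, a Roadwork label (index 1)
# becomes roughly [0.025, 0.975] instead of [0, 1].
targets = smoothed_targets(label=1)
loss = smoothed_cross_entropy([-2.0, 3.0], label=1)
```

Because the target never reaches exactly 1, the loss keeps a small penalty on extremely confident predictions, which tends to discourage overconfident logits.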
Usage
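import torch
from PIL import Image
from transformers import AutoImageProcessor, AutoModelForImageClassification

# Load the image processor and the fine-tuned weights from the Hub
processor = AutoImageProcessor.from_pretrained("infinite000/in-20001")
model = AutoModelForImageClassification.from_pretrained("infinite000/in-20001")

# Convert to RGB so grayscale or RGBA files match the 3-channel input ViT expects
image = Image.open("your_image.jpg").convert("RGB")
inputs = processor(image, return_tensors="pt")

with torch.no_grad():  # inference only, no gradients needed
    outputs = model(**inputs)
# Class index 1 is Roadwork (index 0 is None)
probs = outputs.logits.softmax(dim=1)
roadwork_prob = probs[0][1].item()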
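
To turn the two logits into a final label, one option is a simple probability threshold. The sketch below is pure Python so it can be read without the model loaded. The helper names and the default threshold of 0.5 are assumptions for illustration; the card does not specify a decision threshold.

```python
import math

# Class order as listed under Model Details: index 0 is None, index 1 is Roadwork.
CLASS_NAMES = ("None", "Roadwork")

def roadwork_probability(logits):
    """Softmax probability of the Roadwork class from a pair of raw logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    return exps[1] / sum(exps)

def decide(logits, threshold=0.5):
    """Return (label, probability); the threshold is an assumed default."""
    prob = roadwork_probability(logits)
    label = CLASS_NAMES[1] if prob >= threshold else CLASS_NAMES[0]
    return label, prob

# Example with made-up logits; with the usage code above, the input
# could be outputs.logits[0].tolist().
label, prob = decide([-1.2, 2.3])
```

Raising the threshold trades recall for precision on the Roadwork class, so the right value depends on how costly a missed detection is compared with a false alarm.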
Model Card
See model_card.json for detailed metadata.