roadwork_v3_focal

Vision Transformer (ViT) model for binary roadwork detection, trained for Natix Subnet 72.

Model Details

  • Base Model: google/vit-base-patch16-224-in21k
  • Architecture: ViT-Base (86M parameters)
  • Input Size: 224×224
  • Output Classes: 2 (None, Roadwork)

Performance

Metric          Value
Accuracy        0.9839
MCC             0.9443
F1 Score        0.9903
AUC             0.9950
Subnet Reward   0.8962
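
For readers unfamiliar with how accuracy, F1, and MCC relate to each other, the sketch below computes all three from a binary confusion matrix. The function name `binary_metrics` and the counts passed to it are hypothetical and purely illustrative; they are not this model's validation confusion matrix, which the card does not publish.

```python
import math

def binary_metrics(tp, fp, fn, tn):
    """Return accuracy, F1 and MCC for a binary confusion matrix."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    # MCC uses all four cells, so it stays informative under class imbalance.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"accuracy": accuracy, "f1": f1, "mcc": mcc}

# Illustrative counts only (assumption); not taken from the model's evaluation.
example = binary_metrics(tp=90, fp=5, fn=10, tn=95)
print(example)
```

Because MCC accounts for true negatives as well as true positives, it can sit noticeably below accuracy and F1 on imbalanced data, which is one plausible reading of the gap in the table above.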

Training Details

  • Training Samples: 34,968
  • Validation Samples: 4,418
  • Validator Augmentations: True
  • Label Smoothing: 0.05
  • Confidence Margin: 0.1
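
As a rough illustration of what a label smoothing value of 0.05 does to a two-class target, the sketch below blends the one-hot label with a uniform distribution and evaluates cross-entropy against it. This follows the common convention of spreading the smoothing mass uniformly over all classes; whether the training run used exactly this formulation is an assumption, and the helper names `smooth_targets` and `cross_entropy` are hypothetical. The card does not define "Confidence Margin", so it is not modelled here.

```python
import math

def smooth_targets(label, num_classes=2, epsilon=0.05):
    """Blend a one-hot target with a uniform distribution over all classes."""
    uniform = epsilon / num_classes
    return [(1.0 - epsilon) + uniform if i == label else uniform
            for i in range(num_classes)]

def cross_entropy(logits, targets):
    """Cross-entropy between softmax(logits) and a soft target distribution."""
    m = max(logits)  # subtract the max for numerical stability
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return -sum(t * (x - log_z) for x, t in zip(logits, targets))

# Target for a Roadwork image (class index 1): roughly [0.025, 0.975].
targets = smooth_targets(label=1)

# Example logits are made up for illustration.
logits = [-2.0, 3.0]
smoothed_loss = cross_entropy(logits, targets)
hard_loss = cross_entropy(logits, [0.0, 1.0])
print(targets, smoothed_loss, hard_loss)
```

With smoothing, a very confident correct prediction still incurs a small loss, which discourages the logits from growing without bound.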

Usage

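import torch
from transformers import AutoImageProcessor, AutoModelForImageClassification
from PIL import Image

processor = AutoImageProcessor.from_pretrained("infinite000/in-20001")
model = AutoModelForImageClassification.from_pretrained("infinite000/in-20001")

# Convert to RGB so grayscale or RGBA files match the 3-channel input the model expects.
image = Image.open("your_image.jpg").convert("RGB")
inputs = processor(image, return_tensors="pt")

# Gradients are not needed for inference.
with torch.no_grad():
    outputs = model(**inputs)

# Class order is (None, Roadwork), so index 1 is the Roadwork probability.
probs = outputs.logits.softmax(dim=1)
roadwork_prob = probs[0][1].item()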
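
To turn `roadwork_prob` into a class label, one option is a simple cutoff. The sketch below is a minimal example under two assumptions: the 0.5 threshold is a placeholder, since the card does not state an operating threshold, and the `LABELS` mapping is copied from the class order under Model Details (checking `model.config.id2label` on the loaded model is a safer source). The helper name `decide` is hypothetical.

```python
# Class order as listed under Model Details (assumption: index 1 is Roadwork).
LABELS = {0: "None", 1: "Roadwork"}

def decide(roadwork_prob, threshold=0.5):
    """Map the Roadwork probability to a class label using a cutoff."""
    if not 0.0 <= roadwork_prob <= 1.0:
        raise ValueError("roadwork_prob must be in [0, 1]")
    return LABELS[1] if roadwork_prob >= threshold else LABELS[0]

print(decide(0.93))  # Roadwork
print(decide(0.12))  # None
```

Raising the threshold trades recall for precision on the Roadwork class; the right value depends on how the predictions are scored downstream.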

Model Card

See model_card.json for detailed metadata.
