# YOLOX-S for US License Plate Detection
YOLOX-S fine-tuned for single-class license plate detection on ~20k US license plate images.
## Results
| Metric | Value |
|---|---|
| AP@50 | 98.9% |
| AP@50:95 | 90.1% |
| AP@75 | 97.6% |
| AR@50:95 | 92.5% |
| AP small | 75.7% |
| AP medium | 87.9% |
| AP large | 91.9% |
| Inference | ~4.2ms/image (FP16, fused) |
## Model Details
| Parameter | Value |
|---|---|
| Architecture | YOLOX-S (depth=0.33, width=0.50) |
| Classes | 1 (license_plate) |
| Input size | 640x640 |
| Pretrained from | COCO (yolox_s.pth) |
| Training epochs | 80 |
| Precision | FP16 |
## Training
See Autolane/alpr-finetuning for the full training pipeline.
W&B project: goautolane/alpr-yolox
### Hyperparameters
- Optimizer: SGD (momentum=0.9)
- Learning rate: 0.01/64 per image (effective LR scales linearly with batch size)
- Weight decay: 5e-4
- Warmup: 5 epochs
- No-augmentation phase: last 10 epochs (strong augmentations disabled)
- Augmentation: Mosaic, MixUp, HSV jitter, horizontal flip, rotation (±10°)
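The per-image learning rate above follows the usual YOLOX convention: the base rate is specified per image and multiplied by whatever batch size is actually used. A minimal sketch (batch sizes here are illustrative, not the values used in training):

```python
# YOLOX-style linear LR scaling: the base rate is per image,
# so the effective rate grows linearly with batch size.
BASIC_LR_PER_IMG = 0.01 / 64  # from the hyperparameters above

def effective_lr(batch_size: int) -> float:
    return BASIC_LR_PER_IMG * batch_size

# The reference batch size of 64 recovers the canonical 0.01.
print(effective_lr(64))  # 0.01
print(effective_lr(16))  # 0.0025
```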
## Quantized Models (QAT)
Quantization-Aware Training (QAT) fine-tunes the model with simulated int8 quantization noise so the weights become robust to it. Naive post-training quantization (PTQ) costs 7.8 points of AP@50:95; QAT recovers most of that gap.
| Metric | FP32 | int8 PTQ | Δ (pts) |
|---|---|---|---|
| AP@50:95 | 90.1% | 82.3% | -7.8 |
| AP@50 | 98.9% | 98.9% | 0.0 |
| AP@75 | 97.6% | 96.6% | -1.0 |
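The QAT flow described above can be sketched with PyTorch's eager-mode quantization API. This is not the actual training recipe for this model, just a minimal illustration on a toy conv block (the `ToyBlock` module, the `qnnpack` backend choice, and the dummy forward passes are all assumptions for the sketch):

```python
import torch
import torch.nn as nn
import torch.ao.quantization as tq

# Toy stand-in for a conv block of the detector (hypothetical, not the real model).
class ToyBlock(nn.Module):
    def __init__(self):
        super().__init__()
        self.quant = tq.QuantStub()      # fake-quantize the input during QAT
        self.conv = nn.Conv2d(3, 8, 3, padding=1)
        self.relu = nn.ReLU()
        self.dequant = tq.DeQuantStub()  # return float outputs
    def forward(self, x):
        return self.dequant(self.relu(self.conv(self.quant(x))))

model = ToyBlock().train()
model.qconfig = tq.get_default_qat_qconfig("qnnpack")
tq.prepare_qat(model, inplace=True)  # insert fake-quant observers

# Stand-in for the QAT fine-tuning epochs: forward passes expose the
# observers to int8 quantization noise so the weights can adapt to it.
for _ in range(3):
    model(torch.randn(2, 3, 32, 32))

model.eval()
quantized = tq.convert(model)  # real int8 weights from here on
```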
## NPU Deployment (NXP i.MX93)
The Vela-compiled TFLite model targets the Arm Ethos-U65-256 NPU on NXP i.MX93:
- NPU utilization: 96.2% (12 TRANSPOSE ops fall back to CPU)
- Inference time: ~120ms per image (Ethos-U65-256 @ 1GHz)
- Model size: 7.7MB (Vela) vs 35MB (ONNX FP32)
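For reference, a Vela invocation along these lines produces the compiled model (flags per the Vela CLI documentation; the output directory is an assumption, and Vela appends a `_vela` suffix to the output file name):

```shell
# Compile the int8 TFLite model for the Ethos-U65 (256 MAC) NPU.
# Ops Vela cannot map (e.g. the 12 TRANSPOSE ops above) stay on the CPU.
vela \
  --accelerator-config ethos-u65-256 \
  --output-dir ./vela_out \
  yolox_plates_s_int8.tflite
```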
## ONNX Runtime Inference
```python
import onnxruntime as ort
import numpy as np

session = ort.InferenceSession("yolox_plates_s.onnx")

# Input: [1, 3, 640, 640] float32, pixel range [0, 255] (no normalization)
input_name = session.get_inputs()[0].name
image_tensor = np.zeros((1, 3, 640, 640), dtype=np.float32)  # replace with a preprocessed image
output = session.run(None, {input_name: image_tensor})
```
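If the export leaves box decoding to the client (the YOLOX exporter's default), the raw output is an 8400×6 grid tensor (80×80 + 40×40 + 20×20 cells × one anchor, with 4 box terms + objectness + 1 class score). A NumPy decode sketch, assuming a 640×640 input and strides 8/16/32:

```python
import numpy as np

def decode_yolox(raw, img_size=640, strides=(8, 16, 32)):
    """Map raw YOLOX grid outputs [N, 6] -> (cx, cy, w, h, obj, cls) in pixels.

    Assumes decoding was NOT baked into the exported graph: the 8400 rows
    are the concatenated 80x80, 40x40 and 20x20 grids for a 640x640 input.
    """
    grids, expanded_strides = [], []
    for s in strides:
        n = img_size // s
        ys, xs = np.meshgrid(np.arange(n), np.arange(n), indexing="ij")
        grids.append(np.stack((xs, ys), axis=-1).reshape(-1, 2))
        expanded_strides.append(np.full((n * n, 1), s))
    grid = np.concatenate(grids, axis=0).astype(np.float32)
    stride = np.concatenate(expanded_strides, axis=0).astype(np.float32)

    out = raw.copy()
    out[..., :2] = (raw[..., :2] + grid) * stride   # box centers in pixels
    out[..., 2:4] = np.exp(raw[..., 2:4]) * stride  # box sizes in pixels
    return out

raw = np.zeros((8400, 6), dtype=np.float32)  # e.g. output[0][0] from the session above
decoded = decode_yolox(raw)
```

Objectness and class scores pass through unchanged; apply sigmoid/NMS as needed downstream.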
## Usage
```python
import torch
from yolox.exp import get_exp

# Load the experiment config that defines the architecture
exp = get_exp("yolox_plates_s.py", None)
model = exp.get_model()

# Load the fine-tuned weights
ckpt = torch.load("best_ckpt.pth", map_location="cpu")
model.load_state_dict(ckpt["model"])
model.eval()
```
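Inputs should be letterboxed to 640×640 the way YOLOX expects: resize preserving aspect ratio, place top-left, pad with 114, keep the [0, 255] pixel range. A NumPy-only sketch (nearest-neighbor resize stands in for the bilinear interpolation the real pipeline uses via OpenCV):

```python
import numpy as np

def preprocess(img, size=640):
    """Letterbox an HWC uint8 image to (size, size), YOLOX-style.

    Nearest-neighbor resize is a stand-in for cv2 bilinear; the 114
    padding value and top-left placement match the YOLOX convention.
    """
    h, w = img.shape[:2]
    r = min(size / h, size / w)
    nh, nw = int(h * r), int(w * r)
    ys = (np.arange(nh) / r).astype(int).clip(0, h - 1)
    xs = (np.arange(nw) / r).astype(int).clip(0, w - 1)
    resized = img[ys][:, xs]
    canvas = np.full((size, size, 3), 114, dtype=np.uint8)
    canvas[:nh, :nw] = resized
    # HWC uint8 -> [1, 3, H, W] float32, pixel range kept at [0, 255]
    tensor = canvas.transpose(2, 0, 1)[None].astype(np.float32)
    return tensor, r  # keep r to map boxes back to the original image

tensor, ratio = preprocess(np.zeros((320, 160, 3), dtype=np.uint8))
```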
## Files

| File | Size | Description |
|---|---|---|
| `best_ckpt.pth` | 69MB | FP32 PyTorch checkpoint (best validation AP) |
| `yolox_plates_s.py` | — | Experiment config (architecture + training hyperparameters) |
| `yolox_plates_s.onnx` | 35MB | FP32 ONNX export (for ONNX Runtime inference) |
| `yolox_plates_s_int8.tflite` | 8.9MB | int8 TFLite via QAT (for TFLite Runtime / CPU) |
| `yolox_plates_s_vela.tflite` | 7.7MB | Vela-compiled int8 TFLite (for Ethos-U65-256 NPU) |
## Dataset
Trained on ~20k US license plate images (80/20 train/val split, stratified by US State). See Autolane/us-license-plates.