YOLOX-S for US License Plate Detection

YOLOX-S fine-tuned for single-class license plate detection on ~20k US license plate images.

Results

| Metric | Value |
|--------|-------|
| AP@50 | 98.9% |
| AP@50:95 | 90.1% |
| AP@75 | 97.6% |
| AR@50:95 | 92.5% |
| AP (small) | 75.7% |
| AP (medium) | 87.9% |
| AP (large) | 91.9% |
| Inference | ~4.2 ms/image (FP16, fused) |

Model Details

| Parameter | Value |
|-----------|-------|
| Architecture | YOLOX-S (depth=0.33, width=0.50) |
| Classes | 1 (`license_plate`) |
| Input size | 640×640 |
| Pretrained from | COCO (`yolox_s.pth`) |
| Training epochs | 80 |
| Precision | FP16 |

Training

See Autolane/alpr-finetuning for the full training pipeline.

W&B project: goautolane/alpr-yolox

Hyperparameters

  • Optimizer: SGD (momentum=0.9)
  • Learning rate: 0.01/64 per image
  • Weight decay: 5e-4
  • Warmup: 5 epochs
  • No-augmentation: last 10 epochs
  • Augmentation: Mosaic, Mixup, HSV jitter, horizontal flip, rotation (±10°)
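The learning rate above is specified per image, following YOLOX's linear-scaling convention: the actual peak learning rate is the per-image rate multiplied by the total batch size. A minimal sketch (the batch sizes shown are illustrative, not the values used in training):

```python
# YOLOX-style linear LR scaling: the base LR is given per image
# (0.01 / 64) and multiplied by the total batch size.
BASE_LR_PER_IMAGE = 0.01 / 64

def effective_lr(batch_size: int) -> float:
    """Return the warmed-up peak learning rate for a given total batch size."""
    return BASE_LR_PER_IMAGE * batch_size

# A batch size of 64 recovers the canonical YOLOX base LR of 0.01;
# smaller batches scale the LR down proportionally.
```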

Quantized Models (QAT)

Quantization-Aware Training (QAT) fine-tunes the model with simulated int8 quantization noise in the forward pass, so the weights learn to be robust to rounding. Post-training quantization (PTQ) alone costs 7.8 points of AP@50:95; QAT recovers most of that gap.
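To illustrate the "simulated int8 noise", here is a minimal fake-quantization op in NumPy. This is a simplified sketch of what QAT inserts into the forward pass, not this repo's actual QAT setup; real frameworks also learn the scales, quantize per channel, and backpropagate through the rounding with a straight-through estimator.

```python
import numpy as np

def fake_quantize(w: np.ndarray, num_bits: int = 8) -> np.ndarray:
    """Simulate int8 quantization noise: quantize to integers, then dequantize.

    During QAT this op runs in the forward pass so the network learns
    weights that survive the rounding to the int8 grid.
    """
    qmax = 2 ** (num_bits - 1) - 1                    # 127 for int8
    scale = np.abs(w).max() / qmax if w.size else 1.0
    q = np.clip(np.round(w / scale), -qmax - 1, qmax)  # snap to integer grid
    return q * scale                                   # back to float

w = np.array([0.5, -0.31, 0.127], dtype=np.float32)
w_q = fake_quantize(w)  # close to w, but snapped to the int8 grid
```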

| Metric | FP32 | int8 PTQ | Delta (points) |
|--------|------|----------|----------------|
| AP@50:95 | 0.901 | 0.823 | -7.8 |
| AP@50 | 0.989 | 0.989 | 0.0 |
| AP@75 | 0.976 | 0.966 | -1.0 |

NPU Deployment (NXP i.MX93)

The Vela-compiled TFLite model targets the Arm Ethos-U65-256 NPU on NXP i.MX93:

  • NPU utilization: 96.2% (12 TRANSPOSE ops fall back to CPU)
  • Inference time: ~120ms per image (Ethos-U65-256 @ 1GHz)
  • Model size: 7.7MB (Vela) vs 35MB (ONNX FP32)
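To sanity-check the int8 model on a host CPU before NPU deployment, the plain QAT TFLite file can be run with the TFLite interpreter (the Vela variant only runs on the Ethos-U NPU). The input-quantization helper below is a sketch; the actual input layout and scale/zero-point come from the model's own input details:

```python
import numpy as np

def quantize_input(image: np.ndarray, scale: float, zero_point: int) -> np.ndarray:
    """Map a float32 image onto the model's int8 input grid."""
    return np.clip(np.round(image / scale) + zero_point, -128, 127).astype(np.int8)

# Typical TFLite Runtime flow (requires the .tflite file; sketch only):
# from tflite_runtime.interpreter import Interpreter
# interpreter = Interpreter(model_path="yolox_plates_s_int8.tflite")
# interpreter.allocate_tensors()
# inp = interpreter.get_input_details()[0]
# scale, zero_point = inp["quantization"]
# interpreter.set_tensor(inp["index"], quantize_input(image, scale, zero_point))
# interpreter.invoke()
# raw = interpreter.get_tensor(interpreter.get_output_details()[0]["index"])
```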

ONNX Runtime Inference

import onnxruntime as ort
import numpy as np

session = ort.InferenceSession("yolox_plates_s.onnx")

# Input: [1, 3, 640, 640] float32, pixel range [0, 255] (no mean/std normalization)
image_tensor = np.zeros((1, 3, 640, 640), dtype=np.float32)  # replace with a preprocessed image
input_name = session.get_inputs()[0].name
output = session.run(None, {input_name: image_tensor})
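If the export did not bake decoding into the graph (YOLOX's `decode_in_inference=False`), the raw head output is in grid units and still needs decoding (plus score thresholding and NMS). A sketch of the standard YOLOX grid/stride decode under that assumption; if the export decodes internally, the boxes are already in pixels and this step is unnecessary:

```python
import numpy as np

def decode_yolox(raw: np.ndarray, img_size: int = 640,
                 strides=(8, 16, 32)) -> np.ndarray:
    """Decode raw YOLOX output [N, 5+num_classes] to pixel-space boxes.

    Columns 0-3 are (cx, cy, w, h); the remaining columns
    (objectness, class scores) are left untouched.
    """
    grids, expanded_strides = [], []
    for s in strides:
        g = img_size // s
        ys, xs = np.meshgrid(np.arange(g), np.arange(g), indexing="ij")
        grids.append(np.stack((xs, ys), axis=-1).reshape(-1, 2))
        expanded_strides.append(np.full((g * g, 1), s))
    grid = np.concatenate(grids, axis=0)
    stride = np.concatenate(expanded_strides, axis=0)

    out = raw.copy()
    out[..., :2] = (raw[..., :2] + grid) * stride   # center x, y in pixels
    out[..., 2:4] = np.exp(raw[..., 2:4]) * stride  # width, height in pixels
    return out
```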

Usage

import torch
from yolox.exp import get_exp

# Load experiment config
exp = get_exp("yolox_plates_s.py", None)
model = exp.get_model()

# Load weights
ckpt = torch.load("best_ckpt.pth", map_location="cpu")
model.load_state_dict(ckpt["model"])
model.eval()

Files

| File | Size | Description |
|------|------|-------------|
| `best_ckpt.pth` | 69 MB | FP32 PyTorch checkpoint (best validation AP) |
| `yolox_plates_s.py` | — | Experiment config (architecture + training hyperparameters) |
| `yolox_plates_s.onnx` | 35 MB | FP32 ONNX export (for ONNX Runtime inference) |
| `yolox_plates_s_int8.tflite` | 8.9 MB | int8 TFLite via QAT (for TFLite Runtime / CPU) |
| `yolox_plates_s_vela.tflite` | 7.7 MB | Vela-compiled int8 TFLite (for Ethos-U65-256 NPU) |

Dataset

Trained on ~20k US license plate images (80/20 train/val split, stratified by US State). See Autolane/us-license-plates.
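A state-stratified 80/20 split like the one described can be sketched as follows; the grouping key and procedure are illustrative assumptions, not the actual pipeline (see the linked repo for that):

```python
import random
from collections import defaultdict

def stratified_split(items, key, val_frac=0.2, seed=0):
    """80/20 split that preserves each stratum's (e.g. US state's) ratio."""
    rng = random.Random(seed)
    by_key = defaultdict(list)
    for item in items:
        by_key[key(item)].append(item)
    train, val = [], []
    for group in by_key.values():
        rng.shuffle(group)
        n_val = round(len(group) * val_frac)  # per-stratum validation count
        val.extend(group[:n_val])
        train.extend(group[n_val:])
    return train, val
```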
