CCT-S for US License Plate OCR

Compact Convolutional Transformer (CCT-S) fine-tuned for US license plate character recognition with fast-plate-ocr. Reads US plates of up to 8 characters from cropped 140x70 grayscale images.

Results

| Metric                | Value |
|-----------------------|-------|
| Plate accuracy        | 83.6% |
| Character accuracy    | 95.2% |
| Plate length accuracy | 95.0% |
| Top-3 accuracy        | 97.8% |

Model Details

| Parameter       | Value |
|-----------------|-------|
| Architecture    | CCT-S (6 transformer layers, 2 heads, dim 128) |
| Tokenizer       | 4 Conv2D (48→80→96→128), MaxBlurPooling2D, GELU |
| Input           | 140x70 grayscale |
| Max plate slots | 8 |
| Alphabet        | `0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ_` (37 chars, `_` = pad) |
| Normalization   | DyT (Dynamic Tanh) |
| Pretrained from | Global model (Argentinian plates) |
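
The table above implies a per-slot classification head: 8 slots, each a distribution over the 37-character alphabet, with `_` as padding for shorter plates. A minimal decoding sketch under that assumption (the function name and output shape are illustrative, not taken from the fast-plate-ocr internals):

```python
import numpy as np

ALPHABET = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ_"
PAD = "_"

def decode_plates(probs: np.ndarray) -> list[str]:
    """Argmax each of the 8 slots, then strip the '_' padding.

    probs: per-slot probabilities of shape [batch, 8, 37].
    """
    indices = probs.argmax(axis=-1)  # [batch, 8]
    return [
        "".join(ALPHABET[i] for i in row).replace(PAD, "")
        for row in indices
    ]

# Example: a batch of one plate, with all probability mass on "ABC1234_"
probs = np.zeros((1, 8, 37))
for slot, ch in enumerate("ABC1234_"):
    probs[0, slot, ALPHABET.index(ch)] = 1.0
print(decode_plates(probs))  # ['ABC1234']
```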

Training

See Autolane/alpr-finetuning for the full training pipeline.

W&B project: goautolane/alpr-ocr

Hyperparameters

  • Optimizer: AdamW with EMA
  • Learning rate: 5e-4 (cosine decay, 5% warmup)
  • Weight decay: 5e-4
  • Batch size: 64
  • Max epochs: 150
  • Early stopping: patience 50 on val_plate_acc
  • Precision: mixed_float16
  • Augmentation: Albumentations (Affine, BrightnessContrast, blur, noise, dropout)

Usage

```python
from fast_plate_ocr import ONNXPlateRecognizer

# Using ONNX runtime (recommended for inference)
ocr = ONNXPlateRecognizer("us-plate-model")
plates = ocr.run(["plate_crop.jpg"])
```

Or with Keras directly:

```python
import keras

model = keras.models.load_model("best_checkpoint.keras")
# Preprocess: resize to 140x70, grayscale, normalize to [0, 1]
predictions = model.predict(preprocessed_images)
```
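
A minimal sketch of the preprocessing step mentioned in the comment above, assuming the common channels-last layout (height 70, width 140, 1 channel); the resize itself is omitted, so apply e.g. `cv2.resize(img, (140, 70))` first:

```python
import numpy as np

def preprocess(gray_crop: np.ndarray) -> np.ndarray:
    """gray_crop: uint8 array of shape (70, 140). Returns (1, 70, 140, 1)."""
    x = gray_crop.astype("float32") / 255.0   # normalize to [0, 1]
    return x[np.newaxis, ..., np.newaxis]     # add batch and channel dims

crop = np.random.randint(0, 256, (70, 140), dtype=np.uint8)
batch = preprocess(crop)
print(batch.shape)  # (1, 70, 140, 1)
```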

Files

  • best_checkpoint.keras – Best checkpoint (83.6% plate accuracy, epoch 21)
  • model_config.yaml – CCT-S architecture config
  • plate_config.yaml – Plate dimensions, alphabet, preprocessing

Dataset

Trained on ~20k US license plate crops (80/20 train/val split, stratified by US state). See Autolane/us-license-plates.
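
Stratifying by state keeps each state's share of plates roughly equal in the train and val sets. An illustrative sketch of that split in plain Python (field names are hypothetical; the real pipeline lives in Autolane/alpr-finetuning):

```python
import random
from collections import defaultdict

def stratified_split(samples, key, val_frac=0.2, seed=42):
    """Shuffle within each group, then take val_frac of every group for val."""
    by_group = defaultdict(list)
    for s in samples:
        by_group[key(s)].append(s)
    rng = random.Random(seed)
    train, val = [], []
    for group in by_group.values():
        rng.shuffle(group)
        n_val = round(len(group) * val_frac)
        val.extend(group[:n_val])
        train.extend(group[n_val:])
    return train, val

samples = [{"path": f"crop_{i}.jpg", "state": st}
           for st in ("CA", "TX", "NY") for i in range(10)]
train, val = stratified_split(samples, key=lambda s: s["state"])
print(len(train), len(val))  # 24 6
```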
