WINK Streaming

WINK LPR: Costa Rica Plate OCR

98.4% plate accuracy · 4.4 MB ONNX · CTC architecture · Real-time inference


Overview

A high-accuracy OCR model for reading Costa Rica license plates, developed by WINK Streaming as part of a production LPR (License Plate Recognition) system.

The model reads cropped plate images and outputs the plate text. It handles both daytime color and nighttime IR/grayscale camera feeds.

Model Details

| Property | Value |
| --- | --- |
| Architecture | CNN (5 conv blocks) + BiLSTM (128 hidden × 2 layers) + CTC decoder |
| Parameters | 1,117,170 |
| Input | [batch, 96, 192, 1]: grayscale, NHWC, uint8 (0-255) |
| Output | [batch, 48, 38]: 48 CTC timesteps, 38 classes |
| Alphabet | 0-9, A-Z, _ (pad) + CTC blank (index 37) |
| ONNX size | 4.4 MB |
| ONNX opset | 18 (IR version 8) |
| Inference | ~5 ms CPU, ~2 ms GPU |

Performance

| Metric | Score |
| --- | --- |
| Plate accuracy (exact match) | 98.4% |
| Character accuracy | 97.8% |
| Validation set | 9,074 crops |
| Training set | 81,666 crops from 9,727 labeled detections |

Evaluated on a held-out 10% split of production data from Axis and Hikvision cameras (2304×1296, color by day, IR by night).
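
The card does not say how the two accuracy figures are computed. As a rough sketch, assuming plate accuracy is exact string match and character accuracy is one minus the normalized edit distance (a common definition, not confirmed by the source), the metrics could be reproduced like this. The function names are illustrative.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance between two strings (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete a character from a
                            curr[j - 1] + 1,      # insert a character into a
                            prev[j - 1] + cost))  # substitute (or keep)
        prev = curr
    return prev[-1]


def plate_accuracy(preds, truths):
    """Fraction of predictions that match their label exactly."""
    return sum(p == t for p, t in zip(preds, truths)) / len(truths)


def character_accuracy(preds, truths):
    """Assumed definition: 1 - total edit distance / total label characters."""
    errors = sum(levenshtein(p, t) for p, t in zip(preds, truths))
    total = sum(len(t) for t in truths)
    return 1.0 - errors / total
```

With predictions `["AAP096", "672625", "BBC123"]` against labels `["AAP096", "672625", "BBC128"]`, this gives a plate accuracy of 2/3 and a character accuracy of 17/18.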

Supported Plate Formats

  • Standard: 3 letters + 3 digits (e.g. AAP096)
  • Numeric: All digits (e.g. 672625)
  • Government: 2-3 letters + 2 digits
  • Motorcycle: All digits, 2-row format
  • Diplomat/Electric: Various formats
  • All 36 alphanumeric characters are valid (including O)
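
For post-processing, a decoded string can be checked against the formats listed above. The patterns below are a hypothetical sketch built only from that list; they are not official format definitions. Motorcycle plates cannot be told apart from numeric ones by text alone, and diplomat/electric plates fall through to "other".

```python
import re

# Hypothetical patterns derived from the format list above (assumption).
PLATE_PATTERNS = {
    "standard": re.compile(r"[A-Z]{3}[0-9]{3}"),      # 3 letters + 3 digits
    "government": re.compile(r"[A-Z]{2,3}[0-9]{2}"),  # 2-3 letters + 2 digits
    "numeric": re.compile(r"[0-9]+"),                 # all digits (incl. motorcycle)
}


def classify_plate(text: str) -> str:
    """Return the first format name whose pattern matches the whole string."""
    for name, pattern in PLATE_PATTERNS.items():
        if pattern.fullmatch(text):
            return name
    return "other"
```

A result of "other" does not mean the read is wrong; it only means the text does not fit one of the three patterns sketched here.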

Quick Start

import numpy as np
import cv2
import onnxruntime as ort

ALPHABET = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ_"
BLANK = 37  # CTC blank token

def load_and_preprocess(image_path, h=96, w=192):
    """Load image, convert to grayscale, resize with aspect-preserving padding."""
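    img = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
    if img is None:  # cv2.imread returns None instead of raising on a bad path
        raise FileNotFoundError(f"Could not read image: {image_path}")
    src_h, src_w = img.shape[:2]
    scale = min(h / src_h, w / src_w)
    # Clamp to at least 1 px so extreme aspect ratios do not produce an empty resize
    new_h, new_w = max(1, int(src_h * scale)), max(1, int(src_w * scale))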
    resized = cv2.resize(img, (new_w, new_h), interpolation=cv2.INTER_AREA)
    canvas = np.full((h, w), 128, dtype=np.uint8)
    y_off, x_off = (h - new_h) // 2, (w - new_w) // 2
    canvas[y_off:y_off+new_h, x_off:x_off+new_w] = resized
    return canvas

def ctc_decode(logits):
    """Greedy CTC decode: collapse repeats, remove blanks."""
    indices = np.argmax(logits, axis=-1)
    chars = []
    prev = -1
    for idx in indices:
        if idx != prev and idx != BLANK and idx < len(ALPHABET):
            chars.append(ALPHABET[idx])
        prev = idx
    return ''.join(chars)

def predict(session, image_path):
    """Run OCR on a plate crop image."""
    gray = load_and_preprocess(image_path)
    blob = gray.reshape(1, 96, 192, 1).astype(np.uint8)
    logits = session.run(None, {"input": blob})[0][0]  # [48, 38]
    return ctc_decode(logits)

# Usage
session = ort.InferenceSession("wink-lpr-ocr-cr.onnx")
plate_text = predict(session, "plate_crop.jpg")
print(f"Plate: {plate_text}")

Input Preprocessing

  1. Convert to grayscale (single channel)
  2. Resize with aspect-preserving padding to 96×192 (height × width)
  3. Pad with gray (value 128)
  4. Input as uint8 [batch, 96, 192, 1] in NHWC format. No normalization is needed; the model handles it internally
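
To make the resize-and-pad arithmetic in steps 2 and 3 concrete, here is a small sketch that computes only the geometry, mirroring the Quick Start code. The function name `letterbox_geometry` is illustrative and not part of the released code.

```python
def letterbox_geometry(src_h, src_w, dst_h=96, dst_w=192):
    """Return (new_h, new_w, y_off, x_off) for aspect-preserving padding."""
    # One scale factor for both axes, limited by the tighter dimension
    scale = min(dst_h / src_h, dst_w / src_w)
    new_h = max(1, int(src_h * scale))
    new_w = max(1, int(src_w * scale))
    # Center the resized crop; the rest of the canvas stays gray (128)
    y_off = (dst_h - new_h) // 2
    x_off = (dst_w - new_w) // 2
    return new_h, new_w, y_off, x_off
```

For a crop 100 px tall and 300 px wide, the scale is min(0.96, 0.64) = 0.64, so the crop becomes 64×192 and sits 16 rows below the top of the canvas: `letterbox_geometry(100, 300)` returns `(64, 192, 16, 0)`.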

Output Decoding

The model outputs logits of shape [batch, 48, 38] (48 CTC timesteps over 38 classes):

  • Indices 0-9: digits 0-9
  • Indices 10-35: letters A-Z
  • Index 36: pad/underscore _
  • Index 37: CTC blank token

Greedy CTC decode: Take argmax per timestep, collapse consecutive identical indices, remove blank tokens.
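
The same greedy decode can also return a rough confidence score, which helps when deciding whether to accept a read. This is a sketch under two assumptions: the output is raw (unnormalized) logits, and confidence is defined as the lowest per-character softmax probability. Neither the confidence definition nor the function name comes from the released code.

```python
import numpy as np

ALPHABET = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ_"
BLANK = 37  # CTC blank token


def decode_with_confidence(logits):
    """Greedy CTC decode of one [T, 38] array, plus a heuristic confidence."""
    logits = np.asarray(logits, dtype=np.float64)
    # Numerically stable softmax over the class axis (assumes raw logits)
    shifted = logits - logits.max(axis=-1, keepdims=True)
    probs = np.exp(shifted)
    probs /= probs.sum(axis=-1, keepdims=True)

    indices = probs.argmax(axis=-1)
    chars, char_probs = [], []
    prev = -1
    for t, idx in enumerate(indices):
        # Emit a character only when the index changes and is not blank
        if idx != prev and idx != BLANK:
            chars.append(ALPHABET[idx])
            char_probs.append(probs[t, idx])
        prev = idx

    # Heuristic: the weakest emitted character bounds the whole read
    confidence = float(min(char_probs)) if char_probs else 0.0
    return "".join(chars), confidence
```

An all-blank output decodes to an empty string with confidence 0.0, which matches the abstain behavior described for non-plate crops.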

Training

  • Architecture: CNN feature extractor (5 blocks: 1→48→96→128→128 channels) with max-pooling that preserves width for CTC alignment, followed by BiLSTM (128 hidden, 2 layers, bidirectional) and linear classifier
  • Loss: CTC loss + confusion-pair auxiliary penalty for commonly confused characters in IR mode (3/7, 8/0, 1/7, etc.)
  • Data: 9,727 human-verified labels from production deployment (dual cameras, day + night)
  • Augmentation: Light (±3.5° rotation, ±25 brightness, 0.85-1.15× contrast)
  • Training: 200 epochs, batch 64, AdamW with cosine LR schedule, early stopping (patience 40)
  • Hardware: NVIDIA RTX 3060 12GB
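
The brightness and contrast ranges in the augmentation bullet can be illustrated with a short NumPy sketch. It is not the training pipeline: it assumes contrast is scaled around mid-gray (128) and brightness is an additive shift, and it leaves out the ±3.5° rotation. The function name is illustrative.

```python
import numpy as np


def augment_brightness_contrast(img, rng):
    """Random brightness shift (±25) and contrast gain (0.85-1.15×) on a uint8 image."""
    shift = rng.uniform(-25.0, 25.0)
    gain = rng.uniform(0.85, 1.15)
    # Assumption: contrast pivots around mid-gray so the mean level stays stable
    out = (img.astype(np.float32) - 128.0) * gain + 128.0 + shift
    return np.clip(out, 0, 255).astype(np.uint8)


# Usage on a synthetic 96×192 grayscale crop
rng = np.random.default_rng(0)
crop = np.full((96, 192), 128, dtype=np.uint8)
augmented = augment_brightness_contrast(crop, rng)
```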

The model was trained from scratch on hard labels (no knowledge distillation). Training data includes both positive examples (verified plate crops) and negative examples (non-plate crops like logos, signs) to teach the model to abstain on non-plate input.

Deployment Notes

  • Compatible with ONNX Runtime (CPU and GPU providers)
  • NHWC input format, so no channel-first conversion is needed
  • Works with .NET OnnxRuntime (IR version 8 for compatibility)
  • Designed for cropped plate images; pair it with a plate detector (e.g. YOLO) for end-to-end ALPR
  • Handles both color (daytime) and IR/grayscale (nighttime) crops
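
When the same code must run on both CPU and GPU hosts, the execution provider list can be chosen at startup. The helper below is a hypothetical sketch: `pick_providers` is not part of the release, and the commented usage relies on ONNX Runtime's `get_available_providers()` and the `providers` argument of `InferenceSession`.

```python
def pick_providers(available):
    """Prefer CUDA when it is available, always keeping CPU as the fallback."""
    preferred = ["CUDAExecutionProvider", "CPUExecutionProvider"]
    chosen = [p for p in preferred if p in available]
    return chosen or ["CPUExecutionProvider"]


# Usage (requires onnxruntime and the model file):
# import onnxruntime as ort
# session = ort.InferenceSession(
#     "wink-lpr-ocr-cr.onnx",
#     providers=pick_providers(ort.get_available_providers()),
# )
```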

Limitations

  • Trained specifically on Costa Rica plates; accuracy on plates from other countries will likely be lower
  • Best performance on crops from fixed surveillance cameras (similar to training distribution)
  • Character confusion pairs in IR/night mode: 3↔7, 8↔0, 1↔7, 0↔D, 5↔S
  • Requires a separate plate detection model to crop plates from full frames
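
One way a downstream system might tolerate the confusion pairs listed above is to expand a night-time read into its plausible alternatives before comparing it with known plates. This is a hypothetical sketch, not part of the model: it treats each pair as symmetric and caps the number of variants.

```python
from itertools import product

# Confusion pairs from the list above, treated as symmetric (assumption)
CONFUSION_PAIRS = [("3", "7"), ("8", "0"), ("1", "7"), ("0", "D"), ("5", "S")]


def build_confusions(pairs):
    """Map each character to the set of characters it may be confused with."""
    table = {}
    for a, b in pairs:
        table.setdefault(a, {a}).add(b)
        table.setdefault(b, {b}).add(a)
    return table


CONFUSIONS = build_confusions(CONFUSION_PAIRS)


def plate_variants(text, max_variants=256):
    """List strings reachable by swapping confusable characters, up to a cap."""
    options = [sorted(CONFUSIONS.get(ch, {ch})) for ch in text]
    variants = []
    for combo in product(*options):
        variants.append("".join(combo))
        if len(variants) >= max_variants:
            break
    return variants
```

For example, `plate_variants("AB3")` returns `["AB3", "AB7"]`. The original read is always among the variants as long as the cap is not hit first.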

Citation

If you use this model, please cite:

@misc{wink-lpr-ocr-cr-2026,
  title={WINK LPR OCR: Costa Rica License Plate Recognition},
  author={WINK Streaming},
  year={2026},
  url={https://www.wink.co},
  note={98.4\% plate accuracy CTC-CRNN model}
}

License

CC-BY-4.0: free for commercial and non-commercial use with attribution.

About WINK Streaming

WINK Streaming builds intelligent video infrastructure, from camera ingestion and AI-powered analytics to archival and playback. This model is part of our production LPR system.


Built by WINK Streaming