Helmet v5: Indian CCTV Helmet + ANPR pipeline

End-to-end pipeline for detecting motorcycle riders without helmets and reading their license plates from Andhra Pradesh RTGS CCTV feeds. Built to match or beat the Videonetics commercial ANPR system on the same footage.

What's in this repo

| Component | File | Purpose |
| --- | --- | --- |
| Motorcycle + person detector | models/yolo11l_cctv_ft.pt | YOLO11l fine-tuned on 627 pseudo-labeled CCTV frames. mAP50 = 0.979, mAP50-95 = 0.917 |
| Head helmet classifier | models/helmet_head_v2.pt | EfficientNet-B0 on driver head crops, 2-class. Val F1 = 0.864 |
| Plate detector | models/plate_yolo_l.pt, plate_yolo_ft.pt | YOLO11l stock + fine-tuned for Indian plate bboxes |
| Plate OCR | (TrOCR v1/v2/v3 on HF model repo; not bundled, train via tools/train_plate_ocr_v3.py) | |
| Pose model | yolo11l-pose.pt | Stock Ultralytics pose; localizes driver head keypoints |

Pipeline (tools/analyze_tripwire.py)

  1. The fine-tuned YOLO11l detector finds motorcycle and person boxes in each frame
  2. Assign the nearest person to each motorcycle as its driver (single rider per bike; pillions are ignored)
  3. YOLO11l-pose finds driver head keypoints (nose/eyes/ears) → aspect-preserved 224×224 crop
  4. EfficientNet-B0 classifier: helmet vs no_helmet (multi-frame voting, MIN_VOTES=2)
  5. Plate detector + TrOCR on the plate crop
  6. postprocess_dedup.py: plate-string dedup + time-window dedup for no-plate events
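
Steps 2 and 4 can be illustrated with a small sketch. This is not the code in tools/analyze_tripwire.py. It assumes that "nearest" means Euclidean distance between box centres, and that MIN_VOTES is the minimum number of frames classified as no_helmet before a track is flagged. The function names assign_drivers and vote_helmet are hypothetical.

```python
from collections import Counter
from math import hypot

MIN_VOTES = 2  # value from the pipeline description; its exact use here is an assumption


def box_center(box):
    """Return the (x, y) centre of an (x1, y1, x2, y2) box."""
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2.0, (y1 + y2) / 2.0)


def assign_drivers(motorcycles, persons):
    """Map each motorcycle index to the index of its nearest person, or None.

    Assumption: distance is measured between box centres. Only one person
    is kept per motorcycle, so pillion riders are ignored.
    """
    drivers = {}
    for m_idx, m_box in enumerate(motorcycles):
        if not persons:
            drivers[m_idx] = None
            continue
        mx, my = box_center(m_box)

        def distance(p_idx):
            px, py = box_center(persons[p_idx])
            return hypot(px - mx, py - my)

        drivers[m_idx] = min(range(len(persons)), key=distance)
    return drivers


def vote_helmet(frame_labels, min_votes=MIN_VOTES):
    """Aggregate per-frame labels for one track into a single decision.

    Assumption: the track is flagged 'no_helmet' only if at least
    min_votes frames were classified that way; otherwise 'helmet'.
    """
    counts = Counter(frame_labels)
    return "no_helmet" if counts["no_helmet"] >= min_votes else "helmet"


if __name__ == "__main__":
    bikes = [(100, 100, 200, 200)]
    people = [(400, 400, 450, 500), (120, 60, 180, 180)]
    print(assign_drivers(bikes, people))
    print(vote_helmet(["helmet", "no_helmet", "no_helmet"]))
```

Requiring more than one no_helmet frame trades a little recall for fewer false alarms caused by a single blurred or occluded head crop.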

Training scripts (reproducible)

  • tools/extract_ft_frames.py: clip-level train/val split, variance-filter corrupted HEVC frames
  • tools/autolabel_frames.py: YOLO11l + YOLO11x ensemble pseudo-labeling, cross-model NMS @ conf≥0.7
  • tools/train_yolo_ft.py: 50 epochs, imgsz=1280, cosine LR, close_mosaic=10, patience=10
  • tools/build_head_dataset.py: pose-extracted head crops, source-level split by eventId
  • tools/train_head_classifier.py: EfficientNet-B0 + PadResize, class-weighted CE, F1 early-stop
  • tools/train_plate_ocr_v3.py: TrOCR fine-tune on Indian plate regex ^[A-Z]{2}\d{2}[A-Z]{1,3}\d{3,4}$
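
The plate regex listed for train_plate_ocr_v3.py can also be used to sanity-check OCR output. The sketch below is a minimal illustration, not code from the repo. The pattern is copied from the list above, while the normalization step (uppercasing and stripping spaces and punctuation) and the helper names are assumptions.

```python
import re

# Pattern taken verbatim from the training-script description.
PLATE_RE = re.compile(r"^[A-Z]{2}\d{2}[A-Z]{1,3}\d{3,4}$")


def normalize_plate(text):
    """Uppercase and drop everything except A-Z and 0-9 (hypothetical cleanup)."""
    return re.sub(r"[^A-Z0-9]", "", text.upper())


def is_valid_plate(text):
    """Return True if the normalized string matches the Indian plate pattern."""
    return PLATE_RE.fullmatch(normalize_plate(text)) is not None


if __name__ == "__main__":
    for candidate in ["AP 39 AB 1234", "ap39a123", "AP39AB12", "AP3AB1234"]:
        print(candidate, is_valid_plate(candidate))
```

The pattern reads as a two-letter state code, a two-digit district code, a one-to-three-letter series, and a three-or-four-digit number.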

Results on ch284_20260421_120655 (15-min clip)

Reference: Videonetics reported 77 no-helmet events on this clip. v5 matches that count after dedup.
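
Because the comparison depends on dedup, here is a minimal sketch of the two rules named for postprocess_dedup.py: one event per plate string, plus a time window for events with no plate. It is not the repo's implementation. The event fields t and plate, the function name dedup_events, and the 5-second window are all hypothetical.

```python
def dedup_events(events, window_s=5.0):
    """Collapse repeated detections of the same violation.

    events: list of dicts with 't' (seconds from clip start) and
            'plate' (string, or None when no plate was read).
    window_s: hypothetical time window for events without a plate.

    Rules (assumed from the README description):
      - events with a plate: keep only the first event per plate string
      - events without a plate: keep an event only if it is at least
        window_s seconds after the previously kept no-plate event
    """
    kept = []
    seen_plates = set()
    last_no_plate_t = None
    for ev in sorted(events, key=lambda e: e["t"]):
        plate = ev.get("plate")
        if plate:
            if plate in seen_plates:
                continue
            seen_plates.add(plate)
            kept.append(ev)
        else:
            if last_no_plate_t is not None and ev["t"] - last_no_plate_t < window_s:
                continue
            last_no_plate_t = ev["t"]
            kept.append(ev)
    return kept


if __name__ == "__main__":
    raw = [
        {"t": 0.0, "plate": "AP39AB1234"},
        {"t": 1.0, "plate": "AP39AB1234"},
        {"t": 2.0, "plate": None},
        {"t": 4.0, "plate": None},
        {"t": 9.0, "plate": None},
    ]
    print(len(dedup_events(raw)))
```

The window length directly changes the final event count, so it would need to be tuned against the reference footage rather than taken from this sketch.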

Status

Last trained 2026-04-22. The YOLO detector reached mAP50 = 0.979 at epoch 50; the head classifier early-stopped at epoch 7 (F1 = 0.864).

Not included: data/ (30GB raw RTGS videos, proprietary), TrOCR checkpoints (~7GB; train via tools/train_plate_ocr_v3.py).

Datasets: vivekvar/cctv-datasets.
