	---
	license: other
	license_link: LICENSE
	library_name: openvino
	pipeline_tag: object-detection
	tags:
	- openvino
	- intel
	- yolov8
	- paddleocr
	- license-plate-recognition
	- ocr
	- edge-ai
	- metro
	- dlstreamer
	language:
	- en
	---

	# License Plate Recognition

	| Property | Value |
	|---|---|
	| Category | Object Detection + Optical Character Recognition |
	| Source Framework | PyTorch (Ultralytics YOLOv8), PaddlePaddle (PP-OCRv4) |
	| Supported Precisions | FP32 |
	| Inference Engine | OpenVINO |
	| Hardware | CPU, GPU, NPU |

	---

	## Overview

	License Plate Recognition (LPR) is a Metro Analytics use case that locates vehicle license plates in a video stream and reads their alphanumeric content.
	The pipeline composes two specialized models:

	- License Plate Detector -- [`yolov8_license_plate_detector`](https://github.com/open-edge-platform/edge-ai-resources), a YOLOv8 model fine-tuned to localize license plates as oriented bounding boxes.
	- OCR Recognizer -- [`ch_PP-OCRv4_rec_infer`](https://github.com/PaddlePaddle/PaddleOCR), the PaddleOCR PP-OCRv4 multilingual text recognizer that converts each cropped plate into a text string.

	Typical Metro deployments include:

	- Tolling and Access Control -- read plates at gates, depots, and parking entries.
	- Vehicle Search and Forensics -- index plates seen at a station for investigative lookup.
	- Fleet and Bus Monitoring -- correlate detected plates with operational schedules.

	The detector returns one bounding box per plate; the OCR stage runs as a downstream classifier on the cropped region, attaching the recognized string as inference metadata on the frame.

	> Note: Plate detector accuracy depends on the regional distribution of training data.
	> The bundled detector was trained primarily on European and US plates.
	> For other regions, fine-tune the YOLOv8 detector on a representative dataset.

	---

	## Prerequisites

	- Python 3.11+
	- [Install OpenVINO](https://docs.openvino.ai/2026/get-started/install-openvino.html) (latest version)
	- [Install Intel DLStreamer](https://docs.openedgeplatform.intel.com/2026.0/edge-ai-libraries/dlstreamer/get_started/install/install_guide_ubuntu.html)

	Create and activate a Python virtual environment before running the scripts:

	```bash
	python3 -m venv .venv --system-site-packages
	source .venv/bin/activate
	```

	> Note: The `--system-site-packages` flag is required so the virtual
	> environment can access the system-installed OpenVINO and DLStreamer Python
	> packages.

	Activate the OpenVINO and DLStreamer runtimes in the same shell.
	The DLStreamer Python module is not on `sys.path` by default, so export
	`PYTHONPATH` as well:

	```bash
	source /opt/intel/openvino_2026/setupvars.sh
	source /opt/intel/dlstreamer/scripts/setup_dls_env.sh
	export PYTHONPATH=/opt/intel/dlstreamer/python:\
	/opt/intel/dlstreamer/gstreamer/lib/python3/dist-packages:${PYTHONPATH:-}
	```
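
Before running the sample, it can help to confirm that the activated shell really exposes the Python modules the sample imports.
The following is a minimal sketch, not part of the shipped scripts. It assumes the module names `gi`, `gstgva`, and `openvino`; the helper name `check_modules` is hypothetical.

```python
import importlib.util


def check_modules(names):
    """Return {module_name: bool} telling whether each module can be located."""
    found = {}
    for name in names:
        try:
            found[name] = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            # find_spec can raise for broken packages; treat those as missing.
            found[name] = False
    return found


if __name__ == "__main__":
    # Module names are assumptions: gi (PyGObject), gstgva (DLStreamer), openvino.
    for name, ok in check_modules(["gi", "gstgva", "openvino"]).items():
        print(f"{name}: {'found' if ok else 'MISSING'}")
```

If `gstgva` is reported as missing, re-check the `PYTHONPATH` export above.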

	---

	## Getting Started

	### Download the Models and Sample Video

	Run the provided script to download the license plate detector and OCR
	recognizer OpenVINO IR models and the sample test video:

	```bash
	chmod +x export_and_quantize.sh
	./export_and_quantize.sh
	```

	The script performs the following steps:

	1. Downloads the sample test video (`ParkingVideo.mp4`) from the Intel Edge AI Resources project into the current directory.
	2. Downloads the `license-plate-reader` archive from the Intel Edge AI Resources project and extracts it under `./models/yolov8_license_plate_detector/license-plate-reader/`.
	   The archive bundles both the YOLOv8 plate detector (`models/yolov8n/yolov8n_retrained.xml`, FP32) and the converted PaddleOCR recognizer (`models/ch_PP-OCRv4_rec_infer/ch_PP-OCRv4_rec_infer.xml`, FP32), so no separate OCR download step is required.

	Both IRs are used as-is at FP32 -- no quantization step is performed.
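
To check that the download finished, the paths listed above can be verified from Python.
This is a minimal sketch: the helper name `missing_artifacts` is hypothetical, and it assumes each `.xml` file has its `.bin` weights file next to it under the same base name, which is the usual OpenVINO IR layout but is not stated by the script.

```python
from pathlib import Path

# Relative IR paths as listed in the steps above.
ARCHIVE_ROOT = Path("models/yolov8_license_plate_detector/license-plate-reader")
EXPECTED_IRS = [
    ARCHIVE_ROOT / "models/yolov8n/yolov8n_retrained.xml",
    ARCHIVE_ROOT / "models/ch_PP-OCRv4_rec_infer/ch_PP-OCRv4_rec_infer.xml",
]


def missing_artifacts(base_dir="."):
    """Return the expected files that are absent under base_dir."""
    base = Path(base_dir)
    expected = [base / "ParkingVideo.mp4"]
    for xml in EXPECTED_IRS:
        # Assumption: the .bin weights file sits beside the .xml topology file.
        expected += [base / xml, base / xml.with_suffix(".bin")]
    return [path for path in expected if not path.is_file()]
```

Calling `missing_artifacts()` from the directory where the script ran returns an empty list when the video and both IRs are present.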

	### Locating the OCR Recognizer

	The PaddleOCR recognizer ships inside the same archive:

	```text
	./models/yolov8_license_plate_detector/license-plate-reader/models/ch_PP-OCRv4_rec_infer/ch_PP-OCRv4_rec_infer.xml
	```

	> Note: PaddleOCR PP-OCRv4 is a CTC sequence model. DLStreamer 2026.0+
	> auto-derives the CTC decoder for the bundled `ch_PP-OCRv4_rec_infer` IR
	> and exposes the decoded plate string as `tensor.label()` on each
	> classified ROI -- no external `model-proc` is required for this sample.
	> For other PaddleOCR variants or non-Latin character sets, supply a custom
	> `model-proc` (see
	> [DLStreamer model_proc reference](https://docs.openedgeplatform.intel.com/2026.0/edge-ai-libraries/dlstreamer/dev_guide/model_proc_file.html))
	> with the matching character dictionary.
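
For readers unfamiliar with CTC, the decoding step mentioned in the note can be illustrated with greedy decoding: take the best class index at each time step, collapse consecutive repeats, then drop the blank symbol.
The sketch below is illustrative only and is not DLStreamer's implementation. It assumes the blank symbol is index 0 and uses a toy character set, not the real PP-OCRv4 dictionary.

```python
BLANK = 0  # Assumption: index 0 is the CTC blank symbol.
TOY_CHARSET = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"  # Hypothetical dictionary.


def greedy_ctc_decode(step_indices, charset=TOY_CHARSET):
    """Collapse repeated indices, then drop blanks (greedy CTC decoding).

    step_indices: best class index per time step (argmax over the model output).
    charset: characters for indices 1..len(charset); index 0 is the blank.
    """
    chars = []
    previous = None
    for idx in step_indices:
        # Emit a character only when the index changes and is not the blank.
        if idx != previous and idx != BLANK:
            chars.append(charset[idx - 1])
        previous = idx
    return "".join(chars)


print(greedy_ctc_decode([10, 10, 0, 23, 28, 28, 0, 23, 7, 0, 3, 5, 5]))  # 9MRM624
```

A blank between two identical indices is what lets a genuinely doubled character survive the collapse step. When every time step decodes to blank, the result is an empty string.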

	### DLStreamer Sample

	The sample below builds the two-stage detection plus OCR pipeline using the
	Python GStreamer bindings.
	It mirrors the structure of the upstream
	[DLStreamer `license_plate_recognition.sh`](https://github.com/open-edge-platform/dlstreamer/blob/main/samples/gstreamer/gst_launch/license_plate_recognition/license_plate_recognition.sh)
	sample: `decodebin3 ! queue ! gvadetect ! queue ! videoconvert ! gvaclassify ! queue ! gvawatermark ! ...`.
	The `gvadetect` element runs the license plate detector;
	`gvaclassify` then runs the PaddleOCR recognizer on each detected plate region.
	A buffer probe extracts the recognized text from the inference metadata
	attached to each frame.
	The input is `ParkingVideo.mp4`, the short parking-lot clip downloaded by
	`export_and_quantize.sh` into the current directory.
	The annotated stream is encoded as H.264 with `openh264enc` and muxed into `output_dlstreamer.mp4`.

	```python
	import os

	import gi

	gi.require_version("Gst", "1.0")
	gi.require_version("GstVideo", "1.0")
	from gi.repository import Gst
	from gstgva import VideoFrame

	Gst.init(None)

	# Model and video locations produced by export_and_quantize.sh.
	MODELS_DIR = os.path.abspath("./models/yolov8_license_plate_detector")
	DETECTOR_XML = (
	    f"{MODELS_DIR}/license-plate-reader/models/"
	    "yolov8n/yolov8n_retrained.xml"
	)
	OCR_XML = (
	    f"{MODELS_DIR}/license-plate-reader/models/"
	    "ch_PP-OCRv4_rec_infer/ch_PP-OCRv4_rec_infer.xml"
	)
	INPUT_VIDEO = "ParkingVideo.mp4"
	DEVICE = "GPU"

	# Decode -> detect plates -> OCR each plate -> draw overlays -> encode to MP4.
	pipeline_str = (
	    f"filesrc location={INPUT_VIDEO} ! decodebin3 ! "
	    f"videoconvert ! queue ! "
	    f"gvadetect model={DETECTOR_XML} device={DEVICE} ! queue ! "
	    f"videoconvert ! "
	    f"gvaclassify model={OCR_XML} device={DEVICE} ! queue ! "
	    f"gvawatermark ! videoconvert ! video/x-raw,format=I420 ! "
	    f"openh264enc ! h264parse ! "
	    f"mp4mux ! filesink name=sink location=output_dlstreamer.mp4"
	)

	pipeline = Gst.parse_launch(pipeline_str)


	def on_buffer(pad, info):
	    """Print the recognized text for each plate region attached to the buffer."""
	    buf = info.get_buffer()
	    caps = pad.get_current_caps()
	    frame = VideoFrame(buf, caps=caps)
	    for region in frame.regions():
	        rect = region.rect()
	        text = ""
	        for tensor in region.tensors():
	            # Skip the detector's own tensor; only classifier tensors carry text.
	            if tensor.is_detection():
	                continue
	            try:
	                text = tensor.label() or ""
	            except RuntimeError:
	                # This tensor has no label; try the next one.
	                continue
	            if text:
	                break
	        # Regions with an empty decode are not reported.
	        if text:
	            print(f"Plate: {text} bbox=({rect.x},{rect.y})", flush=True)
	    return Gst.PadProbeReturn.OK


	sink = pipeline.get_by_name("sink")
	sink_pad = sink.get_static_pad("sink")
	sink_pad.add_probe(Gst.PadProbeType.BUFFER, on_buffer)

	pipeline.set_state(Gst.State.PLAYING)
	bus = pipeline.get_bus()
	# Block until the stream ends or the pipeline reports an error.
	msg = bus.timed_pop_filtered(
	    Gst.CLOCK_TIME_NONE,
	    Gst.MessageType.EOS | Gst.MessageType.ERROR,
	)
	if msg is not None and msg.type == Gst.MessageType.ERROR:
	    err, debug = msg.parse_error()
	    print(f"Pipeline error: {err.message}", flush=True)
	pipeline.set_state(Gst.State.NULL)
	```

	To run on CPU, change `DEVICE = "GPU"` to `DEVICE = "CPU"`.
	To run on NPU, change it to `DEVICE = "NPU"` and set `batch-size=1` and
	`nireq=4` on the inference elements in `pipeline_str` for best utilization.
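
One way to keep the device-specific settings in one place is to build the pipeline string from a small helper.
The sketch below is an illustration, not part of the sample: `inference_props` and `build_pipeline` are hypothetical names, and applying `batch-size` and `nireq` to both `gvadetect` and `gvaclassify` is an assumption.

```python
def inference_props(device):
    """Return the device-related properties for an inference element."""
    props = f"device={device}"
    if device == "NPU":
        # Values suggested above for NPU; applied to both stages by assumption.
        props += " batch-size=1 nireq=4"
    return props


def build_pipeline(detector_xml, ocr_xml, input_video, device="GPU"):
    """Build the same detect + OCR pipeline string as the sample above."""
    props = inference_props(device)
    return (
        f"filesrc location={input_video} ! decodebin3 ! "
        "videoconvert ! queue ! "
        f"gvadetect model={detector_xml} {props} ! queue ! "
        "videoconvert ! "
        f"gvaclassify model={ocr_xml} {props} ! queue ! "
        "gvawatermark ! videoconvert ! video/x-raw,format=I420 ! "
        "openh264enc ! h264parse ! "
        "mp4mux ! filesink name=sink location=output_dlstreamer.mp4"
    )
```

The returned string can be passed to `Gst.parse_launch` in place of `pipeline_str`, for example `build_pipeline(DETECTOR_XML, OCR_XML, INPUT_VIDEO, "NPU")`.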

	### Try It on a Sample Video

	`export_and_quantize.sh` already downloaded `ParkingVideo.mp4` into the
	current directory, so the sample is ready to run.
	Execute the DLStreamer sample above.
	The annotated video is saved to `output_dlstreamer.mp4` with green bounding boxes drawn
	by `gvawatermark` around every detected plate.
	The buffer probe prints one line per frame for each detected plate that the
	OCR stage successfully decodes (`bbox` is the top-left corner of the plate
	region in pixels), for example:

	```text
	Plate: 9MRM624 bbox=(979,458)
	```

	Low-confidence ROIs (small, blurred, or partially occluded plates) may yield
	an empty CTC decode and are filtered out by the probe.

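	If you only need the structured output and not the annotated video, replace `filesink name=sink location=output_dlstreamer.mp4` with `fakesink name=sink` in `pipeline_str` (`fakesink` has no `location` property, and the probe still looks up the element by the name `sink`), then redirect the console output to a file.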
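
Because the probe prints one line per plate per frame, the same plate appears many times in the saved console output, and an occasional misread shows up as a low-count variant.
The following is a minimal post-processing sketch that counts how often each string was reported. The helper name `count_plates` and the log file name `plates.log` are hypothetical; the regular expression assumes the exact print format used by the probe above.

```python
import re
from collections import Counter

# Matches the probe's print format: "Plate: <text> bbox=(<x>,<y>)".
LINE_RE = re.compile(r"^Plate: (?P<text>.+) bbox=\((?P<x>-?\d+),(?P<y>-?\d+)\)$")


def count_plates(lines):
    """Count how many times each plate string was reported."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match:  # Ignore GStreamer or encoder log lines mixed into the output.
            counts[match.group("text")] += 1
    return counts
```

For example, `count_plates(open("plates.log")).most_common(5)` lists the five most frequently reported strings with their counts.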

	> Known warning: The `openh264enc` element prints
	> `[OpenH264] this = 0x..., Error:CWelsH264SVCEncoder::EncodeFrame(), cmInitParaError.`
	> on the first frame. This message comes from the OpenH264 library's
	> internal logging and is benign: the output video is encoded correctly.

	#### Expected Output

	![DLStreamer expected output](expected_output_dlstreamer.gif)

	---

	## License

	Copyright (C) Intel Corporation. All rights reserved.
	Licensed under the MIT License. See [LICENSE](LICENSE) for details.

	## References

	- [DLStreamer License Plate Recognition Sample](https://github.com/open-edge-platform/dlstreamer/tree/main/samples/gstreamer/gst_launch/license_plate_recognition)
	- [Intel Edge AI Resources -- License Plate Reader Model](https://github.com/open-edge-platform/edge-ai-resources)
	- [PaddleOCR PP-OCRv4](https://github.com/PaddlePaddle/PaddleOCR)
	- [Ultralytics YOLOv8 Documentation](https://docs.ultralytics.com/models/yolov8/)
	- [OpenVINO Documentation](https://docs.openvino.ai/)
	- [Intel DLStreamer](https://docs.openedgeplatform.intel.com/2026.0/edge-ai-libraries/dlstreamer/index.html)