Intel
/

license-plate-recognition

@@ -1,14 +1,12 @@
-# License Plate Recognition -- Detection and OCR on Intel Hardware
-> **Reference pipeline:** [DLStreamer License Plate Recognition sample](https://github.com/open-edge-platform/dlstreamer/tree/main/samples/gstreamer/gst_launch/license_plate_recognition)
->
-> **Validated with:** OpenVINO 2026.0.0, NNCF 3.0.0, DLStreamer 2025.2, Python 3.11+
 | Property | Value |
 |---|---|
 | **Category** | Object Detection + Optical Character Recognition |
 | **Source Framework** | PyTorch (Ultralytics YOLOv8), PaddlePaddle (PP-OCRv4) |
-| **Supported Precisions** | FP32, FP16-INT8 (detector) |
 | **Inference Engine** | OpenVINO |
 | **Hardware** | CPU, GPU, NPU |
@@ -32,43 +30,55 @@ The detector returns one bounding box per plate; the OCR stage runs as a downstr
 > **Note:** Plate detector accuracy depends on the regional distribution of training data.
 > The bundled detector was trained primarily on European and US plates.
-> For other regions, fine-tune the YOLOv8 detector on a representative dataset before quantization.
 ---
 ## Prerequisites
 - [Install OpenVINO 2026.0.0](https://docs.openvino.ai/2026/get-started/install-openvino.html)
-- [Install Intel DLStreamer](https://dlstreamer.github.io/get_started/install/install-guide-ubuntu.html)
 ---
 ## Getting Started
-### Download and Quantize the Detector
-Run the provided script to download the license plate detector OpenVINO IR and quantize it to INT8:
 ```bash
 chmod +x export_and_quantize.sh
-./export_and_quantize.sh ./models
 ```
 The script performs the following steps:
-1. Installs dependencies (`openvino`, `nncf`).
 2. Downloads the `license-plate-reader` archive from the Intel Edge AI Resources project and extracts it under `./models/yolov8_license_plate_detector/license-plate-reader/`.
-   The archive bundles both the YOLOv8 plate detector (`models/yolov8n/yolov8n_retrained.xml`) and the converted PaddleOCR recognizer (`models/ch_PP-OCRv4_rec_infer/ch_PP-OCRv4_rec_infer.xml`), so no separate OCR download step is required.
-3. Quantizes the detector to INT8 using NNCF post-training quantization, producing `./models/yolov8_license_plate_detector/yolov8_license_plate_detector_int8.xml`.
-4. Runs `benchmark_app` to validate detector throughput.
-> **Note:** For production accuracy, replace the random calibration tensors in
-> `export_and_quantize.sh` with a representative sample of frames from the
-> target deployment site.
-> The INT8 detector produced from random calibration in the bundled script may
-> miss small or low-contrast plates; if you need maximum recall before tuning
-> calibration, point the pipeline at the FP32 IR
-> (`models/yolov8_license_plate_detector/license-plate-reader/models/yolov8n/yolov8n_retrained.xml`).
 ### Locating the OCR Recognizer
@@ -78,51 +88,63 @@ The PaddleOCR recognizer ships inside the same archive:
 ./models/yolov8_license_plate_detector/license-plate-reader/models/ch_PP-OCRv4_rec_infer/ch_PP-OCRv4_rec_infer.xml
 ```
-> **Note:** PaddleOCR PP-OCRv4 is a CTC sequence model.
-> To convert its raw tensor output into a recognized plate string, DLStreamer's
-> `gvaclassify` element requires a `model-proc` JSON with a CTC decoder
-> converter and a character labels file.
-> Neither file is bundled with the archive nor with the DLStreamer 2026.0.0
-> sample model_procs.
-> Without it the pipeline runs end-to-end and produces per-plate ROI metadata,
-> but the OCR `label` field on each detected plate is an empty string.
-> For a production deployment, supply your own `model-proc` (see
-> [DLStreamer model_proc reference](https://dlstreamer.github.io/dev_guide/model_proc_file.html))
-> with the PaddleOCR character dictionary; until then, treat the OCR stage as
-> a placeholder.
 ### DLStreamer Sample
-The sample below builds the two-stage detection plus OCR pipeline using the Python GStreamer bindings.
-The `gvadetect` element runs the license plate detector; `gvaclassify` then runs the PaddleOCR recognizer on each detected plate region.
-A buffer probe extracts the recognized text from the `GstGVAJSONMeta` payload attached to each frame.
 ```python
-import json
 import os
 import gi
 gi.require_version("Gst", "1.0")
 from gi.repository import Gst
 Gst.init(None)
 MODELS_DIR = os.path.abspath("./models/yolov8_license_plate_detector")
-DETECTOR_XML = f"{MODELS_DIR}/yolov8_license_plate_detector_int8.xml"
 OCR_XML = (
     f"{MODELS_DIR}/license-plate-reader/models/"
     "ch_PP-OCRv4_rec_infer/ch_PP-OCRv4_rec_infer.xml"
 )
-INPUT_VIDEO = "test_video.mp4"
 pipeline_str = (
-    f"filesrc location={INPUT_VIDEO} ! decodebin3 ! videoconvert ! "
-    f"video/x-raw,format=BGR ! "
-    f"gvadetect model={DETECTOR_XML} device=CPU threshold=0.5 ! queue ! "
-    f"gvaclassify model={OCR_XML} device=CPU inference-region=roi-list ! "
-    f"queue ! gvametaconvert format=json add-tensor-data=false ! "
-    f"gvawatermark ! videoconvert ! autovideosink name=sink"
 )
 pipeline = Gst.parse_launch(pipeline_str)
@@ -130,26 +152,22 @@ pipeline = Gst.parse_launch(pipeline_str)
 def on_buffer(pad, info):
     buf = info.get_buffer()
-    meta_iter = buf.iterate_meta()
-    while True:
-        ok, meta = meta_iter.next()
-        if not ok:
-            break
-        if meta.__gtype__.name != "GstGVAJSONMetaAPI":
-            continue
-        try:
-            payload = json.loads(meta.get_message())
-        except (AttributeError, ValueError):
-            continue
-        for obj in payload.get("objects", []):
-            label = obj.get("detection", {}).get("label", "")
-            text = ""
-            for tensor in obj.get("tensors", []):
-                if tensor.get("layer_name") and "label" in tensor:
-                    text = tensor["label"]
-                    break
-            if label and text:
-                print(f"Plate: {text}  bbox={obj.get('x')},{obj.get('y')}")
     return Gst.PadProbeReturn.OK
@@ -166,50 +184,30 @@ bus.timed_pop_filtered(
 pipeline.set_state(Gst.State.NULL)
 ```
-To run on integrated GPU, change both `device=CPU` properties to `device=GPU` and prepend `vapostproc` after `decodebin3` for zero-copy color conversion.
 ### Try It on a Sample Video
-Download a publicly hosted Intel sample clip that contains vehicles with visible license plates:
-```bash
-wget -O test_video.mp4 \
-  https://github.com/intel-iot-devkit/sample-videos/raw/master/car-detection.mp4
-```
-Run the DLStreamer sample above.
-A window opened by `autovideosink` shows each decoded frame with a green bounding box drawn by `gvawatermark` around every detected plate.
 The buffer probe prints one line per detected plate per frame.
-> **Note:** The INT8 detector built by `export_and_quantize.sh` with random
-> calibration tensors typically detects only one or two plates across this
-> short clip at the documented `threshold=0.5`.
-> For a richer demo run, swap `DETECTOR_XML` to the bundled FP32 IR and lower
-> the threshold:
->
-> ```python
-> DETECTOR_XML = (
->     f"{MODELS_DIR}/license-plate-reader/models/yolov8n/"
->     "yolov8n_retrained.xml"
-> )
-> ```
->
-> and change `threshold=0.5` to `threshold=0.3` in `pipeline_str`.
-Without a custom `model-proc` for PP-OCRv4 (see the OCR note above), the recognized `text` field is empty even though the detector and the OCR network both run on every plate ROI:
 ```text
-Plate:   bbox=395,373
-Plate:   bbox=520,419
-```
-Once you supply a CTC model-proc and PaddleOCR character labels, the same lines will include the decoded plate string, for example:
-```text
-Plate: ABC1234  bbox=812,442
-Plate: ZN98YX   bbox=305,388
 ```
 If you only need the structured output and not the live preview, replace `autovideosink` with `fakesink` in `pipeline_str` and pipe the console output to a file.
 ---
@@ -226,5 +224,4 @@ Licensed under the MIT License. See [LICENSE](LICENSE) for details.
 - [PaddleOCR PP-OCRv4](https://github.com/PaddlePaddle/PaddleOCR)
 - [Ultralytics YOLOv8 Documentation](https://docs.ultralytics.com/models/yolov8/)
 - [OpenVINO Documentation](https://docs.openvino.ai/)
-- [NNCF Post-Training Quantization](https://docs.openvino.ai/latest/nncf_ptq_introduction.html)
-- [Intel DLStreamer](https://dlstreamer.github.io/)

+# License Plate Recognition
+> **Validated with:** OpenVINO 2026.0.0, DLStreamer 2026.0, Python 3.11+
 | Property | Value |
 |---|---|
 | **Category** | Object Detection + Optical Character Recognition |
 | **Source Framework** | PyTorch (Ultralytics YOLOv8), PaddlePaddle (PP-OCRv4) |
+| **Supported Precisions** | FP32 |
 | **Inference Engine** | OpenVINO |
 | **Hardware** | CPU, GPU, NPU |
 > **Note:** Plate detector accuracy depends on the regional distribution of training data.
 > The bundled detector was trained primarily on European and US plates.
+> For other regions, fine-tune the YOLOv8 detector on a representative dataset.
 ---
 ## Prerequisites
+- Python 3.11+
 - [Install OpenVINO 2026.0.0](https://docs.openvino.ai/2026/get-started/install-openvino.html)
+- [Install Intel DLStreamer](https://docs.openedgeplatform.intel.com/2026.0/edge-ai-libraries/dlstreamer/get_started/install/install_guide_ubuntu.html)
+Create and activate a Python virtual environment before running the scripts:
+```bash
+python3 -m venv .venv --system-site-packages
+source .venv/bin/activate
+```
+Activate the OpenVINO and DLStreamer runtimes in the same shell.
+The DLStreamer Python module is not on `sys.path` by default, so export
+`PYTHONPATH` as well:
+```bash
+source /opt/intel/openvino_2026/setupvars.sh
+source /opt/intel/dlstreamer/scripts/setup_dls_env.sh
+export PYTHONPATH=/opt/intel/dlstreamer/python:\
+/opt/intel/dlstreamer/gstreamer/lib/python3/dist-packages:${PYTHONPATH:-}
+```
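Before launching the sample, it can help to confirm that the Python bindings are actually visible from the activated shell. The following is a minimal sketch, not part of the repository; the helper name `missing_modules` is hypothetical, and it only checks that the modules can be located, not that they load correctly.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that Python cannot locate."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # "gi" is PyGObject (GStreamer bindings); "gstgva" is the DLStreamer
    # Python module that the sample imports VideoFrame from.
    missing = missing_modules(["gi", "gstgva"])
    if missing:
        print("Not importable:", ", ".join(missing))
    else:
        print("gi and gstgva are on sys.path.")
```

If `gstgva` is reported as missing, the `PYTHONPATH` export above is the first thing to re-check.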
 ---
 ## Getting Started
+### Download the Models and Sample Video
+Run the provided script to download the license plate detector and OCR
+recognizer OpenVINO IR models and the sample test video:
 ```bash
 chmod +x export_and_quantize.sh
+./export_and_quantize.sh
 ```
 The script performs the following steps:
+1. Downloads the sample test video (`ParkingVideo.mp4`) from the Intel Edge AI Resources project into the current directory.
 2. Downloads the `license-plate-reader` archive from the Intel Edge AI Resources project and extracts it under `./models/yolov8_license_plate_detector/license-plate-reader/`.
+   The archive bundles both the YOLOv8 plate detector (`models/yolov8n/yolov8n_retrained.xml`, FP32) and the converted PaddleOCR recognizer (`models/ch_PP-OCRv4_rec_infer/ch_PP-OCRv4_rec_infer.xml`, FP32), so no separate OCR download step is required.
+Both IRs are used as-is at FP32 -- no quantization step is performed.
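The script's final lookup step can also be reproduced from Python, for example as a pre-flight check in your own tooling. This is a hedged sketch: `find_ir` and `check_models` are hypothetical helpers, the directory layout is taken from the steps above, and the `.bin` check assumes the usual OpenVINO IR convention of a weights file next to each `.xml`.

```python
from pathlib import Path


def find_ir(root, name_fragment):
    """Return the first .xml under root whose path contains name_fragment, or None."""
    matches = sorted(
        p for p in Path(root).rglob("*.xml") if name_fragment in p.as_posix()
    )
    return matches[0] if matches else None


def check_models(models_dir="./models"):
    """Locate the detector and OCR IRs and verify each has a .bin beside it."""
    lp_dir = Path(models_dir) / "yolov8_license_plate_detector"
    found = {
        "detector": find_ir(lp_dir, "yolov8n"),
        "ocr": find_ir(lp_dir, "ch_PP-OCRv4_rec_infer"),
    }
    for role, xml in found.items():
        # An IR is a pair: the .xml topology plus a .bin weights file.
        if xml is None or not xml.with_suffix(".bin").is_file():
            raise FileNotFoundError(
                f"{role} IR is missing or incomplete under {lp_dir}"
            )
    return found
```

Calling `check_models("./models")` after the script finishes should return both paths; a `FileNotFoundError` suggests the archive was not extracted where the README expects it.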
 ### Locating the OCR Recognizer
 ./models/yolov8_license_plate_detector/license-plate-reader/models/ch_PP-OCRv4_rec_infer/ch_PP-OCRv4_rec_infer.xml
 ```
+> **Note:** PaddleOCR PP-OCRv4 is a CTC sequence model. DLStreamer 2026.0+
+> auto-derives the CTC decoder for the bundled `ch_PP-OCRv4_rec_infer` IR
+> and exposes the decoded plate string as `tensor.label()` on each
+> classified ROI -- no external `model-proc` is required for this sample.
+> For other PaddleOCR variants or non-Latin character sets, supply a custom
+> `model-proc` (see
+> [DLStreamer model_proc reference](https://docs.openedgeplatform.intel.com/2026.0/edge-ai-libraries/dlstreamer/dev_guide/model_proc_file.html))
+> with the matching character dictionary.
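For readers unfamiliar with CTC output, the sketch below illustrates what a greedy CTC decoder does conceptually: take the best class per time step, collapse consecutive repeats, and drop the blank symbol. It is an illustration only, not DLStreamer's implementation. The blank index of 0 and the three-character `charset` are assumptions chosen for the example.

```python
def ctc_greedy_decode(step_scores, charset, blank=0):
    """Greedy CTC decode: argmax per step, collapse repeats, drop blanks.

    step_scores: list of per-time-step score lists (one score per class).
    charset: characters for class indices 1..N (index 0 is assumed blank).
    """
    best = [max(range(len(row)), key=row.__getitem__) for row in step_scores]
    chars = []
    prev = None
    for idx in best:
        # Emit a character only when the class changes and is not blank.
        if idx != prev and idx != blank:
            chars.append(charset[idx - 1])
        prev = idx
    return "".join(chars)


def one_hot(index, size):
    """Build a toy score row with all mass on one class."""
    return [1.0 if i == index else 0.0 for i in range(size)]


# Hypothetical 3-character dictionary; class 0 is the CTC blank.
charset = "AB1"
steps = [one_hot(i, 4) for i in [0, 1, 1, 0, 1, 2, 3, 3]]
decoded = ctc_greedy_decode(steps, charset)
print(decoded)
```

The blank between the two runs of class 1 is what allows a genuinely doubled character to survive the repeat-collapsing step.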
 ### DLStreamer Sample
+The sample below builds the two-stage detection plus OCR pipeline using the
+Python GStreamer bindings.
+It mirrors the structure of the upstream
+[DLStreamer `license_plate_recognition.sh`](https://github.com/open-edge-platform/dlstreamer/blob/main/samples/gstreamer/gst_launch/license_plate_recognition/license_plate_recognition.sh)
+sample: `decodebin3 ! queue ! gvadetect ! queue ! videoconvert ! gvaclassify ! queue ! gvawatermark ! ...`.
+The `gvadetect` element runs the license plate detector;
+`gvaclassify` then runs the PaddleOCR recognizer on each detected plate region.
+A buffer probe extracts the recognized text from the inference metadata
+attached to each frame.
+The input is `ParkingVideo.mp4`, the short parking-lot clip downloaded by
+`export_and_quantize.sh` into the current directory.
+The annotated stream is encoded as H.264 (OpenH264) and muxed into `output.mp4`.
 ```python
 import os
 import gi
 gi.require_version("Gst", "1.0")
+gi.require_version("GstVideo", "1.0")
 from gi.repository import Gst
+from gstgva import VideoFrame
 Gst.init(None)
 MODELS_DIR = os.path.abspath("./models/yolov8_license_plate_detector")
+DETECTOR_XML = (
+    f"{MODELS_DIR}/license-plate-reader/models/"
+    "yolov8n/yolov8n_retrained.xml"
+)
 OCR_XML = (
     f"{MODELS_DIR}/license-plate-reader/models/"
     "ch_PP-OCRv4_rec_infer/ch_PP-OCRv4_rec_infer.xml"
 )
+INPUT_VIDEO = "ParkingVideo.mp4"
+DEVICE = "CPU"
+PREPROC = "pre-process-backend=opencv"
 pipeline_str = (
+    f"filesrc location={INPUT_VIDEO} ! decodebin3 ! queue ! "
+    f"gvadetect model={DETECTOR_XML} device={DEVICE} {PREPROC} ! queue ! "
+    f"videoconvert ! "
+    f"gvaclassify model={OCR_XML} device={DEVICE} {PREPROC} ! queue ! "
+    f"gvawatermark ! videoconvert ! video/x-raw,format=I420 ! "
+    f"openh264enc ! h264parse ! "
+    f"mp4mux ! filesink name=sink location=output.mp4"
 )
 pipeline = Gst.parse_launch(pipeline_str)
 def on_buffer(pad, info):
     buf = info.get_buffer()
+    caps = pad.get_current_caps()
+    frame = VideoFrame(buf, caps=caps)
+    for region in frame.regions():
+        rect = region.rect()
+        text = ""
+        for tensor in region.tensors():
+            if tensor.is_detection():
+                continue
+            try:
+                text = tensor.label() or ""
+            except RuntimeError:
+                continue
+            if text:
+                break
+        if text:
+            print(f"Plate: {text}  bbox=({rect.x},{rect.y})", flush=True)
     return Gst.PadProbeReturn.OK
 pipeline.set_state(Gst.State.NULL)
 ```
+To run on integrated GPU, change `DEVICE = "CPU"` to `DEVICE = "GPU"` and
+switch `PREPROC` to `"pre-process-backend=va-surface-sharing"`, matching the
+upstream sample.
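Since the device and the pre-process backend have to change together, one option is to derive both from a single argument. The sketch below is one possible arrangement, assuming only the two pairings named in this README; `PREPROC_BY_DEVICE` and `build_pipeline` are hypothetical names, and no backend is assumed for NPU because none is documented here.

```python
# Device -> pre-process backend pairings taken from this README.
PREPROC_BY_DEVICE = {
    "CPU": "opencv",
    "GPU": "va-surface-sharing",
}


def build_pipeline(device, detector_xml, ocr_xml,
                   input_video="ParkingVideo.mp4", output="output.mp4"):
    """Return the gst-launch style pipeline string for the given device."""
    if device not in PREPROC_BY_DEVICE:
        raise ValueError(
            f"no pre-process backend documented for device {device!r}"
        )
    infer = f"device={device} pre-process-backend={PREPROC_BY_DEVICE[device]}"
    return (
        f"filesrc location={input_video} ! decodebin3 ! queue ! "
        f"gvadetect model={detector_xml} {infer} ! queue ! "
        "videoconvert ! "
        f"gvaclassify model={ocr_xml} {infer} ! queue ! "
        "gvawatermark ! videoconvert ! video/x-raw,format=I420 ! "
        "openh264enc ! h264parse ! "
        f"mp4mux ! filesink name=sink location={output}"
    )
```

The returned string can be passed to `Gst.parse_launch` in place of `pipeline_str`, for example `build_pipeline("GPU", DETECTOR_XML, OCR_XML)`.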
 ### Try It on a Sample Video
+`export_and_quantize.sh` already downloaded `ParkingVideo.mp4` into the
+current directory, so the sample is ready to run.
+Execute the DLStreamer sample above.
+The annotated video is saved to `output.mp4` with green bounding boxes drawn
+by `gvawatermark` around every detected plate.
 The buffer probe prints one line per detected plate per frame.
+Only plates that the OCR stage successfully decodes are printed, for example:
 ```text
+Plate: 9MRM624  bbox=(979,458)
 ```
+Low-confidence ROIs (small, blurred, or partially occluded plates) may yield
+an empty CTC decode and are filtered out by the probe.
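Because the same plate is printed once per frame, the raw console output is repetitive. A small post-processing step can tally how often each string was read, which is also a cheap way to spot one-off misreads. The sketch below assumes the exact `Plate: <text>  bbox=(x,y)` format shown above and plate strings without internal spaces; `count_plates` is a hypothetical helper.

```python
import re
from collections import Counter

# Matches lines such as: "Plate: 9MRM624  bbox=(979,458)"
PLATE_LINE = re.compile(
    r"^Plate: (?P<text>\S+)\s+bbox=\((?P<x>-?\d+),(?P<y>-?\d+)\)$"
)


def count_plates(lines):
    """Count how many frames each decoded plate string appeared in."""
    counts = Counter()
    for line in lines:
        match = PLATE_LINE.match(line.strip())
        if match:  # silently skip GStreamer log noise and other lines
            counts[match.group("text")] += 1
    return counts
```

For example, after redirecting the sample's output to a hypothetical `plates.log`, `count_plates(open("plates.log"))` returns a `Counter` whose `most_common()` lists the most frequently read strings first.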
 If you only need the console output and not the annotated video, replace the `gvawatermark ! ... ! filesink name=sink location=output.mp4` tail of `pipeline_str` with `fakesink name=sink` and redirect the console output to a file.
 ---
 - [PaddleOCR PP-OCRv4](https://github.com/PaddlePaddle/PaddleOCR)
 - [Ultralytics YOLOv8 Documentation](https://docs.ultralytics.com/models/yolov8/)
 - [OpenVINO Documentation](https://docs.openvino.ai/)
+- [Intel DLStreamer](https://docs.openedgeplatform.intel.com/2026.0/edge-ai-libraries/dlstreamer/index.html)

export_and_quantize.sh CHANGED

@@ -2,8 +2,8 @@
 # SPDX-License-Identifier: MIT
 # Copyright (C) Intel Corporation
 #
-# Download the YOLOv8 license plate detector, quantize it to INT8 with NNCF,
-# and stage the PaddleOCR PP-OCRv4 recognizer for use with Intel DLStreamer.
 # Usage: ./export_and_quantize.sh [MODELS_DIR]
 # Example: ./export_and_quantize.sh ./models
@@ -12,12 +12,17 @@ set -euo pipefail
 MODELS_DIR="${1:-./models}"
 LP_DETECTOR_NAME="yolov8_license_plate_detector"
 LP_DETECTOR_URL="https://github.com/open-edge-platform/edge-ai-resources/raw/main/models/license-plate-reader.zip"
-OCR_NAME="ch_PP-OCRv4_rec_infer"
 mkdir -p "${MODELS_DIR}"
-echo "--- Installing dependencies ---"
-pip install -qU "openvino>=2026.0.0" "nncf>=3.0.0"
 echo "--- Downloading ${LP_DETECTOR_NAME} (OpenVINO IR) ---"
 LP_DIR="${MODELS_DIR}/${LP_DETECTOR_NAME}"
@@ -37,53 +42,14 @@ if [[ -z "${LP_XML}" ]]; then
 fi
 echo "Found detector model: ${LP_XML}"
-echo "--- Quantizing license plate detector to INT8 with NNCF ---"
-LP_INT8_XML="${LP_DIR}/${LP_DETECTOR_NAME}_int8.xml"
-python3 - <<PY
-import nncf
-import numpy as np
-import openvino as ov
-core = ov.Core()
-model = core.read_model("${LP_XML}")
-input_shape = model.inputs[0].partial_shape
-h = int(input_shape[2].get_length()) if input_shape[2].is_static else 640
-w = int(input_shape[3].get_length()) if input_shape[3].is_static else 640
-def transform_fn(_):
-    return np.random.rand(1, 3, h, w).astype(np.float32)
-calibration_dataset = nncf.Dataset(list(range(300)), transform_fn)
-quantized = nncf.quantize(
-    model,
-    calibration_dataset,
-    preset=nncf.QuantizationPreset.MIXED,
-    subset_size=300,
-)
-ov.save_model(quantized, "${LP_INT8_XML}")
-print("Quantization complete: ${LP_INT8_XML}")
-PY
-echo "--- Staging OCR model (${OCR_NAME}) ---"
-OCR_DIR="${MODELS_DIR}/${OCR_NAME}"
-if [[ -f "${OCR_DIR}/${OCR_NAME}.xml" ]]; then
-    echo "OCR model already present at ${OCR_DIR}"
-else
-    cat <<EOM
-The PaddleOCR PP-OCRv4 recognizer requires Paddle to OpenVINO IR conversion.
-Use the official Intel DLStreamer downloader to fetch and convert it:
-    export MODELS_PATH="\$(pwd)/${MODELS_DIR}"
-    /opt/intel/dlstreamer/samples/download_public_models.sh ${OCR_NAME}
-The converted model will be placed under \${MODELS_PATH}/public/${OCR_NAME}/.
-EOM
 fi
-echo "--- Benchmarking license plate detector ---"
-benchmark_app -m "${LP_INT8_XML}" -d CPU -niter 50 -api async
 echo "--- Done ---"

 # SPDX-License-Identifier: MIT
 # Copyright (C) Intel Corporation
 #
+# Download the YOLOv8 license plate detector and PaddleOCR PP-OCRv4
+# recognizer (both as OpenVINO IR) for use with Intel DLStreamer.
 # Usage: ./export_and_quantize.sh [MODELS_DIR]
 # Example: ./export_and_quantize.sh ./models
 MODELS_DIR="${1:-./models}"
 LP_DETECTOR_NAME="yolov8_license_plate_detector"
 LP_DETECTOR_URL="https://github.com/open-edge-platform/edge-ai-resources/raw/main/models/license-plate-reader.zip"
 mkdir -p "${MODELS_DIR}"
+echo "--- Downloading sample test video ---"
+SAMPLE_VIDEO_URL="https://github.com/open-edge-platform/edge-ai-resources/raw/main/videos/ParkingVideo.mp4"
+if [[ ! -f ParkingVideo.mp4 ]]; then
+    curl -fsSL -o ParkingVideo.mp4 "${SAMPLE_VIDEO_URL}"
+    echo "Downloaded: ParkingVideo.mp4"
+else
+    echo "Already present: ParkingVideo.mp4"
+fi
 echo "--- Downloading ${LP_DETECTOR_NAME} (OpenVINO IR) ---"
 LP_DIR="${MODELS_DIR}/${LP_DETECTOR_NAME}"
 fi
 echo "Found detector model: ${LP_XML}"
+OCR_XML="$(find "${LP_DIR}" -path "*ch_PP-OCRv4_rec_infer*.xml" | head -n1)"
+if [[ -z "${OCR_XML}" ]]; then
+    echo "Error: PaddleOCR PP-OCRv4 .xml not found under ${LP_DIR}" >&2
+    exit 1
 fi
+echo "Found OCR model: ${OCR_XML}"
 echo "--- Done ---"
+echo "Detector : ${LP_XML}"
+echo "OCR      : ${OCR_XML}"
+echo "Sample   : $(pwd)/ParkingVideo.mp4"