---
license: other
license_link: LICENSE
library_name: openvino
pipeline_tag: object-detection
tags:
  - openvino
  - intel
  - yolo
  - yolo26
  - vehicle-detection
  - edge-ai
  - metro
  - dlstreamer
datasets:
  - detection-datasets/coco
language:
  - en
---

# Vehicle Detection

| Property | Value |
|---|---|
| **Category** | Object Detection (Vehicle Detection) |
| **Base Model** | [YOLO26](https://docs.ultralytics.com/models/yolo26/) (Ultralytics) |
| **Source Framework** | PyTorch (Ultralytics) |
| **Supported Precisions** | FP32, FP16, INT8 (mixed-precision) |
| **Inference Engine** | OpenVINO |
| **Hardware** | CPU, GPU, NPU |
| **Detected Class(es)** | `car` (2), `motorcycle` (3), `bus` (5), `truck` (7) |

---

## Overview

Vehicle Detection is a Metro Analytics use case that detects and localizes vehicles in images and video streams.
It is built on [YOLO26](https://docs.ultralytics.com/models/yolo26/), a real-time object detector trained on the COCO dataset.
The model is exported to OpenVINO IR (FP16 by default, optionally quantized to INT8), and its detections are filtered at runtime to the vehicle-related classes `car`, `motorcycle`, `bus`, and `truck`.

Typical Metro deployments include:

- **Traffic Monitoring** -- count vehicles on roads, intersections, and highway ramps.
- **Parking Lot Occupancy** -- detect parked vehicles to estimate occupancy in parking structures.
- **Toll Gate Analytics** -- classify vehicle types at toll collection points.
- **Fleet Tracking** -- monitor bus and truck movements at depots and terminals.
- **Incident Detection** -- flag stopped or wrong-way vehicles on roadways.

Available variants: `yolo26n`, `yolo26s`, `yolo26m`, `yolo26l`, `yolo26x`.
Smaller variants (`yolo26n`, `yolo26s`) are recommended for high-FPS edge deployment; larger variants improve recall for distant or partially occluded vehicles.

---

## Prerequisites

- Python 3.11+
- [Install OpenVINO](https://docs.openvino.ai/2026/get-started/install-openvino.html) (latest version)
- [Install Intel DLStreamer](https://docs.openedgeplatform.intel.com/2026.0/edge-ai-libraries/dlstreamer/get_started/install/install_guide_ubuntu.html) (latest version)

Create and activate a Python virtual environment before running the scripts:

```bash
python3 -m venv .venv --system-site-packages
source .venv/bin/activate
```

> **Note:** The `--system-site-packages` flag is required so the virtual
> environment can access the system-installed OpenVINO and DLStreamer Python
> packages.
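
As a quick sanity check, the sketch below reports whether the Python modules imported by the samples in this document (`openvino`, `cv2`, `numpy`, `gi`, `gstgva`) can be found from the active environment. The helper name `check_modules` is illustrative and not part of any package; it only tests importability, not versions.

```python
import importlib.util


def check_modules(names=("openvino", "cv2", "numpy", "gi", "gstgva")):
    """Return {module_name: True/False} for top-level importability."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    # Print one line per module so a missing package is easy to spot.
    for name, found in check_modules().items():
        print(f"{name:10s} {'found' if found else 'MISSING'}")
```

If `openvino` or `gstgva` is reported as missing inside the virtual environment, the environment was most likely created without `--system-site-packages`.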

---

## Getting Started

### Download and Quantize Model

Run the provided script to download the model, export it to OpenVINO IR, and optionally quantize it:

```bash
chmod +x export_and_quantize.sh
./export_and_quantize.sh
```

This exports the default **yolo26n** model in **FP16** precision.

#### Optional: Select a Different Variant or Precision

```bash
./export_and_quantize.sh yolo26n FP32   # full-precision
./export_and_quantize.sh yolo26n INT8   # quantized
./export_and_quantize.sh yolo26s        # larger variant, default FP16
```

Replace `yolo26n` with any variant (`yolo26s`, `yolo26m`, `yolo26l`, `yolo26x`).
The second argument selects the precision (`FP32`, `FP16`, `INT8`); the default is **FP16**.

The script performs the following steps:

1. Installs dependencies (`openvino`, `ultralytics`; adds `nncf` for INT8).
2. Downloads a sample test image (`test.jpg`) and a sample test video (`test_video.mp4`).
3. Downloads the PyTorch weights and exports to OpenVINO IR.
4. *(INT8 only)* Quantizes the model using NNCF post-training quantization.

Output files:

- `yolo26n_openvino_model/` -- FP32 or FP16 OpenVINO IR model directory.
- `yolo26n_vehicle_int8.xml` / `yolo26n_vehicle_int8.bin` -- INT8 quantized model *(only when `INT8` is selected)*.
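
When scripting around several variants, the naming pattern above can be captured in a small helper. This is a sketch: the `yolo26n` paths are taken from this document, while the assumption that other variants follow the same pattern (for example `yolo26s_openvino_model/yolo26s.xml`) should be checked against the script's actual output. The function name `expected_model_xml` is illustrative.

```python
from pathlib import PurePosixPath

VARIANTS = ("yolo26n", "yolo26s", "yolo26m", "yolo26l", "yolo26x")
PRECISIONS = ("FP32", "FP16", "INT8")


def expected_model_xml(variant="yolo26n", precision="FP16"):
    """Return the IR .xml path the export script is expected to produce."""
    if variant not in VARIANTS:
        raise ValueError(f"unknown variant: {variant}")
    if precision not in PRECISIONS:
        raise ValueError(f"unknown precision: {precision}")
    if precision == "INT8":
        # Assumed pattern, generalized from yolo26n_vehicle_int8.xml.
        return PurePosixPath(f"{variant}_vehicle_int8.xml")
    # Assumed pattern, generalized from yolo26n_openvino_model/yolo26n.xml.
    return PurePosixPath(f"{variant}_openvino_model") / f"{variant}.xml"


print(expected_model_xml())                   # yolo26n_openvino_model/yolo26n.xml
print(expected_model_xml("yolo26n", "INT8"))  # yolo26n_vehicle_int8.xml
```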

#### Precision / Device Compatibility

| Precision | CPU | GPU | NPU |
|---|---|---|---|
| FP32 | Yes | Yes | No |
| FP16 | Yes | Yes | Yes |
| INT8 | Yes | Yes | Yes |
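
For scripts that pick a device at runtime, the table can be encoded as a lookup. The sketch below simply mirrors the table above; the names `SUPPORTED` and `is_supported` are illustrative.

```python
# Mirrors the Precision / Device Compatibility table in this document.
SUPPORTED = {
    "FP32": {"CPU", "GPU"},
    "FP16": {"CPU", "GPU", "NPU"},
    "INT8": {"CPU", "GPU", "NPU"},
}


def is_supported(precision, device):
    """Return True if the table lists the precision as usable on the device."""
    return device.upper() in SUPPORTED.get(precision.upper(), set())


print(is_supported("FP32", "NPU"))  # False
print(is_supported("INT8", "npu"))  # True
```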

> **Note:** The INT8 calibration uses the bundled sample image.
> For production accuracy, replace it with a representative set of frames from
> the target deployment site.
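
One simple way to assemble such a calibration set is to sample frames at even intervals from a recording made at the deployment site. The sketch below only computes which frame indices to extract; the function name `calibration_frame_indices` and the default of 300 frames are assumptions for illustration, not values used by `export_and_quantize.sh`.

```python
def calibration_frame_indices(total_frames, num_samples=300):
    """Return up to num_samples frame indices spread evenly over a video."""
    if total_frames <= 0 or num_samples <= 0:
        return []
    if num_samples >= total_frames:
        return list(range(total_frames))
    # Take the frame at the middle of each of num_samples equal segments.
    # Integer arithmetic avoids floating-point rounding surprises.
    return [
        (2 * i + 1) * total_frames // (2 * num_samples)
        for i in range(num_samples)
    ]


print(calibration_frame_indices(10, 3))  # [1, 5, 8]
```

The selected frames could then be read with OpenCV, for example by seeking with `cv2.VideoCapture.set(cv2.CAP_PROP_POS_FRAMES, idx)`, and supplied to the quantization step in place of the bundled sample image.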

### OpenVINO Sample

The sample below runs YOLO26 inference, filters to vehicle classes (`car`,
`motorcycle`, `bus`, `truck`), and reports the vehicle count for a single image.
YOLO26 is end-to-end (NMS-free), so no manual non-maximum suppression is needed.
Change the device string passed to `core.compile_model` to run on CPU, GPU, or NPU.

```python
import cv2
import numpy as np
import openvino as ov

VEHICLE_CLASS_IDS = {2: "car", 3: "motorcycle", 5: "bus", 7: "truck"}
CONF_THRESHOLD = 0.4
INPUT_SIZE = 640

core = ov.Core()
model = core.read_model("yolo26n_openvino_model/yolo26n.xml")

# Change device to "GPU" or "NPU" to run on integrated GPU or NPU.
compiled = core.compile_model(model, "CPU")

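image = cv2.imread("test.jpg")
if image is None:
    # cv2.imread returns None instead of raising when the file cannot be read.
    raise FileNotFoundError("test.jpg not found; run export_and_quantize.sh first")
h0, w0 = image.shape[:2]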

blob = cv2.resize(image, (INPUT_SIZE, INPUT_SIZE))
blob = cv2.cvtColor(blob, cv2.COLOR_BGR2RGB).astype(np.float32) / 255.0
blob = blob.transpose(2, 0, 1)[np.newaxis, ...]  # NCHW

# YOLO26 end-to-end output: [1, 300, 6] = [x1, y1, x2, y2, confidence, class_id]
output = compiled([blob])[compiled.output(0)][0]
mask = (output[:, 4] >= CONF_THRESHOLD) & np.isin(output[:, 5].astype(int), list(VEHICLE_CLASS_IDS.keys()))
dets = output[mask]

sx, sy = w0 / INPUT_SIZE, h0 / INPUT_SIZE
vehicle_count = len(dets)
print(f"Detected vehicles: {vehicle_count}")

colors = {"car": (0, 255, 0), "motorcycle": (255, 0, 0),
          "bus": (0, 165, 255), "truck": (0, 0, 255)}
for det in dets:
    x1 = int(det[0] * sx)
    y1 = int(det[1] * sy)
    x2 = int(det[2] * sx)
    y2 = int(det[3] * sy)
    cid = int(det[5])
    conf = float(det[4])
    name = VEHICLE_CLASS_IDS[cid]
    label = f"{name} {conf:.2f}"
    color = colors[name]
    cv2.rectangle(image, (x1, y1), (x2, y2), color, 2)
    cv2.putText(image, label, (x1, max(y1 - 5, 0)),
                cv2.FONT_HERSHEY_SIMPLEX, 0.6, color, 2)
    print(f"  {label} at ({x1},{y1})-({x2},{y2})")

cv2.imwrite("output_openvino.jpg", image)
```

**Device targets:**

- `"CPU"` -- default, works on all Intel platforms.
- `"GPU"` -- Intel integrated or discrete GPU.
- `"NPU"` -- Intel NPU (validate with `benchmark_app -d NPU`).
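
For use cases such as toll gate analytics, a per-class tally is often more useful than a single total. The sketch below counts detections by class from rows in the `[x1, y1, x2, y2, confidence, class_id]` layout used by the sample above; the helper name `count_by_class` and the example rows are made up for illustration.

```python
from collections import Counter

VEHICLE_CLASS_IDS = {2: "car", 3: "motorcycle", 5: "bus", 7: "truck"}


def count_by_class(rows, conf_threshold=0.4):
    """Count vehicle detections per class name, skipping other classes."""
    counts = Counter()
    for row in rows:
        conf, cid = float(row[4]), int(row[5])
        if conf >= conf_threshold and cid in VEHICLE_CLASS_IDS:
            counts[VEHICLE_CLASS_IDS[cid]] += 1
    return dict(counts)


# Hypothetical detections: two cars, one low-confidence bus, one person (class 0).
rows = [
    [10, 20, 110, 90, 0.91, 2],
    [200, 40, 320, 130, 0.66, 2],
    [50, 60, 400, 300, 0.25, 5],
    [5, 5, 40, 120, 0.88, 0],
]
print(count_by_class(rows))  # {'car': 2}
```

Because NumPy arrays iterate row by row, the raw `output` array from the sample above should also be usable as `rows` without conversion.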

### Try It on a Sample Image

The `export_and_quantize.sh` script downloads `test.jpg` automatically, so the OpenVINO sample above can be run as-is.
The sample reads `test.jpg`, prints each detected vehicle to the console, and writes the annotated frame to `output_openvino.jpg`.

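Expected console output (confidence values and coordinates may differ slightly depending on variant, precision, and device):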

```text
Detected vehicles: 1
  bus 0.92 at (0,229)-(804,744)
```

#### Expected Output

![OpenVINO expected output](expected_output_openvino.jpg)

### DLStreamer Sample

The pipeline below runs the FP16 YOLO26 detector on the sample video via
`gvadetect`, filters detections to vehicle classes using the DLStreamer
Python bindings (`gstgva.VideoFrame`), draws only vehicle bounding boxes
with OpenCV, saves the annotated result to `output_dlstreamer.mp4`, and
prints the vehicle count per frame.

> **Notes on running this sample:**
>
> - Use the FP16 IR (`yolo26n_openvino_model/yolo26n.xml`). Class names are
>   read automatically from the model's embedded `metadata.yaml` by
>   DLStreamer 2026.0+ -- no external `labels-file` is required.
> - Export `PYTHONPATH` so the DLStreamer Python module is importable:
>
>   ```bash
>   source /opt/intel/openvino_2026/setupvars.sh
>   source /opt/intel/dlstreamer/scripts/setup_dls_env.sh
>   export PYTHONPATH=/opt/intel/dlstreamer/python:\
>   /opt/intel/dlstreamer/gstreamer/lib/python3/dist-packages:${PYTHONPATH:-}
>   ```

```python
import subprocess

import cv2
import numpy as np
import gi

gi.require_version("Gst", "1.0")
from gi.repository import Gst
from gstgva import VideoFrame

Gst.init(None)

INPUT_VIDEO = "test_video.mp4"
VEHICLE_LABELS = {"car", "motorcycle", "bus", "truck"}
COLORS = {
    "car": (0, 255, 0), "motorcycle": (255, 128, 0),
    "bus": (0, 128, 255), "truck": (128, 0, 255),
}

# For CPU: change device=GPU to device=CPU.
# For NPU: change device=GPU to device=NPU (batch-size=1, nireq=4 recommended).
pipeline_str = (
    f"filesrc location={INPUT_VIDEO} ! decodebin3 ! "
    "videoconvert ! "
    "gvadetect model=yolo26n_openvino_model/yolo26n.xml "
    "device=GPU "
    "threshold=0.4 ! queue ! "
    "videoconvert ! video/x-raw,format=BGR ! "
    "appsink name=sink emit-signals=false sync=false"
)
pipeline = Gst.parse_launch(pipeline_str)
appsink = pipeline.get_by_name("sink")

pipeline.set_state(Gst.State.PLAYING)

proc = None

while True:
    sample = appsink.emit("pull-sample")
    if sample is None:
        break

    buf = sample.get_buffer()
    caps = sample.get_caps()
    struct = caps.get_structure(0)
    width = struct.get_value("width")
    height = struct.get_value("height")

    # Start ffmpeg encoder on the first frame.
    if proc is None:
        ok, fps_num, fps_den = struct.get_fraction("framerate")
        fps = fps_num / fps_den if ok and fps_den > 0 else 30.0
        proc = subprocess.Popen(
            ["ffmpeg", "-y", "-f", "rawvideo", "-pix_fmt", "bgr24",
             "-s", f"{width}x{height}", "-r", str(fps),
             "-i", "pipe:0", "-c:v", "libx264", "-pix_fmt", "yuv420p",
             "-movflags", "+faststart", "output_dlstreamer.mp4"],
            stdin=subprocess.PIPE, stderr=subprocess.DEVNULL,
        )

    # Read detection metadata and filter to vehicle classes.
    frame = VideoFrame(buf, caps=caps)
    vehicles = [(r.label(), r.rect()) for r in frame.regions()
                if r.label() in VEHICLE_LABELS]

    # Map buffer read-only and copy pixels to a writable numpy array.
    success, map_info = buf.map(Gst.MapFlags.READ)
    if not success:
        continue
    arr = np.ndarray((height, width, 3), dtype=np.uint8,
                     buffer=map_info.data).copy()
    buf.unmap(map_info)

    # Draw vehicle bounding boxes only.
    for label, rect in vehicles:
        x1, y1 = int(rect.x), int(rect.y)
        x2, y2 = int(rect.x + rect.w), int(rect.y + rect.h)
        color = COLORS.get(label, (0, 255, 0))
        cv2.rectangle(arr, (x1, y1), (x2, y2), color, 2)
        cv2.putText(arr, label, (x1, max(y1 - 6, 0)),
                    cv2.FONT_HERSHEY_SIMPLEX, 0.6, color, 2)

    if vehicles:
        print(f"Vehicle count: {len(vehicles)}", flush=True)
        for label, rect in vehicles:
            print(f"  {label} at ({int(rect.x)},{int(rect.y)})", flush=True)

    proc.stdin.write(arr.tobytes())

pipeline.set_state(Gst.State.NULL)
if proc:
    proc.stdin.close()
    proc.wait()
print("Wrote output_dlstreamer.mp4", flush=True)
```

#### Expected Output

![DLStreamer expected output](expected_output_dlstreamer.gif)

**Device targets:**

- `device=GPU` -- default in the sample code.
- `device=CPU` -- change `device=GPU` to `device=CPU`.
- `device=NPU` -- change `device=GPU` to `device=NPU`; use `batch-size=1` and `nireq=4` for best NPU utilization.
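
The per-device changes listed above can be folded into a small pipeline-string builder. This is a sketch that uses only the elements and properties already named in this section (`gvadetect` with `model`, `device`, `threshold`, `batch-size`, `nireq`); the function name `build_pipeline` is illustrative, and the NPU values are simply the ones recommended above.

```python
def build_pipeline(video, model_xml, device="GPU", threshold=0.4):
    """Build the pipeline description used by the DLStreamer sample."""
    device = device.upper()
    if device not in {"CPU", "GPU", "NPU"}:
        raise ValueError(f"unsupported device: {device}")
    detect = f"gvadetect model={model_xml} device={device} threshold={threshold}"
    if device == "NPU":
        # Values recommended in this document for NPU.
        detect += " batch-size=1 nireq=4"
    return (
        f"filesrc location={video} ! decodebin3 ! videoconvert ! "
        f"{detect} ! queue ! "
        "videoconvert ! video/x-raw,format=BGR ! "
        "appsink name=sink emit-signals=false sync=false"
    )


print(build_pipeline("test_video.mp4", "yolo26n_openvino_model/yolo26n.xml", "NPU"))
```

The returned string can be passed to `Gst.parse_launch` in place of `pipeline_str` in the sample above.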

---

## License

Copyright (C) Intel Corporation. All rights reserved.
Licensed under the MIT License. See [LICENSE](LICENSE) for details.

## References

- [YOLO26 Documentation](https://docs.ultralytics.com/models/yolo26/)
- [OpenVINO YOLO26 Notebook](https://github.com/openvinotoolkit/openvino_notebooks/blob/latest/notebooks/yolov26-optimization/yolov26-object-detection.ipynb)
- [COCO Dataset](https://cocodataset.org/)
- [OpenVINO Documentation](https://docs.openvino.ai/)
- [NNCF Post-Training Quantization](https://docs.openvino.ai/latest/nncf_ptq_introduction.html)
- [Intel DLStreamer](https://docs.openedgeplatform.intel.com/2026.0/edge-ai-libraries/dlstreamer/index.html)