Motion Tracking

Property              Value
Category              Object Detection + Multi-Object Tracking
Base Model            YOLO26 (Ultralytics) + BoT-SORT tracker
Source Framework      PyTorch (Ultralytics)
Supported Precisions  FP32, FP16, INT8 (mixed-precision)
Inference Engine      OpenVINO
Hardware              CPU, GPU, NPU
Detected Class(es)    Configurable (default: all 80 COCO classes)

Overview

Motion Tracking is a Metro Analytics use case that detects objects and assigns persistent track IDs across frames, enabling trajectory analysis and temporal event detection. It is built on YOLO26, a state-of-the-art real-time object detector exported to OpenVINO IR (FP16 by default, with optional INT8 quantization), paired with a multi-object tracker:

  • DLStreamer pipeline: YOLO26 FP16 detection via gvadetect + gvatrack element with tracking-type=short-term-imageless.

Each detected object receives a unique track_id that persists across frames as long as the object remains visible. Outputs include per-object trajectories suitable for path analysis, dwell-time computation, and zone-based event triggers.

Typical Metro deployments include:

  • Pedestrian Trajectory Analysis -- map walking paths through stations for flow optimization.
  • Dwell-Time Measurement -- measure how long individuals stay in specific zones (see the sketch after this list).
  • Zone-Based Event Detection -- trigger alerts when tracked objects enter or exit defined areas.
  • Traffic Flow Analytics -- track vehicles through intersections for signal timing optimization.
  • Incident Replay -- reconstruct object paths for post-event forensic review.
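
As a concrete example of dwell-time measurement, the sketch below counts how many frames each track center spends inside a polygon zone and converts that count to seconds. The zone coordinates, frame rate, and helper names are illustrative assumptions, not part of the shipped scripts; the (track_id, cx, cy) inputs match what the DLStreamer sample later in this card prints per object.

from collections import defaultdict

import cv2
import numpy as np

ZONE = np.array([(100, 100), (500, 100), (500, 400), (100, 400)], dtype=np.int32)  # assumed zone polygon
FPS = 30.0  # assumed source frame rate

frames_in_zone: dict[int, int] = defaultdict(int)

def update_dwell(track_id: int, cx: int, cy: int) -> None:
    # pointPolygonTest >= 0 means the point is inside or on the zone boundary.
    if cv2.pointPolygonTest(ZONE, (float(cx), float(cy)), False) >= 0:
        frames_in_zone[track_id] += 1

def dwell_seconds(track_id: int) -> float:
    return frames_in_zone[track_id] / FPS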

Available YOLO26 variants: yolo26n, yolo26s, yolo26m, yolo26l, yolo26x. Smaller variants (yolo26n, yolo26s) are recommended for high-FPS edge deployment. The default tracker is BoT-SORT; ByteTrack is available as an alternative with lower computational overhead.
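
For reference, tracker selection with the Ultralytics API looks roughly like the sketch below, assuming YOLO26 weights are published under the names used in this card:

from ultralytics import YOLO

model = YOLO("yolo26n.pt")

# BoT-SORT ("botsort.yaml") is the Ultralytics default; ByteTrack trades some
# occlusion robustness for lower overhead.
for result in model.track("test_video.mp4", tracker="bytetrack.yaml", stream=True):
    print(result.boxes.id)  # persistent track IDs (None until tracks initialize)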


Prerequisites

Create and activate a Python virtual environment before running the scripts:

python3 -m venv .venv --system-site-packages
source .venv/bin/activate

Note: The --system-site-packages flag is required so the virtual environment can access the system-installed OpenVINO and DLStreamer Python packages.


Getting Started

Download and Quantize Model

Run the provided script to download, export to OpenVINO IR, and optionally quantize:

chmod +x export_and_quantize.sh
./export_and_quantize.sh

This exports the default yolo26n model in FP16 precision.

Optional: Select a Different Variant or Precision

./export_and_quantize.sh yolo26n FP32   # full-precision
./export_and_quantize.sh yolo26n INT8   # quantized
./export_and_quantize.sh yolo26s        # larger variant, default FP16

Replace yolo26n with any variant (yolo26s, yolo26m, yolo26l, yolo26x). The second argument selects the precision (FP32, FP16, INT8); the default is FP16.

The script performs the following steps (a Python sketch of steps 3-4 follows the list):

  1. Installs dependencies (openvino, ultralytics; adds nncf for INT8).
  2. Downloads the sample test video (test_video.mp4) and a sample test image (test.jpg).
  3. Downloads the PyTorch weights and exports to OpenVINO IR.
  4. (INT8 only) Quantizes the model using NNCF post-training quantization.
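
A rough Python equivalent of steps 3-4 is sketched below. The preprocessing and calibration helpers are assumptions (the actual script may letterbox and sample frames differently); the Ultralytics export and NNCF quantization calls are the standard APIs.

import cv2
import numpy as np
import nncf
import openvino as ov
from ultralytics import YOLO

# Step 3: download weights and export to OpenVINO IR (FP16 shown here).
YOLO("yolo26n.pt").export(format="openvino", half=True)

# Step 4 (INT8 only): post-training quantization with NNCF.
def preprocess(frame):
    # Plain resize to an assumed 640x640 input; the real script may letterbox.
    img = cv2.resize(frame, (640, 640))
    img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB).astype(np.float32) / 255.0
    return np.expand_dims(img.transpose(2, 0, 1), 0)

def calibration_frames(path="test_video.mp4", limit=300):
    cap = cv2.VideoCapture(path)
    frames = []
    while len(frames) < limit:
        ok, frame = cap.read()
        if not ok:
            break
        frames.append(preprocess(frame))
    cap.release()
    return frames

fp_model = ov.Core().read_model("yolo26n_openvino_model/yolo26n.xml")
int8_model = nncf.quantize(fp_model, nncf.Dataset(calibration_frames()))
ov.save_model(int8_model, "yolo26n_tracking_int8.xml")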

Output files (a quick sanity check follows the list):

  • yolo26n_openvino_model/ -- FP32 or FP16 OpenVINO IR model directory.
  • yolo26n_tracking_int8.xml / yolo26n_tracking_int8.bin -- INT8 quantized model (only when INT8 is selected).
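
A quick way to confirm the export succeeded is to load the IR and list the devices OpenVINO can see:

import openvino as ov

core = ov.Core()
model = core.read_model("yolo26n_openvino_model/yolo26n.xml")
print(model.input(0).partial_shape, model.output(0).partial_shape)
print(core.available_devices)  # e.g. ['CPU', 'GPU', 'NPU']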

Precision / Device Compatibility

Precision  CPU  GPU  NPU
FP32       Yes  Yes  No
FP16       Yes  Yes  Yes
INT8       Yes  Yes  Yes

Note: The INT8 calibration uses frames from the bundled sample video. For production accuracy, replace it with a representative set of frames from the target deployment site.
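
Swapping in site-specific calibration data only changes the dataset handed to nncf.quantize. A minimal sketch, assuming a site_frames/ directory of captured JPEGs and the preprocess() helper from the export sketch above:

from pathlib import Path

import cv2
import nncf

site_frames = [preprocess(cv2.imread(str(p)))
               for p in sorted(Path("site_frames").glob("*.jpg"))]
calibration = nncf.Dataset(site_frames)  # pass to nncf.quantize(...)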

DLStreamer Sample

The pipeline below runs the YOLO26 FP16 detector via gvadetect on test_video.mp4, attaches persistent track IDs with gvatrack (short-term-imageless tracker), overlays bounding boxes with gvawatermark, and converts the frames to BGR so the appsink delivers raw bgr24 buffers. Frames are pulled from the appsink, per-track trajectory polylines are drawn with OpenCV, and the result is muxed to output_dlstreamer.mp4 (H.264 via ffmpeg).

Notes on running this sample:

  • Use the FP16 IR (yolo26n_openvino_model/yolo26n.xml). Class names are read automatically from the model's embedded metadata.yaml by DLStreamer 2026.0+ -- no external labels-file is required.

  • Export PYTHONPATH so the DLStreamer Python module is importable:

    source /opt/intel/openvino_2026/setupvars.sh
    source /opt/intel/dlstreamer/scripts/setup_dls_env.sh
    export PYTHONPATH=/opt/intel/dlstreamer/python:\
    /opt/intel/dlstreamer/gstreamer/lib/python3/dist-packages:${PYTHONPATH:-}
    
import subprocess
from collections import defaultdict

import cv2
import numpy as np
import gi

gi.require_version("Gst", "1.0")
from gi.repository import Gst
from gstgva import VideoFrame

Gst.init(None)

# For CPU: change device=GPU to device=CPU.
# For NPU: change device=GPU to device=NPU (batch-size=1, nireq=4 recommended).
pipeline_str = (
    "filesrc location=test_video.mp4 ! decodebin3 ! "
    "videoconvert ! "
    "gvadetect model=yolo26n_openvino_model/yolo26n.xml "
    "device=GPU "
    "threshold=0.4 ! queue ! "
    "gvatrack tracking-type=short-term-imageless ! queue ! "
    "gvawatermark ! appsink name=sink emit-signals=false sync=false"
)
pipeline = Gst.parse_launch(pipeline_str)
appsink = pipeline.get_by_name("sink")

# Distinct colors for trajectory lines (one per track ID).
COLORS = [
    (255, 0, 0), (0, 255, 0), (0, 0, 255), (255, 255, 0),
    (255, 0, 255), (0, 255, 255), (128, 0, 255), (255, 128, 0),
]
track_history: dict[int, list[tuple[int, int]]] = defaultdict(list)

pipeline.set_state(Gst.State.PLAYING)

proc = None

while True:
    sample = appsink.emit("pull-sample")
    if sample is None:
        break

    buf = sample.get_buffer()
    caps = sample.get_caps()
    struct = caps.get_structure(0)
    width = struct.get_value("width")
    height = struct.get_value("height")

    # Start ffmpeg encoder on the first frame.
    if proc is None:
        ok, fps_num, fps_den = struct.get_fraction("framerate")
        fps = fps_num / fps_den if ok and fps_den > 0 else 30.0
        proc = subprocess.Popen(
            ["ffmpeg", "-y", "-f", "rawvideo", "-pix_fmt", "bgr24",
             "-s", f"{width}x{height}", "-r", str(fps),
             "-i", "pipe:0", "-c:v", "libx264", "-pix_fmt", "yuv420p",
             "-movflags", "+faststart", "output_dlstreamer.mp4"],
            stdin=subprocess.PIPE, stderr=subprocess.DEVNULL,
        )

    # Read detection / tracking metadata.
    frame = VideoFrame(buf, caps=caps)
    regions_data = []
    for region in frame.regions():
        tid = region.object_id()
        label = region.label()
        rect = region.rect()
        cx = int(rect.x + rect.w / 2)
        cy = int(rect.y + rect.h / 2)
        regions_data.append((tid, label, cx, cy))

    # Map buffer read-only and copy pixels to a writable numpy array.
    success, map_info = buf.map(Gst.MapFlags.READ)
    if not success:
        continue
    arr = np.ndarray((height, width, 3), dtype=np.uint8,
                     buffer=map_info.data).copy()
    buf.unmap(map_info)

    # Draw per-track trajectory polylines on the frame copy.
    for tid, label, cx, cy in regions_data:
        track = track_history[tid]
        track.append((cx, cy))
        if len(track) > 30:
            track.pop(0)
        color = COLORS[tid % len(COLORS)]
        pts = np.array(track, dtype=np.int32).reshape((-1, 1, 2))
        cv2.polylines(arr, [pts], False, color, 2)
        print(f"  Track {tid}: {label} center=({cx},{cy})", flush=True)

    proc.stdin.write(arr.tobytes())

pipeline.set_state(Gst.State.NULL)
if proc:
    proc.stdin.close()
    proc.wait()
print("Wrote output_dlstreamer.mp4", flush=True)

Expected Output

(Image: DLStreamer expected output -- annotated frames showing bounding boxes, track IDs, and per-track trajectory polylines.)

Device targets:

  • device=GPU -- default in the sample code.
  • device=CPU -- change device=GPU to device=CPU.
  • device=NPU -- change device=GPU to device=NPU; use batch-size=1 and nireq=4 for best NPU utilization (see the fragment below).
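
Applied to the sample pipeline, the NPU recommendation would look roughly like this gvadetect fragment (a sketch; batch-size and nireq are standard gvadetect properties):

gvadetect model=yolo26n_openvino_model/yolo26n.xml device=NPU batch-size=1 nireq=4 threshold=0.4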

License

Copyright (C) Intel Corporation. All rights reserved. Licensed under the MIT License. See LICENSE for details.
