# Motion Tracking
| Property | Value |
|---|---|
| Category | Object Detection + Multi-Object Tracking |
| Base Model | YOLO26 (Ultralytics) + BoT-SORT tracker |
| Source Framework | PyTorch (Ultralytics) |
| Supported Precisions | FP32, FP16, INT8 (mixed-precision) |
| Inference Engine | OpenVINO |
| Hardware | CPU, GPU, NPU |
| Detected Class(es) | Configurable (default: all 80 COCO classes) |
## Overview
Motion Tracking is a Metro Analytics use case that detects objects and assigns persistent track IDs across frames, enabling trajectory analysis and temporal event detection. It is built on YOLO26, a state-of-the-art real-time object detector exported to OpenVINO IR (FP16 by default, with optional INT8 quantization), paired with a multi-object tracker:
- DLStreamer pipeline: YOLO26 FP16 detection via `gvadetect` + `gvatrack` element with `tracking-type=short-term-imageless`.
Each detected object receives a unique `track_id` that persists across frames as long as the object remains visible.
Outputs include per-object trajectories suitable for path analysis, dwell-time computation, and zone-based event triggers; a minimal post-processing sketch follows the list below.
Typical Metro deployments include:
- Pedestrian Trajectory Analysis -- map walking paths through stations for flow optimization.
- Dwell-Time Measurement -- measure how long individuals stay in specific zones.
- Zone-Based Event Detection -- trigger alerts when tracked objects enter or exit defined areas.
- Traffic Flow Analytics -- track vehicles through intersections for signal timing optimization.
- Incident Replay -- reconstruct object paths for post-event forensic review.
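As a minimal illustration of the dwell-time and zone-event use cases, the sketch below consumes the per-frame `(track_id, cx, cy)` centers that the DLStreamer sample prints. The zone rectangle, frame rate, and helper names are assumptions made up for the example, not part of the sample:

```python
# Hypothetical post-processing sketch: dwell time and zone-entry events
# from per-frame (track_id, center) observations.
from collections import defaultdict

ZONE = (100, 100, 400, 300)   # x1, y1, x2, y2 in pixels -- example zone
FPS = 30.0                    # assumed source frame rate

def in_zone(cx: int, cy: int) -> bool:
    x1, y1, x2, y2 = ZONE
    return x1 <= cx <= x2 and y1 <= cy <= y2

frames_in_zone: dict[int, int] = defaultdict(int)

def update(track_id: int, cx: int, cy: int) -> None:
    """Feed one (track_id, center) observation per frame."""
    if in_zone(cx, cy):
        if frames_in_zone[track_id] == 0:
            print(f"Track {track_id} entered the zone")  # zone-entry event
        frames_in_zone[track_id] += 1

def dwell_seconds(track_id: int) -> float:
    """Total time this track's center has spent inside the zone."""
    return frames_in_zone[track_id] / FPS
```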
Available YOLO26 variants: `yolo26n`, `yolo26s`, `yolo26m`, `yolo26l`, `yolo26x`.
Smaller variants (`yolo26n`, `yolo26s`) are recommended for high-FPS edge deployment.
The default tracker is BoT-SORT; ByteTrack is available as an alternative with lower computational overhead.
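The tracker choice is made through the Ultralytics tracking API. A minimal sketch, assuming the `ultralytics` package and its bundled `botsort.yaml` / `bytetrack.yaml` tracker configs (the weights file name follows the variant naming above):

```python
# Minimal sketch: choosing between BoT-SORT (default) and ByteTrack
# with the Ultralytics tracking API.
from ultralytics import YOLO

model = YOLO("yolo26n.pt")

# BoT-SORT (the default tracker config).
results = model.track("test_video.mp4", tracker="botsort.yaml")

# ByteTrack: lower computational overhead, no appearance re-identification.
results = model.track("test_video.mp4", tracker="bytetrack.yaml")

for r in results:
    if r.boxes.id is not None:
        print(r.boxes.id.tolist())  # persistent track IDs for this frame
```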
## Prerequisites
- Python 3.11+
- `ffmpeg` (`sudo apt install ffmpeg`) -- used to encode the output video
- Install OpenVINO (latest version)
- Install Intel DLStreamer (latest version)
Create and activate a Python virtual environment before running the scripts:
```bash
python3 -m venv .venv --system-site-packages
source .venv/bin/activate
```
Note: The `--system-site-packages` flag is required so the virtual environment can access the system-installed OpenVINO and DLStreamer Python packages.
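A quick way to confirm the venv can actually see those system packages (versions will vary):

```bash
# Both imports must succeed inside the activated venv; if they fail,
# recreate the venv with --system-site-packages.
python3 -c "import openvino; print(openvino.__version__)"
python3 -c "import gi; gi.require_version('Gst', '1.0'); from gi.repository import Gst; print('GStreamer OK')"
```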
## Getting Started
### Download and Quantize Model
Run the provided script to download, export to OpenVINO IR, and optionally quantize:
```bash
chmod +x export_and_quantize.sh
./export_and_quantize.sh
```

This exports the default `yolo26n` model in FP16 precision.
### Optional: Select a Different Variant or Precision
```bash
./export_and_quantize.sh yolo26n FP32   # full-precision
./export_and_quantize.sh yolo26n INT8   # quantized
./export_and_quantize.sh yolo26s        # larger variant, default FP16
```
Replace `yolo26n` with any variant (`yolo26s`, `yolo26m`, `yolo26l`, `yolo26x`).
The second argument selects the precision (`FP32`, `FP16`, `INT8`); the default is `FP16`.
The script performs the following steps:
- Installs dependencies (`openvino`, `ultralytics`; adds `nncf` for INT8).
- Downloads the sample test video (`test_video.mp4`) and a sample test image (`test.jpg`).
- Downloads the PyTorch weights and exports to OpenVINO IR.
- (INT8 only) Quantizes the model using NNCF post-training quantization.
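For reference, the INT8 step follows the standard NNCF post-training quantization flow. A minimal sketch of the idea, assuming calibration frames are read from the sample video with OpenCV (the 640x640 input size and 300-frame cap are assumptions; the script's actual parameters may differ):

```python
# Sketch of NNCF post-training quantization for the exported IR.
import cv2
import numpy as np
import nncf
import openvino as ov

def transform_fn(frame: np.ndarray) -> np.ndarray:
    # BGR frame -> NCHW float tensor in [0, 1], matching YOLO preprocessing.
    img = cv2.resize(frame, (640, 640))
    return img.transpose(2, 0, 1)[np.newaxis].astype(np.float32) / 255.0

def read_frames(path: str, limit: int = 300):
    cap = cv2.VideoCapture(path)
    while limit > 0:
        ok, frame = cap.read()
        if not ok:
            break
        yield frame
        limit -= 1
    cap.release()

core = ov.Core()
model = core.read_model("yolo26n_openvino_model/yolo26n.xml")
calibration = nncf.Dataset(list(read_frames("test_video.mp4")), transform_fn)
quantized = nncf.quantize(model, calibration,
                          preset=nncf.QuantizationPreset.MIXED)
ov.save_model(quantized, "yolo26n_tracking_int8.xml")
```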
Output files:
- `yolo26n_openvino_model/` -- FP32 or FP16 OpenVINO IR model directory.
- `yolo26n_tracking_int8.xml` / `yolo26n_tracking_int8.bin` -- INT8 quantized model (only when `INT8` is selected).
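To sanity-check the exported IR before wiring it into a pipeline, a short sketch (the `[1,3,640,640]` input shape is typical for YOLO exports but not guaranteed):

```python
# Load and compile the exported IR to verify it is well-formed.
import openvino as ov

core = ov.Core()
model = core.read_model("yolo26n_openvino_model/yolo26n.xml")
compiled = core.compile_model(model, "CPU")
print("input:", compiled.input(0).shape)    # typically [1,3,640,640]
print("output:", compiled.output(0).shape)
```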
### Precision / Device Compatibility
| Precision | CPU | GPU | NPU |
|---|---|---|---|
| FP32 | Yes | Yes | No |
| FP16 | Yes | Yes | Yes |
| INT8 | Yes | Yes | Yes |
Note: The INT8 calibration uses frames from the bundled sample video. For production accuracy, replace it with a representative set of frames from the target deployment site.
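To see which of these device targets exist on a given machine, OpenVINO's device query can be used:

```python
# List the inference devices OpenVINO can see on this machine.
import openvino as ov

for device in ov.Core().available_devices:
    print(device)   # e.g. CPU, GPU, NPU when the drivers are installed
```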
## DLStreamer Sample
The pipeline below runs the YOLO26 FP16 detector via `gvadetect` on
`test_video.mp4`, attaches persistent track IDs with `gvatrack`
(`short-term-imageless` tracker), and overlays bounding boxes with
`gvawatermark`. Frames are pulled from an `appsink`, per-track trajectory
polylines are drawn with OpenCV, and the result is muxed to `output_dlstreamer.mp4`
(H.264 via `ffmpeg`).
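Before running the Python script, the detection and tracking stages can be smoke-tested from the shell. A sketch using `gst-launch-1.0`, with `gvafpscounter` and `fakesink` standing in for the trajectory drawing and encoding:

```bash
gst-launch-1.0 filesrc location=test_video.mp4 ! decodebin3 ! videoconvert ! \
  gvadetect model=yolo26n_openvino_model/yolo26n.xml device=GPU threshold=0.4 ! queue ! \
  gvatrack tracking-type=short-term-imageless ! queue ! \
  gvawatermark ! videoconvert ! gvafpscounter ! fakesink sync=false
```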
Notes on running this sample:
- Use the FP16 IR (`yolo26n_openvino_model/yolo26n.xml`). Class names are read automatically from the model's embedded `metadata.yaml` by DLStreamer 2026.0+ -- no external `labels-file` is required.
- Export `PYTHONPATH` so the DLStreamer Python module is importable:

```bash
source /opt/intel/openvino_2026/setupvars.sh
source /opt/intel/dlstreamer/scripts/setup_dls_env.sh
export PYTHONPATH=/opt/intel/dlstreamer/python:\
/opt/intel/dlstreamer/gstreamer/lib/python3/dist-packages:${PYTHONPATH:-}
```
```python
import subprocess
from collections import defaultdict

import cv2
import numpy as np
import gi

gi.require_version("Gst", "1.0")
from gi.repository import Gst
from gstgva import VideoFrame

Gst.init(None)

# For CPU: change device=GPU to device=CPU.
# For NPU: change device=GPU to device=NPU (batch-size=1, nireq=4 recommended).
# The trailing caps filter forces BGR so the numpy view below matches
# ffmpeg's bgr24 raw input.
pipeline_str = (
    "filesrc location=test_video.mp4 ! decodebin3 ! "
    "videoconvert ! "
    "gvadetect model=yolo26n_openvino_model/yolo26n.xml "
    "device=GPU "
    "threshold=0.4 ! queue ! "
    "gvatrack tracking-type=short-term-imageless ! queue ! "
    "gvawatermark ! videoconvert ! video/x-raw,format=BGR ! "
    "appsink name=sink emit-signals=false sync=false"
)
pipeline = Gst.parse_launch(pipeline_str)
appsink = pipeline.get_by_name("sink")

# Distinct colors for trajectory lines (one per track ID).
COLORS = [
    (255, 0, 0), (0, 255, 0), (0, 0, 255), (255, 255, 0),
    (255, 0, 255), (0, 255, 255), (128, 0, 255), (255, 128, 0),
]
track_history: dict[int, list[tuple[int, int]]] = defaultdict(list)

pipeline.set_state(Gst.State.PLAYING)
proc = None

while True:
    sample = appsink.emit("pull-sample")
    if sample is None:
        break
    buf = sample.get_buffer()
    caps = sample.get_caps()
    struct = caps.get_structure(0)
    width = struct.get_value("width")
    height = struct.get_value("height")

    # Start ffmpeg encoder on the first frame.
    if proc is None:
        ok, fps_num, fps_den = struct.get_fraction("framerate")
        fps = fps_num / fps_den if ok and fps_den > 0 else 30.0
        proc = subprocess.Popen(
            ["ffmpeg", "-y", "-f", "rawvideo", "-pix_fmt", "bgr24",
             "-s", f"{width}x{height}", "-r", str(fps),
             "-i", "pipe:0", "-c:v", "libx264", "-pix_fmt", "yuv420p",
             "-movflags", "+faststart", "output_dlstreamer.mp4"],
            stdin=subprocess.PIPE, stderr=subprocess.DEVNULL,
        )

    # Read detection / tracking metadata.
    frame = VideoFrame(buf, caps=caps)
    regions_data = []
    for region in frame.regions():
        tid = region.object_id()
        label = region.label()
        rect = region.rect()
        cx = int(rect.x + rect.w / 2)
        cy = int(rect.y + rect.h / 2)
        regions_data.append((tid, label, cx, cy))

    # Map buffer read-only and copy pixels to a writable numpy array.
    success, map_info = buf.map(Gst.MapFlags.READ)
    if not success:
        continue
    arr = np.ndarray((height, width, 3), dtype=np.uint8,
                     buffer=map_info.data).copy()
    buf.unmap(map_info)

    # Draw per-track trajectory polylines on the frame copy.
    for tid, label, cx, cy in regions_data:
        track = track_history[tid]
        track.append((cx, cy))
        if len(track) > 30:
            track.pop(0)
        color = COLORS[tid % len(COLORS)]
        pts = np.array(track, dtype=np.int32).reshape((-1, 1, 2))
        cv2.polylines(arr, [pts], False, color, 2)
        print(f"  Track {tid}: {label} center=({cx},{cy})", flush=True)

    proc.stdin.write(arr.tobytes())

pipeline.set_state(Gst.State.NULL)
if proc:
    proc.stdin.close()
    proc.wait()
print("Wrote output_dlstreamer.mp4", flush=True)
```
### Expected Output

The script prints one line per tracked object per frame and, when the video ends, writes `output_dlstreamer.mp4` with bounding boxes (from `gvawatermark`) and trajectory trails (the last 30 centers per track) overlaid.
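Console output follows this shape (track IDs, labels, and coordinates are illustrative):

```text
  Track 1: person center=(412,288)
  Track 2: car center=(1033,540)
  Track 1: person center=(415,290)
Wrote output_dlstreamer.mp4
```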
Device targets:
- `device=GPU` -- default in the sample code.
- `device=CPU` -- change `device=GPU` to `device=CPU`.
- `device=NPU` -- change `device=GPU` to `device=NPU`; use `batch-size=1` and `nireq=4` for best NPU utilization.
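For the NPU case, the modified pipeline string would look like this (a sketch; `batch-size` and `nireq` are standard `gvadetect` properties, set to the values recommended above):

```python
# NPU variant of the sample's pipeline string; only the gvadetect
# stage changes relative to the GPU version.
pipeline_str = (
    "filesrc location=test_video.mp4 ! decodebin3 ! videoconvert ! "
    "gvadetect model=yolo26n_openvino_model/yolo26n.xml "
    "device=NPU batch-size=1 nireq=4 threshold=0.4 ! queue ! "
    "gvatrack tracking-type=short-term-imageless ! queue ! "
    "gvawatermark ! videoconvert ! video/x-raw,format=BGR ! "
    "appsink name=sink emit-signals=false sync=false"
)
```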
## License
Copyright (C) Intel Corporation. All rights reserved. Licensed under the MIT License. See LICENSE for details.