---
license: other
license_name: intel-custom
license_link: LICENSE
library_name: openvino
pipeline_tag: object-detection
tags:
- openvino
- intel
- yolo
- yolo26
- motion-tracking
- multi-object-tracking
- bot-sort
- edge-ai
- metro
- dlstreamer
datasets:
- detection-datasets/coco
language:
- en
---
# Motion Tracking
| Property | Value |
|---|---|
| **Category** | Object Detection + Multi-Object Tracking |
| **Base Model** | [YOLO26](https://docs.ultralytics.com/models/yolo26/) (Ultralytics) + [BoT-SORT](https://github.com/NirAharon/BoT-SORT) tracker |
| **Source Framework** | PyTorch (Ultralytics) |
| **Supported Precisions** | FP32, FP16, INT8 (mixed-precision) |
| **Inference Engine** | OpenVINO |
| **Hardware** | CPU, GPU, NPU |
| **Detected Class(es)** | Configurable (default: all 80 COCO classes) |
---
## Overview
Motion Tracking is a Metro Analytics use case that detects objects and assigns persistent track IDs across frames, enabling trajectory analysis and temporal event detection.
It is built on [YOLO26](https://docs.ultralytics.com/models/yolo26/), a state-of-the-art real-time object detector exported to OpenVINO IR (FP32, FP16, or INT8), paired with a multi-object tracker:
- **DLStreamer pipeline:** YOLO26 FP16 detection via `gvadetect` + `gvatrack` element with `tracking-type=short-term-imageless`.
Each detected object receives a unique `track_id` that persists across frames as long as the object remains visible.
Outputs include per-object trajectories suitable for path analysis, dwell-time computation, and zone-based event triggers.
Typical Metro deployments include:
- **Pedestrian Trajectory Analysis** -- map walking paths through stations for flow optimization.
- **Dwell-Time Measurement** -- measure how long individuals stay in specific zones.
- **Zone-Based Event Detection** -- trigger alerts when tracked objects enter or exit defined areas (see the sketch after this list).
- **Traffic Flow Analytics** -- track vehicles through intersections for signal timing optimization.
- **Incident Replay** -- reconstruct object paths for post-event forensic review.
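For the zone-based case, a minimal sketch (not part of the catalog scripts; the `ZONE` polygon and `on_track` helper are hypothetical) shows how persistent track IDs turn per-frame detections into one-shot entry events:
```python
import cv2
import numpy as np

# Hypothetical zone polygon in pixel coordinates.
ZONE = np.array([(100, 200), (400, 200), (400, 500), (100, 500)], dtype=np.int32)
inside: set[int] = set()  # track IDs currently inside the zone

def on_track(track_id: int, cx: int, cy: int) -> None:
    """Call once per tracked object per frame with its bounding-box center."""
    is_inside = cv2.pointPolygonTest(ZONE, (float(cx), float(cy)), False) >= 0
    if is_inside and track_id not in inside:
        print(f"Track {track_id} entered zone")  # fire the entry event once
    if is_inside:
        inside.add(track_id)
    else:
        inside.discard(track_id)
```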
Available YOLO26 variants: `yolo26n`, `yolo26s`, `yolo26m`, `yolo26l`, `yolo26x`.
Smaller variants (`yolo26n`, `yolo26s`) are recommended for high-FPS edge deployment.
The default tracker is BoT-SORT; ByteTrack is available as an alternative with lower computational overhead.
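For reference, tracker selection in Ultralytics' track mode looks roughly like the sketch below (assuming the `ultralytics` package and the same `yolo26n` weights the export script downloads); `tracker=` switches between the bundled `botsort.yaml` (default) and `bytetrack.yaml` configurations:
```python
from ultralytics import YOLO

model = YOLO("yolo26n.pt")  # same weights the export script downloads
# stream=True yields per-frame results; persistent track IDs are in boxes.id
# (None on frames where the tracker has no confirmed tracks).
for result in model.track("test_video.mp4", tracker="botsort.yaml", stream=True):
    if result.boxes.id is not None:
        print(result.boxes.id.int().tolist())
```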
---
## Prerequisites
- Python 3.11+
- `ffmpeg` (`sudo apt install ffmpeg`) -- used by the sample to encode the output video
- [Install OpenVINO](https://docs.openvino.ai/2026/get-started/install-openvino.html) (latest version)
- [Install Intel DLStreamer](https://docs.openedgeplatform.intel.com/2026.0/edge-ai-libraries/dlstreamer/get_started/install/install_guide_ubuntu.html) (latest version)
Create and activate a Python virtual environment before running the scripts:
```bash
python3 -m venv .venv --system-site-packages
source .venv/bin/activate
```
> **Note:** The `--system-site-packages` flag is required so the virtual
> environment can access the system-installed OpenVINO and DLStreamer Python
> packages.
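An optional sanity check (not one of the provided scripts) to confirm the environment is wired up correctly; if the `gstgva` import fails, see the `PYTHONPATH` note in the DLStreamer sample below:
```python
# Run inside the activated venv: all three imports must resolve from the
# system site-packages for the sample below to work.
import openvino as ov
import gi

gi.require_version("Gst", "1.0")
from gi.repository import Gst
from gstgva import VideoFrame  # DLStreamer Python bindings

Gst.init(None)
print("OpenVINO:", ov.__version__)
print("GStreamer:", Gst.version_string())
```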
---
## Getting Started
### Download and Quantize Model
Run the provided script to download, export to OpenVINO IR, and optionally quantize:
```bash
chmod +x export_and_quantize.sh
./export_and_quantize.sh
```
This exports the default **yolo26n** model in **FP16** precision.
#### Optional: Select a Different Variant or Precision
```bash
./export_and_quantize.sh yolo26n FP32 # full-precision
./export_and_quantize.sh yolo26n INT8 # quantized
./export_and_quantize.sh yolo26s # larger variant, default FP16
```
Replace `yolo26n` with any variant (`yolo26s`, `yolo26m`, `yolo26l`, `yolo26x`).
The second argument selects the precision (`FP32`, `FP16`, `INT8`); the default is **FP16**.
The script performs the following steps:
1. Installs dependencies (`openvino`, `ultralytics`; adds `nncf` for INT8).
2. Downloads the sample test video (`test_video.mp4`) and a sample test image (`test.jpg`).
3. Downloads the PyTorch weights and exports to OpenVINO IR.
4. *(INT8 only)* Quantizes the model using NNCF post-training quantization.
Output files:
- `yolo26n_openvino_model/` -- FP32 or FP16 OpenVINO IR model directory.
- `yolo26n_tracking_int8.xml` / `yolo26n_tracking_int8.bin` -- INT8 quantized model *(only when `INT8` is selected)*.
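Under the hood, the export step is roughly equivalent to Ultralytics' standard OpenVINO export API (a sketch, not a verbatim excerpt of the script):
```python
from ultralytics import YOLO

model = YOLO("yolo26n.pt")                  # downloads the weights if missing
model.export(format="openvino", half=True)  # FP16 -> yolo26n_openvino_model/
# half=False keeps FP32; for INT8 the script additionally runs NNCF
# post-training quantization (see the sketch after the table below).
```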
#### Precision / Device Compatibility
| Precision | CPU | GPU | NPU |
|---|---|---|---|
| FP32 | Yes | Yes | No |
| FP16 | Yes | Yes | Yes |
| INT8 | Yes | Yes | Yes |
> **Note:** The INT8 calibration uses frames from the bundled sample video.
> For production accuracy, replace it with a representative set of frames from
> the target deployment site.
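For site-specific calibration, the NNCF post-training step can be reproduced along these lines (a sketch assuming the FP32/FP16 IR from the export step and a 640x640 input with 0-1 normalization; adjust the preprocessing to match your export settings):
```python
import cv2
import numpy as np
import nncf
import openvino as ov

core = ov.Core()
model = core.read_model("yolo26n_openvino_model/yolo26n.xml")

def read_frames(path: str, limit: int = 300):
    # A few hundred frames is typically enough for calibration.
    cap = cv2.VideoCapture(path)
    while limit > 0:
        ok, img = cap.read()
        if not ok:
            break
        limit -= 1
        yield img
    cap.release()

def preprocess(img: np.ndarray) -> np.ndarray:
    # Resize to the model input, BGR->RGB, HWC->NCHW, scale to [0, 1].
    img = cv2.resize(img, (640, 640))
    return img[:, :, ::-1].transpose(2, 0, 1)[None].astype(np.float32) / 255.0

# Swap test_video.mp4 for footage from the deployment site in production.
calib = nncf.Dataset(list(read_frames("test_video.mp4")), preprocess)
ov.save_model(nncf.quantize(model, calib), "yolo26n_tracking_int8.xml")
```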
### DLStreamer Sample
The pipeline below runs the YOLO26 FP16 detector via `gvadetect` on
`test_video.mp4`, attaches persistent track IDs with `gvatrack`
(`short-term-imageless` tracker), and overlays bounding boxes with
`gvawatermark`. Frames are pulled from an `appsink`, per-track trajectory
polylines are drawn with OpenCV, and the result is muxed to `output_dlstreamer.mp4`
(H.264 via ffmpeg).
> **Notes on running this sample:**
>
> - Use the FP16 IR (`yolo26n_openvino_model/yolo26n.xml`). Class names are
> read automatically from the model's embedded `metadata.yaml` by
> DLStreamer 2026.0+ -- no external `labels-file` is required.
> - Export `PYTHONPATH` so the DLStreamer Python module is importable:
>
> ```bash
> source /opt/intel/openvino_2026/setupvars.sh
> source /opt/intel/dlstreamer/scripts/setup_dls_env.sh
> export PYTHONPATH=/opt/intel/dlstreamer/python:\
> /opt/intel/dlstreamer/gstreamer/lib/python3/dist-packages:${PYTHONPATH:-}
> ```
```python
import subprocess
from collections import defaultdict
import cv2
import numpy as np
import gi
gi.require_version("Gst", "1.0")
from gi.repository import Gst
from gstgva import VideoFrame
Gst.init(None)
# For CPU: change device=GPU to device=CPU.
# For NPU: change device=GPU to device=NPU (batch-size=1, nireq=4 recommended).
pipeline_str = (
"filesrc location=test_video.mp4 ! decodebin3 ! "
"videoconvert ! "
"gvadetect model=yolo26n_openvino_model/yolo26n.xml "
"device=GPU "
"threshold=0.4 ! queue ! "
"gvatrack tracking-type=short-term-imageless ! queue ! "
"gvawatermark ! appsink name=sink emit-signals=false sync=false"
)
pipeline = Gst.parse_launch(pipeline_str)
appsink = pipeline.get_by_name("sink")
# Distinct colors for trajectory lines (one per track ID).
COLORS = [
(255, 0, 0), (0, 255, 0), (0, 0, 255), (255, 255, 0),
(255, 0, 255), (0, 255, 255), (128, 0, 255), (255, 128, 0),
]
track_history: dict[int, list[tuple[int, int]]] = defaultdict(list)
pipeline.set_state(Gst.State.PLAYING)
proc = None
while True:
sample = appsink.emit("pull-sample")
if sample is None:
break
buf = sample.get_buffer()
caps = sample.get_caps()
struct = caps.get_structure(0)
width = struct.get_value("width")
height = struct.get_value("height")
# Start ffmpeg encoder on the first frame.
if proc is None:
ok, fps_num, fps_den = struct.get_fraction("framerate")
fps = fps_num / fps_den if ok and fps_den > 0 else 30.0
proc = subprocess.Popen(
["ffmpeg", "-y", "-f", "rawvideo", "-pix_fmt", "bgr24",
"-s", f"{width}x{height}", "-r", str(fps),
"-i", "pipe:0", "-c:v", "libx264", "-pix_fmt", "yuv420p",
"-movflags", "+faststart", "output_dlstreamer.mp4"],
stdin=subprocess.PIPE, stderr=subprocess.DEVNULL,
)
# Read detection / tracking metadata.
frame = VideoFrame(buf, caps=caps)
regions_data = []
for region in frame.regions():
tid = region.object_id()
label = region.label()
rect = region.rect()
cx = int(rect.x + rect.w / 2)
cy = int(rect.y + rect.h / 2)
regions_data.append((tid, label, cx, cy))
# Map buffer read-only and copy pixels to a writable numpy array.
success, map_info = buf.map(Gst.MapFlags.READ)
if not success:
continue
arr = np.ndarray((height, width, 3), dtype=np.uint8,
buffer=map_info.data).copy()
buf.unmap(map_info)
# Draw per-track trajectory polylines on the frame copy.
for tid, label, cx, cy in regions_data:
track = track_history[tid]
track.append((cx, cy))
if len(track) > 30:
track.pop(0)
color = COLORS[tid % len(COLORS)]
pts = np.array(track, dtype=np.int32).reshape((-1, 1, 2))
cv2.polylines(arr, [pts], False, color, 2)
print(f" Track {tid}: {label} center=({cx},{cy})", flush=True)
proc.stdin.write(arr.tobytes())
pipeline.set_state(Gst.State.NULL)
if proc:
proc.stdin.close()
proc.wait()
print("Wrote output_dlstreamer.mp4", flush=True)
```
#### Expected Output
![DLStreamer expected output](expected_output_dlstreamer.gif)
**Device targets:**
- `device=GPU` -- default in the sample code.
- `device=CPU` -- change `device=GPU` to `device=CPU`.
- `device=NPU` -- change `device=GPU` to `device=NPU`; use `batch-size=1` and `nireq=4` for best NPU utilization.
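To switch devices without editing the pipeline string, one option is to parametrize the `gvadetect` element (the `DEVICE` environment variable here is illustrative, not part of the sample):
```python
import os

device = os.environ.get("DEVICE", "GPU")  # GPU | CPU | NPU
npu_opts = "batch-size=1 nireq=4 " if device == "NPU" else ""
gvadetect = (
    "gvadetect model=yolo26n_openvino_model/yolo26n.xml "
    f"device={device} {npu_opts}threshold=0.4"
)
print(gvadetect)  # splice into pipeline_str in place of the gvadetect element
```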
---
## License
Copyright (C) Intel Corporation. All rights reserved.
Licensed under the terms in [LICENSE](LICENSE).
## References
- [YOLO26 Documentation](https://docs.ultralytics.com/models/yolo26/)
- [Ultralytics Multi-Object Tracking](https://docs.ultralytics.com/modes/track/)
- [BoT-SORT Tracker](https://github.com/NirAharon/BoT-SORT)
- [ByteTrack Tracker](https://github.com/FoundationVision/ByteTrack)
- [Intel DLStreamer gvatrack](https://docs.openedgeplatform.intel.com/2026.0/edge-ai-libraries/dlstreamer/elements/gvatrack.html)
- [OpenVINO Documentation](https://docs.openvino.ai/)
- [NNCF Post-Training Quantization](https://docs.openvino.ai/latest/nncf_ptq_introduction.html)
- [Intel DLStreamer](https://docs.openedgeplatform.intel.com/2026.0/edge-ai-libraries/dlstreamer/index.html)