---
license: other
license_name: intel-custom
license_link: LICENSE
library_name: openvino
pipeline_tag: object-detection
tags:
- openvino
- intel
- yolo
- yolo26
- motion-tracking
- multi-object-tracking
- bot-sort
- edge-ai
- metro
- dlstreamer
datasets:
- detection-datasets/coco
language:
- en
---
# Motion Tracking
| Property | Value |
|---|---|
| **Category** | Object Detection + Multi-Object Tracking |
| **Base Model** | [YOLO26](https://docs.ultralytics.com/models/yolo26/) (Ultralytics) + [BoT-SORT](https://github.com/NirAharon/BoT-SORT) tracker |
| **Source Framework** | PyTorch (Ultralytics) |
| **Supported Precisions** | FP32, FP16, INT8 (mixed-precision) |
| **Inference Engine** | OpenVINO |
| **Hardware** | CPU, GPU, NPU |
| **Detected Class(es)** | Configurable (default: all 80 COCO classes) |
---
## Overview
Motion Tracking is a Metro Analytics use case that detects objects and assigns persistent track IDs across frames, enabling trajectory analysis and temporal event detection.
It is built on [YOLO26](https://docs.ultralytics.com/models/yolo26/), a state-of-the-art real-time object detector exported to OpenVINO IR (FP32, FP16, or INT8), paired with a multi-object tracker:
- **DLStreamer pipeline:** YOLO26 FP16 detection via `gvadetect` + `gvatrack` element with `tracking-type=short-term-imageless`.
Each detected object receives a unique `track_id` that persists across frames as long as the object remains visible.
Outputs include per-object trajectories suitable for path analysis, dwell-time computation, and zone-based event triggers.
Typical Metro deployments include:
- **Pedestrian Trajectory Analysis** -- map walking paths through stations for flow optimization.
- **Dwell-Time Measurement** -- measure how long individuals stay in specific zones.
- **Zone-Based Event Detection** -- trigger alerts when tracked objects enter or exit defined areas (see the sketch after this list).
- **Traffic Flow Analytics** -- track vehicles through intersections for signal timing optimization.
- **Incident Replay** -- reconstruct object paths for post-event forensic review.
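For the zone-based case, a minimal sketch (not part of the catalog scripts; the `ZONE` polygon and `on_track` helper are hypothetical) shows how persistent track IDs turn per-frame detections into one-shot entry events:
```python
import cv2
import numpy as np

# Hypothetical zone polygon in pixel coordinates.
ZONE = np.array([(100, 200), (400, 200), (400, 500), (100, 500)], dtype=np.int32)
inside: set[int] = set()  # track IDs currently inside the zone

def on_track(track_id: int, cx: int, cy: int) -> None:
    """Call once per tracked object per frame with its bounding-box center."""
    is_inside = cv2.pointPolygonTest(ZONE, (float(cx), float(cy)), False) >= 0
    if is_inside and track_id not in inside:
        print(f"Track {track_id} entered zone")  # fire the entry event once
    if is_inside:
        inside.add(track_id)
    else:
        inside.discard(track_id)
```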
Available YOLO26 variants: `yolo26n`, `yolo26s`, `yolo26m`, `yolo26l`, `yolo26x`.
Smaller variants (`yolo26n`, `yolo26s`) are recommended for high-FPS edge deployment.
The default tracker is BoT-SORT; ByteTrack is available as an alternative with lower computational overhead.
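For reference, tracker selection in Ultralytics' track mode looks roughly like the sketch below (assuming the `ultralytics` package and the same `yolo26n` weights the export script downloads); `tracker=` switches between the bundled `botsort.yaml` (default) and `bytetrack.yaml` configurations:
```python
from ultralytics import YOLO

model = YOLO("yolo26n.pt")  # same weights the export script downloads
# stream=True yields per-frame results; persistent track IDs are in boxes.id
# (None on frames where the tracker has no confirmed tracks).
for result in model.track("test_video.mp4", tracker="botsort.yaml", stream=True):
    if result.boxes.id is not None:
        print(result.boxes.id.int().tolist())
```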
---
## Prerequisites
- Python 3.11+
- `ffmpeg` (`sudo apt install ffmpeg`) -- used by the sample to encode the output video
- [Install OpenVINO](https://docs.openvino.ai/2026/get-started/install-openvino.html) (latest version)
- [Install Intel DLStreamer](https://docs.openedgeplatform.intel.com/2026.0/edge-ai-libraries/dlstreamer/get_started/install/install_guide_ubuntu.html) (latest version)
Create and activate a Python virtual environment before running the scripts:
```bash
python3 -m venv .venv --system-site-packages
source .venv/bin/activate
```
> **Note:** The `--system-site-packages` flag is required so the virtual
> environment can access the system-installed OpenVINO and DLStreamer Python
> packages.
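An optional sanity check (not one of the provided scripts) to confirm the environment is wired up correctly; if the `gstgva` import fails, see the `PYTHONPATH` note in the DLStreamer sample below:
```python
# Run inside the activated venv: all three imports must resolve from the
# system site-packages for the sample below to work.
import openvino as ov
import gi

gi.require_version("Gst", "1.0")
from gi.repository import Gst
from gstgva import VideoFrame  # DLStreamer Python bindings

Gst.init(None)
print("OpenVINO:", ov.__version__)
print("GStreamer:", Gst.version_string())
```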
---
## Getting Started
### Download and Quantize Model
Run the provided script to download, export to OpenVINO IR, and optionally quantize:
```bash
chmod +x export_and_quantize.sh
./export_and_quantize.sh
```
This exports the default **yolo26n** model in **FP16** precision.
#### Optional: Select a Different Variant or Precision
```bash
./export_and_quantize.sh yolo26n FP32 # full-precision
./export_and_quantize.sh yolo26n INT8 # quantized
./export_and_quantize.sh yolo26s # larger variant, default FP16
```
Replace `yolo26n` with any variant (`yolo26s`, `yolo26m`, `yolo26l`, `yolo26x`).
The second argument selects the precision (`FP32`, `FP16`, `INT8`); the default is **FP16**.
The script performs the following steps:
1. Installs dependencies (`openvino`, `ultralytics`; adds `nncf` for INT8).
2. Downloads the sample test video (`test_video.mp4`) and a sample test image (`test.jpg`).
3. Downloads the PyTorch weights and exports to OpenVINO IR.
4. *(INT8 only)* Quantizes the model using NNCF post-training quantization.
Output files:
- `yolo26n_openvino_model/` -- FP32 or FP16 OpenVINO IR model directory.
- `yolo26n_tracking_int8.xml` / `yolo26n_tracking_int8.bin` -- INT8 quantized model *(only when `INT8` is selected)*.
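Under the hood, the export step is roughly equivalent to Ultralytics' standard OpenVINO export API (a sketch, not a verbatim excerpt of the script):
```python
from ultralytics import YOLO

model = YOLO("yolo26n.pt")                  # downloads the weights if missing
model.export(format="openvino", half=True)  # FP16 -> yolo26n_openvino_model/
# half=False keeps FP32; for INT8 the script additionally runs NNCF
# post-training quantization (see the sketch after the table below).
```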
#### Precision / Device Compatibility
| Precision | CPU | GPU | NPU |
|---|---|---|---|
| FP32 | Yes | Yes | No |
| FP16 | Yes | Yes | Yes |
| INT8 | Yes | Yes | Yes |
> **Note:** The INT8 calibration uses frames from the bundled sample video.
> For production accuracy, replace it with a representative set of frames from
> the target deployment site.
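For site-specific calibration, the NNCF post-training step can be reproduced along these lines (a sketch assuming the FP32/FP16 IR from the export step and a 640x640 input with 0-1 normalization; adjust the preprocessing to match your export settings):
```python
import cv2
import numpy as np
import nncf
import openvino as ov

core = ov.Core()
model = core.read_model("yolo26n_openvino_model/yolo26n.xml")

def read_frames(path: str, limit: int = 300):
    # A few hundred frames is typically enough for calibration.
    cap = cv2.VideoCapture(path)
    while limit > 0:
        ok, img = cap.read()
        if not ok:
            break
        limit -= 1
        yield img
    cap.release()

def preprocess(img: np.ndarray) -> np.ndarray:
    # Resize to the model input, BGR->RGB, HWC->NCHW, scale to [0, 1].
    img = cv2.resize(img, (640, 640))
    return img[:, :, ::-1].transpose(2, 0, 1)[None].astype(np.float32) / 255.0

# Swap test_video.mp4 for footage from the deployment site in production.
calib = nncf.Dataset(list(read_frames("test_video.mp4")), preprocess)
ov.save_model(nncf.quantize(model, calib), "yolo26n_tracking_int8.xml")
```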
### DLStreamer Sample
The pipeline below runs the YOLO26 FP16 detector via `gvadetect` on
`test_video.mp4`, attaches persistent track IDs with `gvatrack`
(`short-term-imageless` tracker), and overlays bounding boxes with
`gvawatermark`. Frames are pulled from an `appsink`, per-track trajectory
polylines are drawn with OpenCV, and the result is muxed to `output_dlstreamer.mp4`
(H.264 via ffmpeg).
> **Notes on running this sample:**
>
> - Use the FP16 IR (`yolo26n_openvino_model/yolo26n.xml`). Class names are
> read automatically from the model's embedded `metadata.yaml` by
> DLStreamer 2026.0+ -- no external `labels-file` is required.
> - Export `PYTHONPATH` so the DLStreamer Python module is importable:
>
> ```bash
> source /opt/intel/openvino_2026/setupvars.sh
> source /opt/intel/dlstreamer/scripts/setup_dls_env.sh
> export PYTHONPATH=/opt/intel/dlstreamer/python:\
> /opt/intel/dlstreamer/gstreamer/lib/python3/dist-packages:${PYTHONPATH:-}
> ```
```python
import subprocess
from collections import defaultdict
import cv2
import numpy as np
import gi
gi.require_version("Gst", "1.0")
from gi.repository import Gst
from gstgva import VideoFrame
Gst.init(None)
# For CPU: change device=GPU to device=CPU.
# For NPU: change device=GPU to device=NPU (batch-size=1, nireq=4 recommended).
pipeline_str = (
"filesrc location=test_video.mp4 ! decodebin3 ! "
"videoconvert ! "
"gvadetect model=yolo26n_openvino_model/yolo26n.xml "
"device=GPU "
"threshold=0.4 ! queue ! "
"gvatrack tracking-type=short-term-imageless ! queue ! "
"gvawatermark ! appsink name=sink emit-signals=false sync=false"
)
pipeline = Gst.parse_launch(pipeline_str)
appsink = pipeline.get_by_name("sink")
# Distinct colors for trajectory lines (one per track ID).
COLORS = [
(255, 0, 0), (0, 255, 0), (0, 0, 255), (255, 255, 0),
(255, 0, 255), (0, 255, 255), (128, 0, 255), (255, 128, 0),
]
track_history: dict[int, list[tuple[int, int]]] = defaultdict(list)
pipeline.set_state(Gst.State.PLAYING)
proc = None
while True:
sample = appsink.emit("pull-sample")
if sample is None:
break
buf = sample.get_buffer()
caps = sample.get_caps()
struct = caps.get_structure(0)
width = struct.get_value("width")
height = struct.get_value("height")
# Start ffmpeg encoder on the first frame.
if proc is None:
ok, fps_num, fps_den = struct.get_fraction("framerate")
fps = fps_num / fps_den if ok and fps_den > 0 else 30.0
proc = subprocess.Popen(
["ffmpeg", "-y", "-f", "rawvideo", "-pix_fmt", "bgr24",
"-s", f"{width}x{height}", "-r", str(fps),
"-i", "pipe:0", "-c:v", "libx264", "-pix_fmt", "yuv420p",
"-movflags", "+faststart", "output_dlstreamer.mp4"],
stdin=subprocess.PIPE, stderr=subprocess.DEVNULL,
)
# Read detection / tracking metadata.
frame = VideoFrame(buf, caps=caps)
regions_data = []
for region in frame.regions():
tid = region.object_id()
label = region.label()
rect = region.rect()
cx = int(rect.x + rect.w / 2)
cy = int(rect.y + rect.h / 2)
regions_data.append((tid, label, cx, cy))
# Map buffer read-only and copy pixels to a writable numpy array.
success, map_info = buf.map(Gst.MapFlags.READ)
if not success:
continue
arr = np.ndarray((height, width, 3), dtype=np.uint8,
buffer=map_info.data).copy()
buf.unmap(map_info)
# Draw per-track trajectory polylines on the frame copy.
for tid, label, cx, cy in regions_data:
track = track_history[tid]
track.append((cx, cy))
if len(track) > 30:
track.pop(0)
color = COLORS[tid % len(COLORS)]
pts = np.array(track, dtype=np.int32).reshape((-1, 1, 2))
cv2.polylines(arr, [pts], False, color, 2)
print(f" Track {tid}: {label} center=({cx},{cy})", flush=True)
proc.stdin.write(arr.tobytes())
pipeline.set_state(Gst.State.NULL)
if proc:
proc.stdin.close()
proc.wait()
print("Wrote output_dlstreamer.mp4", flush=True)
```
#### Expected Output
![DLStreamer expected output](expected_output_dlstreamer.gif)
**Device targets:**
- `device=GPU` -- default in the sample code.
- `device=CPU` -- change `device=GPU` to `device=CPU`.
- `device=NPU` -- change `device=GPU` to `device=NPU`; use `batch-size=1` and `nireq=4` for best NPU utilization.
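To switch devices without editing the pipeline string, one option is to parametrize the `gvadetect` element (the `DEVICE` environment variable here is illustrative, not part of the sample):
```python
import os

device = os.environ.get("DEVICE", "GPU")  # GPU | CPU | NPU
npu_opts = "batch-size=1 nireq=4 " if device == "NPU" else ""
gvadetect = (
    "gvadetect model=yolo26n_openvino_model/yolo26n.xml "
    f"device={device} {npu_opts}threshold=0.4"
)
print(gvadetect)  # splice into pipeline_str in place of the gvadetect element
```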
---
## License
Copyright (C) Intel Corporation. All rights reserved.
Licensed under the terms in [LICENSE](LICENSE).
## References
- [YOLO26 Documentation](https://docs.ultralytics.com/models/yolo26/)
- [Ultralytics Multi-Object Tracking](https://docs.ultralytics.com/modes/track/)
- [BoT-SORT Tracker](https://github.com/NirAharon/BoT-SORT)
- [ByteTrack Tracker](https://github.com/FoundationVision/ByteTrack)
- [Intel DLStreamer gvatrack](https://docs.openedgeplatform.intel.com/2026.0/edge-ai-libraries/dlstreamer/elements/gvatrack.html)
- [OpenVINO Documentation](https://docs.openvino.ai/)
- [NNCF Post-Training Quantization](https://docs.openvino.ai/latest/nncf_ptq_introduction.html)
- [Intel DLStreamer](https://docs.openedgeplatform.intel.com/2026.0/edge-ai-libraries/dlstreamer/index.html)