Sync motion-tracking from metro-analytics-catalog

Files changed:

- .gitattributes +2 -0
- README.md +18 -133
- expected_output_dlstreamer.gif +3 -0
- expected_output_openvino.gif +3 -0
- export_and_quantize.sh +8 -0
.gitattributes
CHANGED

```diff
@@ -33,3 +33,5 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.zip filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
+expected_output_dlstreamer.gif filter=lfs diff=lfs merge=lfs -text
+expected_output_openvino.gif filter=lfs diff=lfs merge=lfs -text
```
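The two added rules can be sanity-checked with `git check-attr` in a scratch repository. The sketch below is illustrative only (the temporary directory and the standalone `.gitattributes` are not part of the catalog); it assumes `git` is installed:

```shell
# Scratch repo to confirm the new rules route both GIFs through the LFS filter.
set -e
dir="$(mktemp -d)"
cd "${dir}"
git init -q .
printf '%s\n' \
  'expected_output_dlstreamer.gif filter=lfs diff=lfs merge=lfs -text' \
  'expected_output_openvino.gif filter=lfs diff=lfs merge=lfs -text' \
  > .gitattributes
# Prints: expected_output_dlstreamer.gif: filter: lfs
git check-attr filter expected_output_dlstreamer.gif
```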
README.md
CHANGED

````diff
@@ -1,7 +1,5 @@
 # Motion Tracking
 
-> **Validated with:** OpenVINO 2026.1.0, NNCF 3.0.0, DLStreamer 2026.0, Ultralytics 8.4.46, Python 3.11+
-
 | Property | Value |
 |---|---|
 | **Category** | Object Detection + Multi-Object Tracking |
@@ -19,7 +17,6 @@
 Motion Tracking is a Metro Analytics use case that detects objects and assigns persistent track IDs across frames, enabling trajectory analysis and temporal event detection.
 It is built on [YOLO26](https://docs.ultralytics.com/models/yolo26/), a state-of-the-art real-time object detector quantized to INT8, paired with a multi-object tracker:
 
-- **OpenVINO pipeline:** YOLO26 INT8 detection + Ultralytics built-in [BoT-SORT](https://github.com/NirAharon/BoT-SORT) or [ByteTrack](https://github.com/FoundationVision/ByteTrack) tracker via `model.track()`.
 - **DLStreamer pipeline:** YOLO26 FP16 detection via `gvadetect` + `gvatrack` element with `tracking-type=short-term-imageless`.
 
 Each detected object receives a unique `track_id` that persists across frames as long as the object remains visible.
@@ -42,7 +39,7 @@ The default tracker is BoT-SORT; ByteTrack is available as an alternative with l
 ## Prerequisites
 
 - Python 3.11+
-- `ffmpeg` (`sudo apt install ffmpeg`)
+- `ffmpeg` (`sudo apt install ffmpeg`) -- used by both samples to encode output video
 - [Install OpenVINO](https://docs.openvino.ai/2026/get-started/install-openvino.html) (latest version)
 - [Install Intel DLStreamer](https://docs.openedgeplatform.intel.com/2026.0/edge-ai-libraries/dlstreamer/get_started/install/install_guide_ubuntu.html) (latest version)
 
@@ -86,7 +83,7 @@ The second argument selects the precision (`FP32`, `FP16`, `INT8`); the default
 The script performs the following steps:
 
 1. Installs dependencies (`openvino`, `ultralytics`; adds `nncf` for INT8).
-2. Downloads the sample test video (`test_video.mp4`).
+2. Downloads the sample test video (`test_video.mp4`) and a sample test image (`test.jpg`).
 3. Downloads the PyTorch weights and exports to OpenVINO IR.
 4. *(INT8 only)* Quantizes the model using NNCF post-training quantization.
 
@@ -107,128 +104,13 @@ Output files:
 > For production accuracy, replace it with a representative set of frames from
 > the target deployment site.
 
-### OpenVINO Sample
-
-The sample below uses the Ultralytics `model.track()` API with the PyTorch
-weights to detect and track objects in a video, assigning persistent track IDs
-via the built-in BoT-SORT tracker.
-Each annotated frame -- with bounding boxes, track IDs, and per-track
-trajectory polylines -- is written to `output.mp4`.
-
-> **Important:** The `model.track()` API requires PyTorch weights (`.pt`).
-> Using the OpenVINO model directory with `model.track()` produces zero
-> detections in Ultralytics 8.4.x due to an incompatibility in the tracker
-> integration. Use `model.predict()` for single-frame inference with the
-> OpenVINO backend, or use the DLStreamer sample below for OpenVINO-accelerated
-> tracking.
->
-> The INT8 model (`yolo26n_tracking_int8.xml`) can be used directly with the
-> OpenVINO Python API but not with the Ultralytics `YOLO()` wrapper.
-
-```python
-import subprocess
-from collections import defaultdict
-
-import cv2
-import numpy as np
-from ultralytics import YOLO
-
-# Use PyTorch weights for tracking -- model.track() requires the .pt backend.
-# The OpenVINO model directory works with model.predict() but not model.track().
-model = YOLO("yolo26n.pt", task="detect")
-
-video_path = "test_video.mp4"
-cap = cv2.VideoCapture(video_path)
-fps = cap.get(cv2.CAP_PROP_FPS) or 30.0
-width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
-height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
-
-# Pipe frames to ffmpeg for H.264 output (universally playable).
-proc = subprocess.Popen(
-    ["ffmpeg", "-y", "-f", "rawvideo", "-pix_fmt", "bgr24",
-     "-s", f"{width}x{height}", "-r", str(fps),
-     "-i", "pipe:0", "-c:v", "libx264", "-pix_fmt", "yuv420p",
-     "-movflags", "+faststart", "output.mp4"],
-    stdin=subprocess.PIPE, stderr=subprocess.DEVNULL,
-)
-
-# Distinct colors for trajectory lines (one per track ID).
-COLORS = [
-    (255, 0, 0), (0, 255, 0), (0, 0, 255), (255, 255, 0),
-    (255, 0, 255), (0, 255, 255), (128, 0, 255), (255, 128, 0),
-]
-track_history: dict[int, list[tuple[float, float]]] = defaultdict(list)
-
-while cap.isOpened():
-    success, frame = cap.read()
-    if not success:
-        break
-
-    # Run YOLO26 tracking with BoT-SORT (default).
-    # Use tracker="bytetrack.yaml" for ByteTrack alternative.
-    results = model.track(frame, persist=True, conf=0.4, tracker="botsort.yaml")
-    result = results[0]
-
-    annotated = result.plot()
-
-    if result.boxes and result.boxes.is_track:
-        boxes = result.boxes.xywh.cpu()
-        track_ids = result.boxes.id.int().cpu().tolist()
-        classes = result.boxes.cls.int().cpu().tolist()
-
-        for box, track_id in zip(boxes, track_ids):
-            x, y, _w, _h = box
-            track = track_history[track_id]
-            track.append((float(x), float(y)))
-            if len(track) > 30:
-                track.pop(0)
-
-            color = COLORS[track_id % len(COLORS)]
-            points = np.array(track, dtype=np.int32).reshape((-1, 1, 2))
-            cv2.polylines(annotated, [points], False, color, 2)
-
-        for tid, cls_id in zip(track_ids, classes):
-            cx, cy = track_history[tid][-1]
-            print(f"  Track {tid}: class={cls_id} center=({cx:.0f},{cy:.0f})", flush=True)
-
-    proc.stdin.write(annotated.tobytes())
-
-cap.release()
-proc.stdin.close()
-proc.wait()
-print("Wrote output.mp4", flush=True)
-```
-
-**Device targets:**
-
-- Default runs on CPU via OpenVINO.
-- For GPU: set `device="gpu:0"` in the `model.track()` call.
-- For NPU: set `device="npu:0"` (validate availability with `benchmark_app -d NPU`).
-
-### Try It on a Sample Video
-
-The `export_and_quantize.sh` script downloads `test_video.mp4` automatically.
-Run the OpenVINO sample above.
-The script processes each frame, prints per-track positions to the console,
-and writes the annotated video to `output.mp4`.
-
-Expected console output (representative):
-
-```text
-  Track 1: class=0 center=(320,240)
-  Track 2: class=0 center=(450,300)
-```
-
-`output.mp4` shows bounding boxes with track IDs and colored trajectory
-polylines for each tracked object.
-
 ### DLStreamer Sample
 
 The pipeline below runs the YOLO26 FP16 detector via `gvadetect` on
 `test_video.mp4`, attaches persistent track IDs with `gvatrack`
 (`short-term-imageless` tracker), and overlays bounding boxes with
 `gvawatermark`. Frames are pulled from an `appsink`, per-track trajectory
-polylines are drawn with OpenCV, and the result is muxed to `
+polylines are drawn with OpenCV, and the result is muxed to `output_dlstreamer.mp4`
 (H.264 via ffmpeg).
 
 > **Notes on running this sample:**
@@ -259,15 +141,14 @@ from gstgva import VideoFrame
 
 Gst.init(None)
 
-# For 
-# 
-# pre-process-backend=vaapi-surface-sharing on gvadetect.
-# For NPU: change device=CPU to device=NPU (batch-size=1, nireq=4 recommended).
+# For CPU: change device=GPU to device=CPU.
+# For NPU: change device=GPU to device=NPU (batch-size=1, nireq=4 recommended).
 pipeline_str = (
-    "filesrc location=test_video.mp4 ! decodebin3 ! 
-    "
+    "filesrc location=test_video.mp4 ! decodebin3 ! "
+    "videoconvert ! "
     "gvadetect model=yolo26n_openvino_model/yolo26n.xml "
-    "device=
+    "device=GPU "
+    "threshold=0.4 ! queue ! "
     "gvatrack tracking-type=short-term-imageless ! queue ! "
     "gvawatermark ! appsink name=sink emit-signals=false sync=false"
 )
@@ -304,7 +185,7 @@ while True:
     ["ffmpeg", "-y", "-f", "rawvideo", "-pix_fmt", "bgr24",
      "-s", f"{width}x{height}", "-r", str(fps),
      "-i", "pipe:0", "-c:v", "libx264", "-pix_fmt", "yuv420p",
-     "-movflags", "+faststart", "
+     "-movflags", "+faststart", "output_dlstreamer.mp4"],
     stdin=subprocess.PIPE, stderr=subprocess.DEVNULL,
 )
 
@@ -344,14 +225,18 @@ pipeline.set_state(Gst.State.NULL)
 if proc:
     proc.stdin.close()
     proc.wait()
-print("Wrote 
+print("Wrote output_dlstreamer.mp4", flush=True)
 ```
 
+#### Expected Output
+
+
+
 **Device targets:**
 
-- `device=
-- `device=
-- `device=NPU` -- use `batch-size=1` and `nireq=4` for best NPU utilization.
+- `device=GPU` -- default in the sample code.
+- `device=CPU` -- change `device=GPU` to `device=CPU`.
+- `device=NPU` -- change `device=GPU` to `device=NPU`; use `batch-size=1` and `nireq=4` for best NPU utilization.
 
 ---
 
````
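Both the removed OpenVINO sample and the DLStreamer sample draw trajectory polylines from a bounded history of box centers (at most 30 recent points per track ID). A minimal self-contained sketch of that buffer logic, with the helper name `record` introduced here for illustration:

```python
from collections import defaultdict

# Per-track trajectory buffer: keep at most MAX_POINTS recent (x, y) centers.
MAX_POINTS = 30
track_history: dict[int, list[tuple[float, float]]] = defaultdict(list)

def record(track_id: int, x: float, y: float) -> list[tuple[float, float]]:
    track = track_history[track_id]
    track.append((x, y))
    if len(track) > MAX_POINTS:
        track.pop(0)  # drop the oldest point so the polyline stays bounded
    return track

# Simulate 40 frames of one track moving diagonally.
for i in range(40):
    record(1, float(i), float(i))

print(len(track_history[1]))   # 30 -- the 10 oldest points were dropped
print(track_history[1][0])     # (10.0, 10.0)
```

In the README samples this list is converted to an `int32` NumPy array and passed to `cv2.polylines` each frame; the cap keeps the drawn trail short and the memory per track constant.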
expected_output_dlstreamer.gif
ADDED (Git LFS)

expected_output_openvino.gif
ADDED (Git LFS)
export_and_quantize.sh
CHANGED

```diff
@@ -47,6 +47,14 @@ else
   echo "Already present: test_video.mp4"
 fi
 
+echo "--- Downloading sample test image ---"
+if [[ ! -f test.jpg ]]; then
+  wget -q -O test.jpg https://ultralytics.com/images/bus.jpg
+  echo "Downloaded: test.jpg"
+else
+  echo "Already present: test.jpg"
+fi
+
 if [[ "${PRECISION}" == "FP32" ]]; then
   HALF_FLAG="False"
   EXPORT_LABEL="FP32"
```