---
license: other
license_name: intel-custom
license_link: LICENSE
library_name: openvino
pipeline_tag: object-detection
tags:
- openvino
- intel
- yolo
- yolo26
- loitering-detection
- zone-analytics
- tracking
- edge-ai
- metro
- dlstreamer
datasets:
- detection-datasets/coco
language:
- en
---
# Loitering Detection
| Property | Value |
|---|---|
| **Category** | Object Detection + Tracking + Zone Analytics |
| **Source Framework** | PyTorch (Ultralytics) |
| **Supported Precisions** | FP32, FP16, INT8 (mixed-precision) |
| **Inference Engine** | OpenVINO |
| **Hardware** | CPU, GPU, NPU |
| **Detected Class** | `person` (COCO class 0) |
---
## Overview
Loitering Detection is a Metro Analytics use case that flags people who remain inside a configurable region of interest for longer than a dwell-time threshold.
It is built on [YOLO26](https://docs.ultralytics.com/models/yolo26/) for person detection, paired with a multi-object tracker that assigns persistent IDs across frames.
A polygon zone defines the area to monitor; for each tracked person whose bounding-box anchor falls inside the zone, the application accumulates dwell time and raises a loitering event when the threshold is exceeded.
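In pseudocode terms, the core rule reduces to a per-track dwell accumulator. The snippet below is a conceptual sketch only (names like `update` are illustrative); the full DLStreamer sample appears further down.
```python
# Conceptual model of the loitering rule: accumulate in-zone time per
# tracked person and flag once the dwell threshold is crossed.
from collections import defaultdict

LOITERING_SECONDS = 5.0     # dwell threshold (demo value)
dwell = defaultdict(float)  # track ID -> accumulated in-zone seconds

def update(track_id: int, in_zone: bool, dt: float) -> bool:
    """dt: seconds elapsed since this track was last observed."""
    if in_zone:
        dwell[track_id] += dt
    return dwell[track_id] >= LOITERING_SECONDS
```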
Typical Metro deployments include:
- **Restricted-Area Monitoring** -- raise alerts when a person lingers near tracks, equipment rooms, or after-hours zones.
- **Platform Edge Safety** -- detect prolonged presence inside a yellow-line buffer.
- **ATM and Ticketing Security** -- identify suspicious dwell at unattended kiosks.
- **Crowd-Free Zone Enforcement** -- monitor emergency exits and corridors that must remain clear.
Available variants: `yolo26n`, `yolo26s`, `yolo26m`, `yolo26l`, `yolo26x`.
Smaller variants (`yolo26n`, `yolo26s`) are recommended for high-FPS edge deployment.
---
## Prerequisites
- Python 3.11+
- [Install Intel DLStreamer](https://docs.openedgeplatform.intel.com/2026.0/edge-ai-libraries/dlstreamer/get_started/install/install_guide_ubuntu.html)
Create and activate a Python virtual environment before running the scripts:
```bash
python3 -m venv .venv --system-site-packages
source .venv/bin/activate
```
> **Note:** The `--system-site-packages` flag is required so the virtual
> environment can access the system-installed OpenVINO and DLStreamer Python
> packages.
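To confirm the environment actually inherits those packages, a quick check (assuming OpenVINO is installed system-wide):
```python
# Verify the venv can see the system OpenVINO package.
import openvino as ov

print(ov.__version__)
```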
---
## Getting Started
### Download and Quantize Model
Run the provided script to download, export to OpenVINO IR, and optionally quantize:
```bash
chmod +x export_and_quantize.sh
./export_and_quantize.sh
```
This exports the default **yolo26n** model in **FP16** precision.
#### Optional: Select a Different Variant or Precision
```bash
./export_and_quantize.sh yolo26n FP32 # full-precision
./export_and_quantize.sh yolo26n INT8 # quantized
./export_and_quantize.sh yolo26s # larger variant, default FP16
```
Replace `yolo26n` with any variant (`yolo26s`, `yolo26m`, `yolo26l`, `yolo26x`).
The second argument selects the precision (`FP32`, `FP16`, `INT8`); the default is **FP16**.
The script performs the following steps:
1. Installs dependencies (`openvino`, `ultralytics`; adds `nncf` for INT8).
2. Downloads the sample surveillance video (`VIRAT_S_000101.mp4`) from the Intel Metro AI Suite project into the current directory.
3. Downloads the PyTorch weights and exports to OpenVINO IR.
4. *(INT8 only)* Quantizes the model using NNCF post-training quantization.
Output files:
- `yolo26n_openvino_model/` -- FP32 or FP16 OpenVINO IR model directory.
- `yolo26n_loitering_int8.xml` / `yolo26n_loitering_int8.bin` -- INT8 quantized model *(only when `INT8` is selected)*.
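For reference, the core of the export step can be reproduced directly with the Ultralytics Python API. This is a minimal sketch of what the script automates for the FP16 path, assuming the script's default variant:
```python
# Sketch: the FP16 export that export_and_quantize.sh performs.
# The INT8 path additionally runs NNCF quantization (see below).
from ultralytics import YOLO

model = YOLO("yolo26n.pt")  # downloads weights on first use
model.export(
    format="openvino",      # writes yolo26n_openvino_model/
    half=True,              # FP16; use half=False for FP32
)
```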
#### Precision / Device Compatibility
| Precision | CPU | GPU | NPU |
|---|---|---|---|
| FP32 | Yes | Yes | No |
| FP16 | Yes | Yes | Yes |
| INT8 | Yes | Yes | Yes |
> **Note:** The INT8 calibration uses frames from the bundled sample video.
> For production accuracy, replace it with a representative set of frames from
> the target deployment site.
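A sketch of swapping in site-specific calibration data with NNCF post-training quantization follows. File names and the preprocessing are placeholders; the bundled script handles the default path:
```python
# Sketch: NNCF post-training quantization with custom calibration frames.
import cv2
import numpy as np
import nncf
import openvino as ov

core = ov.Core()
model = core.read_model("yolo26n_openvino_model/yolo26n.xml")

def transform_fn(frame):
    # Resize to the 640x640 model input, lay out as NCHW float32.
    # NOTE: adjust preprocessing (color order, normalization) to match
    # the exported model's expectations.
    img = cv2.resize(frame, (640, 640)).astype(np.float32) / 255.0
    return np.expand_dims(img.transpose(2, 0, 1), 0)

cap = cv2.VideoCapture("site_footage.mp4")  # representative site video
frames = []
while len(frames) < 300:
    ok, frame = cap.read()
    if not ok:
        break
    frames.append(frame)

quantized = nncf.quantize(model, nncf.Dataset(frames, transform_fn))
ov.save_model(quantized, "yolo26n_loitering_int8.xml")
```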
### Defining the Region of Interest
The zone is a rectangular ROI expressed as `x_min,y_min,x_max,y_max` in the
original input frame coordinates (not the 640x640 model input).
DLStreamer's `gvaattachroi` element attaches the ROI to every buffer, and
`gvadetect inference-region=1` (the `roi-list` mode) restricts inference to
that ROI, so no Python polygon math is required.
A typical surveillance-zone configuration on a 1280x720 source might be:
```text
roi=400,200,1100,650 # ROI for gvaattachroi (x_min,y_min,x_max,y_max)
LOITERING_SECONDS = 5.0 # dwell threshold, in seconds (demo value)
```
> **Note:** The sample uses a 5-second threshold so that loitering events are
> triggered quickly on the short demo video. For production deployments,
> increase this to 10--30 seconds depending on the site's operational
> requirements.
Per-person dwell time is measured at the bottom-center of the bounding box
(the foot anchor), which most closely approximates the person's ground position.
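If you need to reproduce that test outside the pipeline (for example, in post-processing), the check is a few lines. This is a sketch only; the sample itself relies on `gvaattachroi` instead:
```python
# Sketch: foot-anchor containment test against a rectangular zone.
def in_zone(box, roi):
    """box: (x, y, w, h); roi: (x_min, y_min, x_max, y_max)."""
    foot_x = box[0] + box[2] / 2  # bottom-center x
    foot_y = box[1] + box[3]      # bottom-center y (the foot anchor)
    return roi[0] <= foot_x <= roi[2] and roi[1] <= foot_y <= roi[3]
```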
### DLStreamer Sample
- The DLStreamer Python module is not on `sys.path` by default. Export `PYTHONPATH` before running:
```bash
source /opt/intel/openvino_2026/setupvars.sh
source /opt/intel/dlstreamer/scripts/setup_dls_env.sh
export PYTHONPATH=/opt/intel/dlstreamer/python:\
/opt/intel/dlstreamer/gstreamer/lib/python3/dist-packages:${PYTHONPATH:-}
```
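A quick way to confirm the bindings resolve before running the full sample:
```python
# Verify GStreamer and the DLStreamer Python bindings are importable.
import gi
gi.require_version("Gst", "1.0")
from gi.repository import Gst
from gstgva import VideoFrame  # fails if PYTHONPATH is not set as above

Gst.init(None)
print(Gst.version_string())
```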
**Video-based loitering detection** (requires video for dwell-time tracking):
```python
from collections import defaultdict

import gi

gi.require_version("Gst", "1.0")
gi.require_version("GstVideo", "1.0")
from gi.repository import Gst
from gstgva import VideoFrame

Gst.init(None)

MODEL_XML = "yolo26n_openvino_model/yolo26n.xml"
INPUT_VIDEO = "VIRAT_S_000101.mp4"
ROI = "0,200,300,400"  # x_min,y_min,x_max,y_max
LOITERING_SECONDS = 5.0

pipeline_str = (
    f"filesrc location={INPUT_VIDEO} ! decodebin3 ! "
    f"videoconvert ! "
    f"gvaattachroi roi={ROI} ! "
    f"gvadetect inference-region=1 model={MODEL_XML} device=GPU "
    f"threshold=0.5 ! queue ! "
    f"gvatrack tracking-type=short-term-imageless ! queue ! "
    f"gvametaconvert add-empty-results=true ! queue ! "
    f"gvafpscounter ! "
    f"gvawatermark ! videoconvert ! video/x-raw,format=I420 ! "
    f"openh264enc ! h264parse ! "
    f"mp4mux ! filesink name=sink location=output_dlstreamer.mp4"
)
pipeline = Gst.parse_launch(pipeline_str)

STALE_TIMEOUT = 2.0  # seconds of absence before clearing dwell state
dwell_state: dict[int, float] = defaultdict(float)
last_seen: dict[int, float] = {}
flagged: set[int] = set()

def on_buffer(pad, info):
    buf = info.get_buffer()
    caps = pad.get_current_caps()
    frame = VideoFrame(buf, caps=caps)
    now = buf.pts / Gst.SECOND if buf.pts != Gst.CLOCK_TIME_NONE else 0.0
    seen_ids: set[int] = set()
    for region in frame.regions():
        # gvaattachroi attaches a frame-level ROI region; skip it.
        if region.label() != "person":
            continue
        object_id = region.object_id()
        if object_id <= 0:
            continue
        rect = region.rect()
        foot_x = int(rect.x + rect.w / 2)
        foot_y = int(rect.y + rect.h)
        seen_ids.add(object_id)
        # gvadetect inference-region=1 already constrains detections to the
        # gvaattachroi zone, so every tracked person here is "in zone".
        prev = last_seen.get(object_id, now)
        dwell_state[object_id] += now - prev
        last_seen[object_id] = now
        if (
            dwell_state[object_id] >= LOITERING_SECONDS
            and object_id not in flagged
        ):
            flagged.add(object_id)
            print(
                f"LOITERING id={object_id} "
                f"dwell={dwell_state[object_id]:.1f}s "
                f"anchor=({foot_x},{foot_y})",
                flush=True,
            )
    # Clean up stale tracks after STALE_TIMEOUT seconds of absence.
    # Keep flagged entries to prevent duplicate alerts when a person
    # briefly disappears (occlusion / tracker jitter) and reappears.
    for stale in list(dwell_state):
        if stale not in seen_ids:
            elapsed_since = now - last_seen.get(stale, now)
            if elapsed_since > STALE_TIMEOUT:
                dwell_state.pop(stale, None)
                last_seen.pop(stale, None)
    return Gst.PadProbeReturn.OK

sink = pipeline.get_by_name("sink")
sink_pad = sink.get_static_pad("sink")
sink_pad.add_probe(Gst.PadProbeType.BUFFER, on_buffer)

pipeline.set_state(Gst.State.PLAYING)
bus = pipeline.get_bus()
bus.timed_pop_filtered(
    Gst.CLOCK_TIME_NONE,
    Gst.MessageType.EOS | Gst.MessageType.ERROR,
)
pipeline.set_state(Gst.State.NULL)
```
Expected output with the sample video and the zone/threshold above
(exact track IDs and anchor coordinates may vary between runs due to
tracker non-determinism):
```text
LOITERING id=26 dwell=5.0s anchor=(147,341)
LOITERING id=27 dwell=5.0s anchor=(122,337)
LOITERING id=29 dwell=5.0s anchor=(90,322)
...
```
Approximately 10–12 loitering events are expected over the full video.
The annotated video is saved to `output_dlstreamer.mp4`, with `gvawatermark`
drawing green bounding boxes and track IDs around every detected person.
> **Known warning:** The `openh264enc` element prints
> `[OpenH264] this = 0x..., Error:CWelsH264SVCEncoder::EncodeFrame(), cmInitParaError.`
> on the first frame. This is a benign initialization message — the output
> video is encoded correctly. The warning comes from the OpenH264 library's
> internal logging and does not indicate a real error.
#### Expected Output

**Device targets:**
- `device=GPU` -- default in the sample code.
- `device=CPU` -- change `device=GPU` to `device=CPU`.
- `device=NPU` -- change `device=GPU` to `device=NPU`; use `batch-size=1` and `nireq=4` for best NPU utilization.
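If you are unsure which devices are present on the host, OpenVINO can enumerate them before you edit the pipeline's `device=` property:
```python
# Enumerate the inference devices visible to OpenVINO on this host.
import openvino as ov

print(ov.Core().available_devices)  # e.g. ['CPU', 'GPU', 'NPU']
```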
---
## License
Copyright (C) Intel Corporation. All rights reserved.
Licensed under the MIT License. See [LICENSE](LICENSE) for details.
## References
- [YOLO26 Documentation](https://docs.ultralytics.com/models/yolo26/)
- [OpenVINO YOLO26 Notebook](https://github.com/openvinotoolkit/openvino_notebooks/blob/latest/notebooks/yolov26-optimization/yolov26-object-detection.ipynb)
- [Intel DLStreamer Object Tracking](https://docs.openedgeplatform.intel.com/2026.0/edge-ai-libraries/dlstreamer/elements/gvatrack.html)
- [OpenVINO Documentation](https://docs.openvino.ai/)
- [NNCF Post-Training Quantization](https://docs.openvino.ai/latest/nncf_ptq_introduction.html)
- [COCO Dataset](https://cocodataset.org/)