---
license: other
license_name: intel-custom
license_link: LICENSE
library_name: openvino
pipeline_tag: object-detection
tags:
  - openvino
  - intel
  - yolo
  - yolo26
  - loitering-detection
  - zone-analytics
  - tracking
  - edge-ai
  - metro
  - dlstreamer
datasets:
  - detection-datasets/coco
language:
  - en
---

# Loitering Detection

| Property | Value |
|---|---|
| **Category** | Object Detection + Tracking + Zone Analytics |
| **Source Framework** | PyTorch (Ultralytics) |
| **Supported Precisions** | FP32, FP16, INT8 (mixed-precision) |
| **Inference Engine** | OpenVINO |
| **Hardware** | CPU, GPU, NPU |
| **Detected Class** | `person` (COCO class 0) |

---

## Overview

Loitering Detection is a Metro Analytics use case that flags people who remain inside a configurable region of interest for longer than a dwell-time threshold.
It is built on [YOLO26](https://docs.ultralytics.com/models/yolo26/) for person detection, paired with a multi-object tracker that assigns persistent IDs across frames.
A polygon zone defines the area to monitor; for each tracked person whose bounding-box anchor falls inside the zone, the application accumulates dwell time and raises a loitering event when the threshold is exceeded.
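
Conceptually, the per-track logic reduces to a few lines. The following is a minimal sketch of the rule, independent of the DLStreamer plumbing shown later; the zone and threshold values are illustrative:

```python
ZONE = (400, 200, 1100, 650)  # x_min, y_min, x_max, y_max (illustrative)
THRESHOLD_S = 5.0             # dwell threshold in seconds (illustrative)

dwell: dict[int, float] = {}  # track_id -> accumulated seconds inside the zone


def update_track(track_id: int, anchor_x: float, anchor_y: float, dt: float) -> bool:
    """Accumulate dwell for one tracked person; True signals a loitering event."""
    x_min, y_min, x_max, y_max = ZONE
    if x_min <= anchor_x <= x_max and y_min <= anchor_y <= y_max:
        dwell[track_id] = dwell.get(track_id, 0.0) + dt
    else:
        dwell.pop(track_id, None)  # left the zone: reset this track's dwell
    return dwell.get(track_id, 0.0) >= THRESHOLD_S
```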

Typical Metro deployments include:

- **Restricted-Area Monitoring** -- raise alerts when a person lingers near tracks, equipment rooms, or after-hours zones.
- **Platform Edge Safety** -- detect prolonged presence inside a yellow-line buffer.
- **ATM and Ticketing Security** -- identify suspicious dwell at unattended kiosks.
- **Crowd-Free Zone Enforcement** -- monitor emergency exits and corridors that must remain clear.

Available variants: `yolo26n`, `yolo26s`, `yolo26m`, `yolo26l`, `yolo26x`.
Smaller variants (`yolo26n`, `yolo26s`) are recommended for high-FPS edge deployment.

---

## Prerequisites

- Python 3.11+
- [Install Intel DLStreamer](https://docs.openedgeplatform.intel.com/2026.0/edge-ai-libraries/dlstreamer/get_started/install/install_guide_ubuntu.html)

Create and activate a Python virtual environment before running the scripts:

```bash
python3 -m venv .venv --system-site-packages
source .venv/bin/activate
```

> **Note:** The `--system-site-packages` flag is required so the virtual
> environment can access the system-installed OpenVINO and DLStreamer Python
> packages.
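
A quick way to confirm the environment resolves the system-installed packages (assuming OpenVINO is installed system-wide, per the prerequisites):

```python
# Run inside the activated venv; fails if --system-site-packages was omitted.
import openvino as ov

print(ov.__version__)
```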

---

## Getting Started

### Download and Quantize Model

Run the provided script to download, export to OpenVINO IR, and optionally quantize:

```bash
chmod +x export_and_quantize.sh
./export_and_quantize.sh
```

This exports the default **yolo26n** model in **FP16** precision.

#### Optional: Select a Different Variant or Precision

```bash
./export_and_quantize.sh yolo26n FP32   # full-precision
./export_and_quantize.sh yolo26n INT8   # quantized
./export_and_quantize.sh yolo26s        # larger variant, default FP16
```

Replace `yolo26n` with any variant (`yolo26s`, `yolo26m`, `yolo26l`, `yolo26x`).
The second argument selects the precision (`FP32`, `FP16`, `INT8`); the default is **FP16**.

The script performs the following steps:

1. Installs dependencies (`openvino`, `ultralytics`; adds `nncf` for INT8).
2. Downloads the sample surveillance video (`VIRAT_S_000101.mp4`) from the Intel Metro AI Suite project into the current directory.
3. Downloads the PyTorch weights and exports to OpenVINO IR.
4. *(INT8 only)* Quantizes the model using NNCF post-training quantization (a sketch of this step follows the list).
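
Step 4 maps onto NNCF's post-training quantization API. A minimal sketch of what the script does, assuming the IR was already exported; the random calibration tensors below are placeholders for real preprocessed video frames:

```python
import numpy as np
import nncf
import openvino as ov

core = ov.Core()
model = core.read_model("yolo26n_openvino_model/yolo26n.xml")

# Placeholder calibration set; the script extracts frames from the sample
# video and preprocesses them to the model's 640x640 input instead.
frames = [np.random.rand(1, 3, 640, 640).astype(np.float32) for _ in range(32)]
quantized = nncf.quantize(model, nncf.Dataset(frames))
ov.save_model(quantized, "yolo26n_loitering_int8.xml")
```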

Output files (see the load check after this list):

- `yolo26n_openvino_model/` -- FP32 or FP16 OpenVINO IR model directory.
- `yolo26n_loitering_int8.xml` / `yolo26n_loitering_int8.bin` -- INT8 quantized model *(only when `INT8` is selected)*.
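
To confirm the exported IR loads and compiles before wiring it into a pipeline (paths as listed above):

```python
import openvino as ov

core = ov.Core()
model = core.read_model("yolo26n_openvino_model/yolo26n.xml")
compiled = core.compile_model(model, "CPU")
print(compiled.input(0).shape)  # typically [1,3,640,640] for these exports
```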

#### Precision / Device Compatibility

| Precision | CPU | GPU | NPU |
|---|---|---|---|
| FP32 | Yes | Yes | No |
| FP16 | Yes | Yes | Yes |
| INT8 | Yes | Yes | Yes |

> **Note:** The INT8 calibration uses frames from the bundled sample video.
> For production accuracy, replace it with a representative set of frames from
> the target deployment site.
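
To check which of these devices OpenVINO actually sees on the target machine:

```python
import openvino as ov

print(ov.Core().available_devices)  # e.g. ['CPU', 'GPU', 'NPU']
```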

### Defining the Region of Interest

The zone is a rectangular ROI expressed as `x_min,y_min,x_max,y_max` in the
original input frame coordinates (not the 640x640 model input).
DLStreamer's `gvaattachroi` element attaches the ROI to every buffer, and
`gvadetect inference-region=1` (the `roi-list` mode) restricts inference to that
ROI, so no Python polygon math is required.
A typical surveillance-zone configuration on a 1280x720 source might be:

```text
roi=400,200,1100,650          # ROI for gvaattachroi (x_min,y_min,x_max,y_max)
LOITERING_SECONDS = 5.0       # dwell threshold, in seconds (demo value)
```

> **Note:** The sample uses a 5-second threshold so that loitering events are
> triggered quickly on the short demo video.  For production deployments,
> increase this to 10--30 seconds depending on the site's operational
> requirements.

Per-person dwell time is measured at the bottom-center of the bounding box
(the foot anchor), which most closely approximates the person's ground position.
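
As a small sketch, the anchor is derived from a detection rectangle exactly as in the sample below:

```python
def foot_anchor(x: float, y: float, w: float, h: float) -> tuple[float, float]:
    # Bottom-center of the box: a single-point proxy for ground position.
    return x + w / 2, y + h


print(foot_anchor(100, 150, 60, 180))  # -> (130.0, 330.0)
```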

### DLStreamer Sample

The DLStreamer Python module is not on `sys.path` by default, so export `PYTHONPATH` before running:

```bash
source /opt/intel/openvino_2026/setupvars.sh
source /opt/intel/dlstreamer/scripts/setup_dls_env.sh
export PYTHONPATH=/opt/intel/dlstreamer/python:\
/opt/intel/dlstreamer/gstreamer/lib/python3/dist-packages:${PYTHONPATH:-}
```
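
With the environment exported, a one-off import check (the same sequence the sample below uses) verifies the bindings are reachable:

```python
import gi

gi.require_version("Gst", "1.0")
from gi.repository import Gst
from gstgva import VideoFrame  # fails if PYTHONPATH is not set as above

Gst.init(None)
print("DLStreamer Python bindings OK")
```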

**Video-based loitering detection** (requires video for dwell-time tracking):

```python
from collections import defaultdict

import gi

gi.require_version("Gst", "1.0")
gi.require_version("GstVideo", "1.0")
from gi.repository import Gst
from gstgva import VideoFrame

Gst.init(None)

MODEL_XML = "yolo26n_openvino_model/yolo26n.xml"
INPUT_VIDEO = "VIRAT_S_000101.mp4"
ROI = "0,200,300,400"  # x_min,y_min,x_max,y_max
LOITERING_SECONDS = 5.0

pipeline_str = (
    f"filesrc location={INPUT_VIDEO} ! decodebin3 ! "
    f"videoconvert ! "
    f"gvaattachroi roi={ROI} ! "
    f"gvadetect inference-region=1 model={MODEL_XML} device=GPU "
    f"threshold=0.5 ! queue ! "
    f"gvatrack tracking-type=short-term-imageless ! queue ! "
    f"gvametaconvert add-empty-results=true ! queue ! "
    f"gvafpscounter ! "
    f"gvawatermark ! videoconvert ! video/x-raw,format=I420 ! "
    f"openh264enc ! h264parse ! "
    f"mp4mux ! filesink name=sink location=output_dlstreamer.mp4"
)
pipeline = Gst.parse_launch(pipeline_str)

STALE_TIMEOUT = 2.0  # seconds of absence before clearing dwell state
dwell_state: dict[int, float] = defaultdict(float)
last_seen: dict[int, float] = {}
flagged: set[int] = set()


def on_buffer(pad, info):
    buf = info.get_buffer()
    caps = pad.get_current_caps()
    frame = VideoFrame(buf, caps=caps)

    now = buf.pts / Gst.SECOND if buf.pts != Gst.CLOCK_TIME_NONE else 0.0
    seen_ids: set[int] = set()

    for region in frame.regions():
        # gvaattachroi attaches a frame-level ROI region; skip it.
        if region.label() != "person":
            continue
        object_id = region.object_id()
        if object_id <= 0:
            continue

        rect = region.rect()
        foot_x = int(rect.x + rect.w / 2)
        foot_y = int(rect.y + rect.h)
        seen_ids.add(object_id)

        # gvadetect inference-region=1 already constrains detections to the
        # gvaattachroi zone, so every tracked person here is "in zone".
        prev = last_seen.get(object_id, now)
        dwell_state[object_id] += now - prev
        last_seen[object_id] = now

        if (
            dwell_state[object_id] >= LOITERING_SECONDS
            and object_id not in flagged
        ):
            flagged.add(object_id)
            print(
                f"LOITERING id={object_id} "
                f"dwell={dwell_state[object_id]:.1f}s "
                f"anchor=({foot_x},{foot_y})",
                flush=True,
            )

    # Clean up stale tracks after STALE_TIMEOUT seconds of absence.
    # Keep flagged entries to prevent duplicate alerts when a person
    # briefly disappears (occlusion / tracker jitter) and reappears.
    for stale in list(dwell_state):
        if stale not in seen_ids:
            elapsed_since = now - last_seen.get(stale, now)
            if elapsed_since > STALE_TIMEOUT:
                dwell_state.pop(stale, None)
                last_seen.pop(stale, None)

    return Gst.PadProbeReturn.OK


sink = pipeline.get_by_name("sink")
sink_pad = sink.get_static_pad("sink")
sink_pad.add_probe(Gst.PadProbeType.BUFFER, on_buffer)

pipeline.set_state(Gst.State.PLAYING)
bus = pipeline.get_bus()
bus.timed_pop_filtered(
    Gst.CLOCK_TIME_NONE,
    Gst.MessageType.EOS | Gst.MessageType.ERROR,
)
pipeline.set_state(Gst.State.NULL)
```

Expected output with the sample video and the zone/threshold above
(exact track IDs and anchor coordinates may vary between runs due to
tracker non-determinism):

```text
LOITERING id=26 dwell=5.0s anchor=(147,341)
LOITERING id=27 dwell=5.0s anchor=(122,337)
LOITERING id=29 dwell=5.0s anchor=(90,322)
...
```

Approximately 10–12 loitering events are expected over the full video.

The annotated video is saved to `output_dlstreamer.mp4`; `gvawatermark` draws green
bounding boxes and track IDs around every detected person.

> **Known warning:** The `openh264enc` element prints
> `[OpenH264] this = 0x..., Error:CWelsH264SVCEncoder::EncodeFrame(), cmInitParaError.`
> on the first frame. This is a benign initialization message -- the output
> video is encoded correctly. The warning comes from the OpenH264 library's
> internal logging and does not indicate a real error.

#### Expected Output

![DLStreamer expected output](expected_output_dlstreamer.gif)

**Device targets:**

- `device=GPU` -- default in the sample code.
- `device=CPU` -- change `device=GPU` to `device=CPU`.
- `device=NPU` -- change `device=GPU` to `device=NPU`; use `batch-size=1` and `nireq=4` for best NPU utilization (see the sketch after this list).
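
A small sketch of parameterizing the device when building the sample's pipeline string; the NPU properties follow the note above, and `MODEL_XML` is the path used in the sample:

```python
MODEL_XML = "yolo26n_openvino_model/yolo26n.xml"
DEVICE = "NPU"  # "CPU", "GPU", or "NPU"

# gvadetect accepts batch-size and nireq; values here follow the NPU note.
npu_opts = "batch-size=1 nireq=4 " if DEVICE == "NPU" else ""
gvadetect = (
    f"gvadetect inference-region=1 model={MODEL_XML} "
    f"device={DEVICE} {npu_opts}threshold=0.5"
)
print(gvadetect)
```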

---

## License

Copyright (C) Intel Corporation. All rights reserved.
Licensed under the MIT License. See [LICENSE](LICENSE) for details.

## References

- [YOLO26 Documentation](https://docs.ultralytics.com/models/yolo26/)
- [OpenVINO YOLO26 Notebook](https://github.com/openvinotoolkit/openvino_notebooks/blob/latest/notebooks/yolov26-optimization/yolov26-object-detection.ipynb)
- [Intel DLStreamer Object Tracking](https://docs.openedgeplatform.intel.com/2026.0/edge-ai-libraries/dlstreamer/elements/gvatrack.html)
- [OpenVINO Documentation](https://docs.openvino.ai/)
- [NNCF Post-Training Quantization](https://docs.openvino.ai/latest/nncf_ptq_introduction.html)
- [COCO Dataset](https://cocodataset.org/)