Commit e07471b (verified), committed by vagheshpatel · 1 Parent(s): 8aff2a0

Sync loitering-detection from metro-analytics-catalog

Files changed (3)
  1. LICENSE +45 -0
  2. README.md +294 -5
  3. export_and_quantize.sh +53 -0
LICENSE CHANGED
@@ -0,0 +1,45 @@
+ This directory contains two categories of content under different licenses.
+
+
+ Scripts and Documentation
+ -------------------------
+
+ The scripts (export_and_quantize.sh) and documentation (README.md) in this
+ directory are original works by Intel Corporation, licensed under the
+ MIT License.
+
+ Copyright (C) Intel Corporation
+
+ Permission is hereby granted, free of charge, to any person obtaining a copy
+ of this software and associated documentation files (the "Software"), to deal
+ in the Software without restriction, including without limitation the rights
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ copies of the Software, and to permit persons to whom the Software is
+ furnished to do so, subject to the following conditions:
+
+ The above copyright notice and this permission notice shall be included in
+ all copies or substantial portions of the Software.
+
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+ THE SOFTWARE.
+
+
+ YOLO11 Model
+ ------------
+
+ The YOLO11 model weights and the Ultralytics framework are developed by
+ Ultralytics and licensed under the GNU Affero General Public License v3.0
+ (AGPL-3.0).
+
+ Source: https://github.com/ultralytics/ultralytics
+ License: https://github.com/ultralytics/ultralytics/blob/main/LICENSE
+ Docs: https://docs.ultralytics.com/models/yolo11/
+
+ Users must comply with the AGPL-3.0 license terms when using, modifying,
+ or distributing the YOLO11 model weights or Ultralytics software.
+ For commercial licensing options, see https://www.ultralytics.com/license.
README.md CHANGED
@@ -1,5 +1,294 @@
- ---
- license: other
- license_name: other
- license_link: LICENSE
- ---
+ # Loitering Detection -- Zone-Based Dwell Time on Intel Hardware
+
+ > **Reference notebook:** [yolov11-object-detection.ipynb](https://github.com/openvinotoolkit/openvino_notebooks/blob/latest/notebooks/yolov11-optimization/yolov11-object-detection.ipynb)
+ >
+ > **Validated with:** OpenVINO 2026.0.0, NNCF 3.0.0, Ultralytics 8.3.0, Python 3.11+
+
+ | Property | Value |
+ |---|---|
+ | **Category** | Object Detection + Tracking + Zone Analytics |
+ | **Source Framework** | PyTorch (Ultralytics) |
+ | **Supported Precisions** | FP16, FP16-INT8 |
+ | **Inference Engine** | OpenVINO |
+ | **Hardware** | CPU, GPU, NPU |
+ | **Detected Class** | `person` (COCO class 0) |
+
+ ---
+
+ ## Overview
+
+ Loitering Detection is a Metro Analytics use case that flags people who remain inside a configurable region of interest for longer than a dwell-time threshold.
+ It is built on [YOLO11](https://docs.ultralytics.com/models/yolo11/) for person detection, paired with a multi-object tracker that assigns persistent IDs across frames.
+ A polygon zone defines the area to monitor; for each tracked person whose bounding-box anchor falls inside the zone, the application accumulates dwell time and raises a loitering event when the threshold is exceeded.
+
+ Typical Metro deployments include:
+
+ - **Restricted-Area Monitoring** -- raise alerts when a person lingers near tracks, equipment rooms, or after-hours zones.
+ - **Platform Edge Safety** -- detect prolonged presence inside a yellow-line buffer.
+ - **ATM and Ticketing Security** -- identify suspicious dwell at unattended kiosks.
+ - **Crowd-Free Zone Enforcement** -- monitor emergency exits and corridors that must remain clear.
+
+ Available variants: `yolo11n`, `yolo11s`, `yolo11m`, `yolo11l`, `yolo11x`.
+ Smaller variants (`yolo11n`, `yolo11s`) are recommended for high-FPS edge deployment.
+
+ ---
+
+ ## Prerequisites
+
+ - [Install OpenVINO 2026.0.0](https://docs.openvino.ai/2026/get-started/install-openvino.html)
+ - [Install Intel DLStreamer](https://dlstreamer.github.io/get_started/install/install-guide-ubuntu.html)
+
+ ---
+
+ ## Getting Started
+
+ ### Download and Quantize Model
+
+ Run the provided script to download, export to OpenVINO IR (FP16), and quantize to INT8:
+
+ ```bash
+ chmod +x export_and_quantize.sh
+ ./export_and_quantize.sh yolo11n
+ ```
+
+ Replace `yolo11n` with any variant (`yolo11s`, `yolo11m`, `yolo11l`, `yolo11x`).
+
+ The script performs the following steps:
+
+ 1. Installs dependencies (`openvino`, `nncf`, `ultralytics`).
+ 2. Downloads the PyTorch weights and exports to OpenVINO IR with `half=True`.
+ 3. Quantizes the model to INT8 using NNCF post-training quantization.
+ 4. Runs `benchmark_app` to validate throughput.
+
+ Output files:
+
+ - `yolo11n_openvino_model/` -- FP16 OpenVINO IR model directory.
+ - `yolo11n_loitering_int8.xml` / `yolo11n_loitering_int8.bin` -- INT8 quantized model.
+
+ > **Note:** For production accuracy, replace the random calibration tensors in
+ > `export_and_quantize.sh` with a representative sample of frames from the
+ > target deployment site.
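
The note above could be approached roughly as follows. This is a minimal sketch, not the script's actual method: `preprocess_frame` is a hypothetical helper, it assumes decoded frames arrive as HxWx3 `uint8` BGR arrays, and it assumes the exported model expects a 1x3x640x640 `float32` RGB tensor scaled to [0, 1]. It uses a plain nearest-neighbour resize for brevity, whereas a letterbox resize that preserves aspect ratio is likely closer to what the detector sees at inference time.

```python
import numpy as np


def preprocess_frame(frame_bgr: np.ndarray, size: int = 640) -> np.ndarray:
    """Turn an HxWx3 uint8 BGR frame into a 1x3xSIZExSIZE float32 tensor in [0, 1].

    Hypothetical helper: plain nearest-neighbour resize, no letterboxing.
    """
    height, width = frame_bgr.shape[:2]
    # Pick the source row/column for every output pixel (nearest neighbour).
    rows = np.arange(size) * height // size
    cols = np.arange(size) * width // size
    resized = frame_bgr[rows][:, cols]
    # BGR -> RGB, HWC -> CHW, then scale to [0, 1].
    rgb = resized[:, :, ::-1]
    chw = np.transpose(rgb, (2, 0, 1)).astype(np.float32) / 255.0
    return np.ascontiguousarray(chw[np.newaxis, ...])


# Intended use (assumption): pass a list of real site frames instead of
# list(range(300)), with this function as the transform:
#   calibration_dataset = nncf.Dataset(site_frames, preprocess_frame)
```

With this in place, `subset_size` should not exceed the number of frames collected.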
+
+ ### Defining the Region of Interest
+
+ The zone is a list of pixel-space `(x, y)` polygon vertices in clockwise order,
+ expressed in the original input frame coordinates (not the 640x640 model input).
+ A typical platform-edge zone might be:
+
+ ```python
+ ZONE_POLYGON = [(420, 380), (1500, 380), (1500, 540), (420, 540)]
+ LOITERING_SECONDS = 10.0
+ ```
+
+ Per-person dwell time is measured at the bottom-center of the bounding box
+ (the foot anchor), which most closely approximates the person's ground position.
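
To make the foot-anchor test concrete, here is a dependency-free sketch of the same idea. The names `foot_anchor` and `point_in_polygon` are illustrative and do not come from the README, whose own sample relies on `cv2.pointPolygonTest`. This version uses the standard ray-casting rule, so points lying exactly on an edge may be classified differently from the OpenCV call.

```python
from typing import Sequence, Tuple

Point = Tuple[float, float]


def foot_anchor(x: float, y: float, w: float, h: float) -> Point:
    """Bottom-center of a bounding box given its top-left corner and size."""
    return (x + w / 2.0, y + h)


def point_in_polygon(point: Point, polygon: Sequence[Point]) -> bool:
    """Ray-casting test: count edge crossings of a ray going right from point."""
    px, py = point
    inside = False
    count = len(polygon)
    for i in range(count):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % count]
        # Only edges that straddle the horizontal line y = py can be crossed.
        if (y1 > py) != (y2 > py):
            x_cross = x1 + (py - y1) * (x2 - x1) / (y2 - y1)
            if px < x_cross:
                inside = not inside
    return inside


zone = [(420, 380), (1500, 380), (1500, 540), (420, 540)]
standing_in_zone = point_in_polygon(foot_anchor(900, 300, 100, 200), zone)
standing_outside = point_in_polygon(foot_anchor(100, 100, 50, 100), zone)
```

Using the foot anchor rather than the box center means a person whose torso overlaps the zone but whose feet are outside it is not counted.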
+
+ ### DLStreamer Sample
+
+ The sample below runs the YOLO11 detector via `gvadetect`, attaches persistent
+ track IDs with `gvatrack`, and uses the DLStreamer Python bindings
+ (`gstgva.VideoFrame`) to filter `person` regions, test whether each tracked
+ person's foot anchor lies inside the zone polygon, accumulate dwell time per
+ `object_id`, and print a loitering event when the threshold is exceeded.
+
+ > **Notes on running this sample:**
+ >
+ > - Use the FP16 IR (`yolo11n_openvino_model/yolo11n.xml`).
+ > On DLStreamer 2026.0.0, `gvadetect` cannot auto-derive a YOLO post-processor
+ > from the INT8 model produced by the bundled script (the quantize/dequantize
+ > layers shift the output node names away from the names the auto-postproc
+ > expects).
+ > To use the INT8 model, supply a matching `model-proc` JSON.
+ > - `gvadetect` requires `labels-file=` to map class indices to names. The
+ > sample creates a `coco.txt` next to the script.
+ > - Filtering with `object-class=person` directly on `gvadetect` is rejected
+ > when `inference-region` is `full-frame` (the default), so the sample
+ > filters by `region.label()` in the buffer probe instead.
+ > - The DLStreamer Python module is not on `sys.path` by default. Export
+ > `PYTHONPATH` before running:
+ >
+ > ```bash
+ > source /opt/intel/openvino_2026/setupvars.sh
+ > source /opt/intel/dlstreamer/scripts/setup_dls_env.sh
+ > export PYTHONPATH=/opt/intel/dlstreamer/python:\
+ > /opt/intel/dlstreamer/gstreamer/lib/python3/dist-packages:${PYTHONPATH:-}
+ > ```
+
+ Create the COCO labels file once (one class per line, in COCO order):
+
+ ```bash
+ python3 - <<'PY'
+ names = [
+     "person","bicycle","car","motorcycle","airplane","bus","train","truck",
+     "boat","traffic light","fire hydrant","stop sign","parking meter","bench",
+     "bird","cat","dog","horse","sheep","cow","elephant","bear","zebra",
+     "giraffe","backpack","umbrella","handbag","tie","suitcase","frisbee",
+     "skis","snowboard","sports ball","kite","baseball bat","baseball glove",
+     "skateboard","surfboard","tennis racket","bottle","wine glass","cup",
+     "fork","knife","spoon","bowl","banana","apple","sandwich","orange",
+     "broccoli","carrot","hot dog","pizza","donut","cake","chair","couch",
+     "potted plant","bed","dining table","toilet","tv","laptop","mouse",
+     "remote","keyboard","cell phone","microwave","oven","toaster","sink",
+     "refrigerator","book","clock","vase","scissors","teddy bear","hair drier",
+     "toothbrush",
+ ]
+ open("coco.txt", "w").write("\n".join(names))
+ PY
+ ```
+
+ ```python
+ from collections import defaultdict
+
+ import cv2
+ import gi
+ import numpy as np
+
+ gi.require_version("Gst", "1.0")
+ gi.require_version("GstVideo", "1.0")
+ from gi.repository import Gst
+ from gstgva import VideoFrame
+
+ Gst.init(None)
+
+ MODEL_XML = "yolo11n_openvino_model/yolo11n.xml"
+ LABELS_FILE = "coco.txt"
+ INPUT_VIDEO = "test_video.mp4"
+ ZONE_POLYGON = np.array(
+     [(420, 380), (1500, 380), (1500, 540), (420, 540)], dtype=np.int32,
+ )
+ LOITERING_SECONDS = 10.0
+
+ pipeline_str = (
+     f"filesrc location={INPUT_VIDEO} ! decodebin ! videoconvert ! "
+     f"video/x-raw,format=BGR ! "
+     f"gvadetect model={MODEL_XML} labels-file={LABELS_FILE} device=CPU "
+     f"threshold=0.4 ! queue ! "
+     f"gvatrack tracking-type=short-term-imageless ! queue ! "
+     f"gvawatermark ! videoconvert ! autovideosink name=sink sync=false"
+ )
+ pipeline = Gst.parse_launch(pipeline_str)
+
+ dwell_state: dict[int, float] = defaultdict(float)
+ last_seen: dict[int, float] = {}
+ flagged: set[int] = set()
+
+
+ def point_in_zone(x: int, y: int) -> bool:
+     return cv2.pointPolygonTest(ZONE_POLYGON, (float(x), float(y)), False) >= 0
+
+
+ def on_buffer(pad, info):
+     buf = info.get_buffer()
+     caps = pad.get_current_caps()
+     frame = VideoFrame(buf, caps=caps)
+
+     # Use the buffer's presentation timestamp so dwell time tracks the source
+     # video clock and is independent of the sink's `sync` setting.
+     now = buf.pts / Gst.SECOND if buf.pts != Gst.CLOCK_TIME_NONE else 0.0
+     seen_ids: set[int] = set()
+
+     for region in frame.regions():
+         if region.label() != "person":
+             continue
+         object_id = region.object_id()
+         if object_id <= 0:
+             continue
+
+         rect = region.rect()
+         foot_x = int(rect.x + rect.w / 2)
+         foot_y = int(rect.y + rect.h)
+         seen_ids.add(object_id)
+
+         if not point_in_zone(foot_x, foot_y):
+             dwell_state.pop(object_id, None)
+             last_seen.pop(object_id, None)
+             flagged.discard(object_id)
+             continue
+
+         prev = last_seen.get(object_id, now)
+         dwell_state[object_id] += now - prev
+         last_seen[object_id] = now
+
+         if (
+             dwell_state[object_id] >= LOITERING_SECONDS
+             and object_id not in flagged
+         ):
+             flagged.add(object_id)
+             print(
+                 f"LOITERING id={object_id} "
+                 f"dwell={dwell_state[object_id]:.1f}s "
+                 f"anchor=({foot_x},{foot_y})",
+                 flush=True,
+             )
+
+     for stale in list(dwell_state):
+         if stale not in seen_ids:
+             dwell_state.pop(stale, None)
+             last_seen.pop(stale, None)
+             flagged.discard(stale)
+
+     return Gst.PadProbeReturn.OK
+
+
+ sink = pipeline.get_by_name("sink")
+ sink_pad = sink.get_static_pad("sink")
+ sink_pad.add_probe(Gst.PadProbeType.BUFFER, on_buffer)
+
+ pipeline.set_state(Gst.State.PLAYING)
+ bus = pipeline.get_bus()
+ bus.timed_pop_filtered(
+     Gst.CLOCK_TIME_NONE,
+     Gst.MessageType.EOS | Gst.MessageType.ERROR,
+ )
+ pipeline.set_state(Gst.State.NULL)
+ ```
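
The dwell bookkeeping in the probe can also be exercised without GStreamer. The sketch below restates the same rules (accumulate time between consecutive sightings inside the zone, reset when the ID leaves the zone or disappears, report each ID once) in a small class. `DwellTracker` and its method names are hypothetical and are not part of the sample or of DLStreamer.

```python
from typing import Dict, Iterable, List, Set, Tuple


class DwellTracker:
    """Accumulates per-ID dwell time from timestamps (hypothetical helper)."""

    def __init__(self, threshold_s: float) -> None:
        self.threshold_s = threshold_s
        self.dwell: Dict[int, float] = {}
        self.last_seen: Dict[int, float] = {}
        self.flagged: Set[int] = set()

    def _forget(self, object_id: int) -> None:
        self.dwell.pop(object_id, None)
        self.last_seen.pop(object_id, None)
        self.flagged.discard(object_id)

    def update(self, now_s: float, ids_in_zone: Iterable[int]) -> List[Tuple[int, float]]:
        """Feed one frame; returns (object_id, dwell_seconds) for new events."""
        present = set(ids_in_zone)
        # IDs that left the zone or were not detected lose their history.
        for gone in [i for i in self.dwell if i not in present]:
            self._forget(gone)

        events: List[Tuple[int, float]] = []
        for object_id in sorted(present):
            prev = self.last_seen.get(object_id, now_s)
            self.dwell[object_id] = self.dwell.get(object_id, 0.0) + (now_s - prev)
            self.last_seen[object_id] = now_s
            if self.dwell[object_id] >= self.threshold_s and object_id not in self.flagged:
                self.flagged.add(object_id)
                events.append((object_id, self.dwell[object_id]))
        return events
```

In the probe, `now_s` would come from the buffer PTS and `ids_in_zone` from the regions whose foot anchor passes the zone test. Because history is dropped as soon as an ID is missing from a frame, a single missed detection restarts the count; whether to tolerate short gaps is a deployment decision the sample leaves open.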
+
+ To run on integrated GPU, change `device=CPU` to `device=GPU` and use
+ `vapostproc` after `decodebin` for zero-copy color conversion.
+
+ ### Try It on a Sample Video
+
+ Download a publicly hosted Intel sample clip that shows people walking through a scene:
+
+ ```bash
+ wget -O test_video.mp4 \
+ https://github.com/intel-iot-devkit/sample-videos/raw/master/people-detection.mp4
+ ```
+
+ The clip is 768x432 at 12 fps and shows people walking briskly through the field of view rather than truly loitering, so use a small zone in the busy part of the frame and a short dwell threshold for a meaningful demo:
+
+ ```python
+ ZONE_POLYGON = np.array(
+     [(220, 180), (560, 180), (560, 360), (220, 360)], dtype=np.int32,
+ )
+ LOITERING_SECONDS = 1.5
+ ```
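
Because zones are expressed in source-frame pixels, a polygon drawn for one camera resolution does not carry over to a clip of another size. One option is to rescale it, as in the sketch below. `scale_zone` is a hypothetical helper, and the 1920x1080 source size used in the example is an assumption for illustration only; the README does not state the resolution the default zone was drawn for.

```python
from typing import List, Sequence, Tuple


def scale_zone(
    polygon: Sequence[Tuple[int, int]],
    src_size: Tuple[int, int],
    dst_size: Tuple[int, int],
) -> List[Tuple[int, int]]:
    """Rescale polygon vertices from src (width, height) to dst (width, height)."""
    src_w, src_h = src_size
    dst_w, dst_h = dst_size
    return [
        (round(x * dst_w / src_w), round(y * dst_h / src_h))
        for x, y in polygon
    ]


# Assumed 1920x1080 authoring resolution, rescaled to the 768x432 sample clip.
default_zone = [(420, 380), (1500, 380), (1500, 540), (420, 540)]
clip_zone = scale_zone(default_zone, (1920, 1080), (768, 432))
```

A rescaled zone keeps the same relative position in the frame, which is only meaningful when both cameras share the same framing; for a different scene, such as this sample clip, a hand-picked zone like the one above is the better choice.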
+
+ Run the DLStreamer sample above.
+ A window opened by `autovideosink` shows each frame with `gvawatermark` bounding boxes and persistent track IDs assigned by `gvatrack`.
+ With the threshold above, the buffer probe prints two events on this clip, for example:
+
+ ```text
+ LOITERING id=2 dwell=1.6s anchor=(529,258)
+ LOITERING id=9 dwell=1.6s anchor=(527,250)
+ ```
+
+ Increasing `LOITERING_SECONDS` back to its operational default (around 10 s) suppresses the events on this short walking clip; reproduce a real loitering scenario with a stationary subject in your own footage.
+
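+ To capture the annotated output instead of viewing it live, replace `autovideosink` with an encoder branch such as `x264enc ! mp4mux ! filesink location=loitering_output.mp4`. Because the sample looks up its probe pad through the element named `sink`, attach the buffer probe to an element upstream of the encoder when making this change (for example, name the `gvawatermark` element and probe its `src` pad), so that the probe still sees raw frames with detection metadata.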
+
+ ---
+
+ ## License
+
+ Copyright (C) Intel Corporation. All rights reserved.
+ Licensed under the MIT License. See [LICENSE](LICENSE) for details.
+
+ ## References
+
+ - [YOLO11 Documentation](https://docs.ultralytics.com/models/yolo11/)
+ - [OpenVINO YOLO11 Notebook](https://github.com/openvinotoolkit/openvino_notebooks/blob/latest/notebooks/yolov11-optimization/yolov11-object-detection.ipynb)
+ - [Intel DLStreamer Object Tracking](https://dlstreamer.github.io/elements/gvatrack.html)
+ - [OpenVINO Documentation](https://docs.openvino.ai/)
+ - [NNCF Post-Training Quantization](https://docs.openvino.ai/latest/nncf_ptq_introduction.html)
+ - [COCO Dataset](https://cocodataset.org/)
export_and_quantize.sh ADDED
@@ -0,0 +1,53 @@
+ #!/usr/bin/env bash
+ # SPDX-License-Identifier: MIT
+ # Copyright (C) Intel Corporation
+ #
+ # Export a YOLO11 person detector for loitering detection and quantize to INT8.
+ # Usage: ./export_and_quantize.sh [MODEL_VARIANT]
+ # Example: ./export_and_quantize.sh yolo11n
+
+ set -euo pipefail
+
+ MODEL_NAME="${1:-yolo11n}"
+
+ echo "--- Installing dependencies ---"
+ pip install -qU "openvino>=2026.0.0" "nncf>=3.0.0" ultralytics
+
+ echo "--- Exporting ${MODEL_NAME} to OpenVINO IR (FP16) ---"
+ python3 -c "
+ from ultralytics import YOLO
+
+ model = YOLO('${MODEL_NAME}.pt')
+ model.export(format='openvino', half=True, dynamic=False, imgsz=640)
+ print('Export complete: ${MODEL_NAME}_openvino_model/')
+ "
+
+ echo "--- Quantizing to INT8 with NNCF ---"
+ python3 -c "
+ import nncf
+ import openvino as ov
+ import numpy as np
+
+ core = ov.Core()
+ model = core.read_model('${MODEL_NAME}_openvino_model/${MODEL_NAME}.xml')
+
+ def transform_fn(data_item):
+     return np.random.rand(1, 3, 640, 640).astype(np.float32)
+
+ calibration_dataset = nncf.Dataset(list(range(300)), transform_fn)
+
+ quantized = nncf.quantize(
+     model,
+     calibration_dataset,
+     preset=nncf.QuantizationPreset.MIXED,
+     subset_size=300,
+ )
+
+ ov.save_model(quantized, '${MODEL_NAME}_loitering_int8.xml')
+ print('Quantization complete: ${MODEL_NAME}_loitering_int8.xml')
+ "
+
+ echo "--- Benchmarking ---"
+ benchmark_app -m "${MODEL_NAME}_loitering_int8.xml" -d CPU -niter 50 -api async
+
+ echo "--- Done ---"