Commit 6ad459b (verified) by vagheshpatel
1 parent: 448b0d9

Sync crowd-detection from metro-analytics-catalog

Files changed (3)
  1. LICENSE +45 -0
  2. README.md +250 -5
  3. export_and_quantize.sh +53 -0
LICENSE CHANGED
@@ -0,0 +1,45 @@
+ This directory contains two categories of content under different licenses.
+
+
+ Scripts and Documentation
+ -------------------------
+
+ The scripts (export_and_quantize.sh) and documentation (README.md) in this
+ directory are original works by Intel Corporation, licensed under the
+ MIT License.
+
+ Copyright (C) Intel Corporation
+
+ Permission is hereby granted, free of charge, to any person obtaining a copy
+ of this software and associated documentation files (the "Software"), to deal
+ in the Software without restriction, including without limitation the rights
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ copies of the Software, and to permit persons to whom the Software is
+ furnished to do so, subject to the following conditions:
+
+ The above copyright notice and this permission notice shall be included in
+ all copies or substantial portions of the Software.
+
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+ THE SOFTWARE.
+
+
+ YOLO11 Model
+ ------------
+
+ The YOLO11 model weights and the Ultralytics framework are developed by
+ Ultralytics and licensed under the GNU Affero General Public License v3.0
+ (AGPL-3.0).
+
+ Source: https://github.com/ultralytics/ultralytics
+ License: https://github.com/ultralytics/ultralytics/blob/main/LICENSE
+ Docs: https://docs.ultralytics.com/models/yolo11/
+
+ Users must comply with the AGPL-3.0 license terms when using, modifying,
+ or distributing the YOLO11 model weights or Ultralytics software.
+ For commercial licensing options, see https://www.ultralytics.com/license.
README.md CHANGED
@@ -1,5 +1,250 @@
- ---
- license: other
- license_name: other
- license_link: LICENSE
- ---
+ # Crowd Detection -- Person Counting on Intel Hardware
+
+ > **Reference notebook:** [yolov11-object-detection.ipynb](https://github.com/openvinotoolkit/openvino_notebooks/blob/latest/notebooks/yolov11-optimization/yolov11-object-detection.ipynb)
+ >
+ > **Validated with:** OpenVINO 2026.0.0, NNCF 3.0.0, Ultralytics 8.3.0, Python 3.11+
+
+ | Property | Value |
+ |---|---|
+ | **Category** | Object Detection (Crowd / Person Counting) |
+ | **Source Framework** | PyTorch (Ultralytics) |
+ | **Supported Precisions** | FP16, FP16-INT8 |
+ | **Inference Engine** | OpenVINO |
+ | **Hardware** | CPU, GPU, NPU |
+ | **Detected Class** | `person` (COCO class 0) |
+
+ ---
+
+ ## Overview
+
+ Crowd Detection is a Metro Analytics use case that detects and counts people in video streams to estimate occupancy and identify crowd build-up.
+ It is built on [YOLO11](https://docs.ultralytics.com/models/yolo11/), a real-time object detector trained on the COCO dataset, filtered at runtime to the `person` class.
+ Typical Metro deployments include:
+
+ - **Platform Occupancy** -- count waiting passengers on station platforms.
+ - **Entry / Exit Flow** -- monitor pedestrian throughput at gates and turnstiles.
+ - **Crowd Build-up Alerts** -- trigger notifications when person counts cross a threshold.
+ - **Public Safety Analytics** -- support situational awareness in transit hubs and venues.
+
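The crowd build-up alert in the list above can be sketched as a threshold check with hysteresis, so that a count hovering around the limit does not toggle the alert on every frame. This is a minimal sketch and not part of the catalog: the `CrowdAlert` class and its `high` / `low` thresholds are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class CrowdAlert:
    """Hypothetical helper: raises an alert at `high`, clears it at `low`."""

    high: int  # count at or above which the alert turns on
    low: int   # count at or below which the alert turns off
    active: bool = False

    def update(self, count: int) -> bool:
        """Feed one per-frame person count; return True if the state changed."""
        previous = self.active
        if not self.active and count >= self.high:
            self.active = True
        elif self.active and count <= self.low:
            self.active = False
        return self.active != previous


if __name__ == "__main__":
    alert = CrowdAlert(high=20, low=15)
    for count in [12, 19, 21, 18, 22, 14]:
        if alert.update(count):
            state = "RAISED" if alert.active else "CLEARED"
            print(f"count={count}: alert {state}")
```

Keeping `low` below `high` is the design choice that matters here: with a single threshold, detector jitter of one or two people would generate a stream of notifications.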
+ Available variants: `yolo11n`, `yolo11s`, `yolo11m`, `yolo11l`, `yolo11x`.
+ Smaller variants (`yolo11n`, `yolo11s`) are recommended for high-FPS edge deployment; larger variants improve recall in dense crowds.
+
+ ---
+
+ ## Prerequisites
+
+ - [Install OpenVINO 2026.0.0](https://docs.openvino.ai/2026/get-started/install-openvino.html)
+ - [Install Intel DLStreamer](https://dlstreamer.github.io/get_started/install/install-guide-ubuntu.html)
+
+ ---
+
+ ## Getting Started
+
+ ### Download and Quantize Model
+
+ Run the provided script to download, export to OpenVINO IR (FP16), and quantize to INT8:
+
+ ```bash
+ chmod +x export_and_quantize.sh
+ ./export_and_quantize.sh yolo11n
+ ```
+
+ Replace `yolo11n` with any variant (`yolo11s`, `yolo11m`, `yolo11l`, `yolo11x`).
+
+ The script performs the following steps:
+
+ 1. Installs dependencies (`openvino`, `nncf`, `ultralytics`).
+ 2. Downloads the PyTorch weights and exports to OpenVINO IR with `half=True`.
+ 3. Quantizes the model to INT8 using NNCF post-training quantization.
+ 4. Runs `benchmark_app` to validate throughput.
+
+ Output files:
+
+ - `yolo11n_openvino_model/` -- FP16 OpenVINO IR model directory.
+ - `yolo11n_crowd_int8.xml` / `yolo11n_crowd_int8.bin` -- INT8 quantized model.
+
+ > **Note:** For production accuracy, replace the random calibration tensors in
+ > `export_and_quantize.sh` with a representative sample of frames from the
+ > target deployment site.
+
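One way to follow the note above is to feed NNCF a list of image paths instead of random tensors. The sketch below is an assumption about how that could look, not the script's actual code: the directory name `calibration_frames` and the helper `list_calibration_frames` are hypothetical, and the preprocessing simply mirrors the plain-resize steps used in the OpenVINO sample in this README.

```python
from pathlib import Path

IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}


def list_calibration_frames(frame_dir, limit=300):
    """Return up to `limit` image paths from `frame_dir`, sorted by name."""
    paths = sorted(
        p for p in Path(frame_dir).iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_EXTENSIONS
    )
    return paths[:limit]


def transform_fn(path, input_size=640):
    """Load one frame and preprocess it the same way as the inference sample."""
    # Imported lazily so the listing helper works without OpenCV installed.
    import cv2
    import numpy as np

    image = cv2.imread(str(path))
    if image is None:
        raise ValueError(f"Could not read calibration frame: {path}")
    blob = cv2.resize(image, (input_size, input_size))
    blob = cv2.cvtColor(blob, cv2.COLOR_BGR2RGB).astype(np.float32) / 255.0
    return blob.transpose(2, 0, 1)[np.newaxis, ...]  # NCHW, shape (1, 3, H, W)


# Assumed usage inside the quantization step (directory name is hypothetical):
#   frames = list_calibration_frames("calibration_frames")
#   calibration_dataset = nncf.Dataset(frames, transform_fn)
#   quantized = nncf.quantize(model, calibration_dataset, subset_size=len(frames))
```

Frames should cover the lighting, camera angles, and crowd densities expected on site, since the quantizer derives activation ranges from whatever it is shown.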
+ ### OpenVINO Sample
+
+ The sample below runs YOLO11 inference, filters to the `person` class, applies
+ non-maximum suppression, and reports the crowd count for a single image.
+
+ ```python
+ import cv2
+ import numpy as np
+ import openvino as ov
+
+ PERSON_CLASS_ID = 0
+ CONF_THRESHOLD = 0.4
+ IOU_THRESHOLD = 0.5
+ INPUT_SIZE = 640
+
+ core = ov.Core()
+ model = core.read_model("yolo11n_crowd_int8.xml")
+ compiled = core.compile_model(model, "CPU")  # or "GPU", "NPU"
+
+ image = cv2.imread("test.jpg")
+ h0, w0 = image.shape[:2]
+
+ # Preprocess: plain resize without letterboxing (simple, but it distorts the aspect ratio).
+ blob = cv2.resize(image, (INPUT_SIZE, INPUT_SIZE))
+ blob = cv2.cvtColor(blob, cv2.COLOR_BGR2RGB).astype(np.float32) / 255.0
+ blob = blob.transpose(2, 0, 1)[np.newaxis, ...]  # NCHW
+
+ # Infer. YOLO11 raw output shape: [1, 84, 8400] (xywh + 80 class scores).
+ output = compiled([blob])[compiled.output(0)]
+ preds = output[0].T  # [8400, 84]
+
+ boxes_xywh = preds[:, :4]
+ class_scores = preds[:, 4:]
+ class_ids = class_scores.argmax(axis=1)
+ confidences = class_scores.max(axis=1)
+
+ mask = (class_ids == PERSON_CLASS_ID) & (confidences >= CONF_THRESHOLD)
+ boxes_xywh = boxes_xywh[mask]
+ confidences = confidences[mask]
+
+ # Convert xywh (center) to xyxy in original image coordinates.
+ sx, sy = w0 / INPUT_SIZE, h0 / INPUT_SIZE
+ xyxy = np.empty_like(boxes_xywh)
+ xyxy[:, 0] = (boxes_xywh[:, 0] - boxes_xywh[:, 2] / 2) * sx
+ xyxy[:, 1] = (boxes_xywh[:, 1] - boxes_xywh[:, 3] / 2) * sy
+ xyxy[:, 2] = (boxes_xywh[:, 0] + boxes_xywh[:, 2] / 2) * sx
+ xyxy[:, 3] = (boxes_xywh[:, 1] + boxes_xywh[:, 3] / 2) * sy
+
+ # Apply NMS to deduplicate overlapping detections (NMSBoxes expects x, y, w, h).
+ keep = cv2.dnn.NMSBoxes(
+     bboxes=[[float(x1), float(y1), float(x2 - x1), float(y2 - y1)]
+             for x1, y1, x2, y2 in xyxy],
+     scores=confidences.tolist(),
+     score_threshold=CONF_THRESHOLD,
+     nms_threshold=IOU_THRESHOLD,
+ )
+
+ crowd_count = len(keep)
+ print(f"Detected persons: {crowd_count}")
+
+ for i in np.array(keep).flatten():
+     x1, y1, x2, y2 = xyxy[i].astype(int)
+     cv2.rectangle(image, (x1, y1), (x2, y2), (0, 255, 0), 2)
+
+ cv2.putText(
+     image, f"Crowd count: {crowd_count}", (10, 30),
+     cv2.FONT_HERSHEY_SIMPLEX, 1.0, (0, 255, 0), 2,
+ )
+ cv2.imwrite("crowd_output.jpg", image)
+ ```
+
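The sample above resizes the frame directly to 640x640, which stretches non-square images. If letterboxing (aspect-preserving resize plus padding) were used instead, box coordinates would have to be mapped back through the scale and padding. The following is a minimal sketch of that arithmetic only, assuming symmetric padding; `letterbox_params` and `unletterbox_box` are hypothetical helper names, not functions from OpenVINO or Ultralytics.

```python
def letterbox_params(src_w, src_h, dst=640):
    """Return (scale, pad_x, pad_y) for fitting src into a dst x dst square."""
    scale = min(dst / src_w, dst / src_h)
    new_w, new_h = round(src_w * scale), round(src_h * scale)
    pad_x = (dst - new_w) / 2  # horizontal padding on each side
    pad_y = (dst - new_h) / 2  # vertical padding on each side
    return scale, pad_x, pad_y


def unletterbox_box(cx, cy, w, h, scale, pad_x, pad_y):
    """Map a center-format box from model space back to original-image xyxy."""
    x1 = (cx - w / 2 - pad_x) / scale
    y1 = (cy - h / 2 - pad_y) / scale
    x2 = (cx + w / 2 - pad_x) / scale
    y2 = (cy + h / 2 - pad_y) / scale
    return x1, y1, x2, y2


if __name__ == "__main__":
    scale, pad_x, pad_y = letterbox_params(1280, 720)
    print(scale, pad_x, pad_y)
    print(unletterbox_box(320, 320, 100, 200, scale, pad_x, pad_y))
```

With a plain resize the two axes get independent scale factors (`sx`, `sy` in the sample); with letterboxing there is one shared scale and an offset, which keeps people from appearing wider or narrower than they were during training.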
+ ### Try It on a Sample Image
+
+ Download a public sample image that contains several people:
+
+ ```bash
+ wget -O test.jpg https://ultralytics.com/images/bus.jpg
+ ```
+
+ Re-run the OpenVINO sample above.
+ The script reads `test.jpg`, prints the crowd count to the console, and writes the annotated frame to `crowd_output.jpg`.
+
+ Expected console output (when running against the INT8 model produced by the script with the default random calibration tensors):
+
+ ```text
+ Detected persons: 3
+ ```
+
+ `crowd_output.jpg` is the same image with a green bounding box drawn around each detected person and the text `Crowd count: 3` overlaid in the top-left corner.
+
+ The reference image actually contains four people; the FP16 IR (`yolo11n_openvino_model/yolo11n.xml`) detects all four.
+ The INT8 model produced with random calibration data in `export_and_quantize.sh` typically detects three.
+ Replace the random calibration tensors with representative frames from your deployment site to recover the missing detection.
+
+ ### DLStreamer Sample
+
+ The pipeline below runs the FP16 YOLO11 detector on a video file via
+ `gvadetect`, filters detections to the `person` class in a buffer probe using
+ the DLStreamer Python bindings (`gstgva.VideoFrame`), overlays bounding boxes,
+ and prints the per-frame crowd count.
+
+ > **Notes on running this sample:**
+ >
+ > - Use the FP16 IR (`yolo11n_openvino_model/yolo11n.xml`).
+ >   On DLStreamer 2026.0.0, `gvadetect` cannot auto-derive a YOLO post-processor
+ >   from the INT8 model produced by the bundled script.
+ >   To use the INT8 model, supply a matching `model-proc` JSON.
+ > - `gvadetect` requires `labels-file=` to map class indices to names.
+ > - Filtering with `object-class=person` directly on `gvadetect` is rejected
+ >   when `inference-region` is `full-frame` (the default), so the sample
+ >   filters by `region.label()` in the buffer probe instead.
+ > - Export `PYTHONPATH` so the DLStreamer Python module is importable:
+ >
+ >   ```bash
+ >   source /opt/intel/openvino_2026/setupvars.sh
+ >   source /opt/intel/dlstreamer/scripts/setup_dls_env.sh
+ >   export PYTHONPATH=/opt/intel/dlstreamer/python:\
+ >   /opt/intel/dlstreamer/gstreamer/lib/python3/dist-packages:${PYTHONPATH:-}
+ >   ```
+ >
+ > Create `coco.txt` once with the 80 COCO class names in COCO order, one per
+ > line (see the loitering-detection README for a ready-to-paste snippet).
+
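As an alternative to pasting the list by hand, `coco.txt` could be generated with a few lines of Python. The sketch below is an assumption, not the snippet from the loitering-detection README: the class names are the 80 COCO labels as commonly shipped with Ultralytics models, and the helper name `write_labels` is hypothetical. It is worth checking the result against the `names` attribute of the loaded Ultralytics model before relying on it.

```python
from pathlib import Path

# 80 COCO class names in index order (index 0 is "person").
COCO_CLASSES = [
    "person", "bicycle", "car", "motorcycle", "airplane", "bus", "train",
    "truck", "boat", "traffic light", "fire hydrant", "stop sign",
    "parking meter", "bench", "bird", "cat", "dog", "horse", "sheep", "cow",
    "elephant", "bear", "zebra", "giraffe", "backpack", "umbrella", "handbag",
    "tie", "suitcase", "frisbee", "skis", "snowboard", "sports ball", "kite",
    "baseball bat", "baseball glove", "skateboard", "surfboard",
    "tennis racket", "bottle", "wine glass", "cup", "fork", "knife", "spoon",
    "bowl", "banana", "apple", "sandwich", "orange", "broccoli", "carrot",
    "hot dog", "pizza", "donut", "cake", "chair", "couch", "potted plant",
    "bed", "dining table", "toilet", "tv", "laptop", "mouse", "remote",
    "keyboard", "cell phone", "microwave", "oven", "toaster", "sink",
    "refrigerator", "book", "clock", "vase", "scissors", "teddy bear",
    "hair drier", "toothbrush",
]


def write_labels(path="coco.txt"):
    """Write one class name per line and return the number of labels written."""
    Path(path).write_text("\n".join(COCO_CLASSES) + "\n", encoding="utf-8")
    return len(COCO_CLASSES)


if __name__ == "__main__":
    print(f"Wrote {write_labels()} labels to coco.txt")
```

Only the first entry matters for this use case, but `gvadetect` maps labels by line position, so the file needs all 80 lines in order.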
+ ```python
+ import gi
+
+ gi.require_version("Gst", "1.0")
+ gi.require_version("GstVideo", "1.0")
+ from gi.repository import Gst
+ from gstgva import VideoFrame
+
+ Gst.init(None)
+
+ pipeline_str = (
+     "filesrc location=test_video.mp4 ! decodebin ! videoconvert ! "
+     "video/x-raw,format=BGR ! "
+     "gvadetect model=yolo11n_openvino_model/yolo11n.xml "
+     "labels-file=coco.txt device=CPU threshold=0.4 ! queue ! "
+     "gvawatermark ! videoconvert ! autovideosink name=sink sync=false"
+ )
+ pipeline = Gst.parse_launch(pipeline_str)
+
+
+ def on_buffer(pad, info):
+     buf = info.get_buffer()
+     caps = pad.get_current_caps()
+     frame = VideoFrame(buf, caps=caps)
+     crowd_count = sum(1 for r in frame.regions() if r.label() == "person")
+     if crowd_count:
+         print(f"Crowd count (frame): {crowd_count}", flush=True)
+     return Gst.PadProbeReturn.OK
+
+
+ sink = pipeline.get_by_name("sink")
+ sink_pad = sink.get_static_pad("sink")
+ sink_pad.add_probe(Gst.PadProbeType.BUFFER, on_buffer)
+
+ pipeline.set_state(Gst.State.PLAYING)
+ bus = pipeline.get_bus()
+ bus.timed_pop_filtered(
+     Gst.CLOCK_TIME_NONE,
+     Gst.MessageType.EOS | Gst.MessageType.ERROR,
+ )
+ pipeline.set_state(Gst.State.NULL)
+ ```
+
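Per-frame counts from a detector tend to flicker as people are briefly occluded or missed. A sliding-window median is one simple way to stabilize the number before reporting it. This is a sketch under that assumption; `CountSmoother` and its `window` parameter are hypothetical and not part of DLStreamer.

```python
from collections import deque
from statistics import median


class CountSmoother:
    """Hypothetical helper: median of the last `window` per-frame counts."""

    def __init__(self, window=15):
        # deque with maxlen drops the oldest count automatically.
        self.counts = deque(maxlen=window)

    def update(self, count):
        """Add the newest count and return the smoothed value."""
        self.counts.append(count)
        return median(self.counts)


if __name__ == "__main__":
    smoother = CountSmoother(window=3)
    for raw in [4, 4, 0, 4, 5]:
        print(f"raw={raw} smoothed={smoother.update(raw)}")
```

In the probe above, this would amount to creating one `CountSmoother` before the pipeline starts and printing `smoother.update(crowd_count)` instead of the raw count. The median is used rather than the mean so that a single dropped frame (a count of 0) does not pull the reported value down.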
+ ---
+
+ ## License
+
+ Copyright (C) Intel Corporation. All rights reserved.
+ The scripts and documentation are licensed under the MIT License; the YOLO11 model weights and the Ultralytics framework are licensed under AGPL-3.0. See [LICENSE](LICENSE) for details.
+
+ ## References
+
+ - [YOLO11 Documentation](https://docs.ultralytics.com/models/yolo11/)
+ - [OpenVINO YOLO11 Notebook](https://github.com/openvinotoolkit/openvino_notebooks/blob/latest/notebooks/yolov11-optimization/yolov11-object-detection.ipynb)
+ - [COCO Dataset](https://cocodataset.org/)
+ - [OpenVINO Documentation](https://docs.openvino.ai/)
+ - [NNCF Post-Training Quantization](https://docs.openvino.ai/latest/nncf_ptq_introduction.html)
+ - [Intel DLStreamer](https://dlstreamer.github.io/)
export_and_quantize.sh ADDED
@@ -0,0 +1,53 @@
+ #!/usr/bin/env bash
+ # SPDX-License-Identifier: MIT
+ # Copyright (C) Intel Corporation
+ #
+ # Export a YOLO11 person detector for crowd detection and quantize to INT8.
+ # Usage: ./export_and_quantize.sh [MODEL_VARIANT]
+ # Example: ./export_and_quantize.sh yolo11n
+
+ set -euo pipefail
+
+ MODEL_NAME="${1:-yolo11n}"
+
+ echo "--- Installing dependencies ---"
+ pip install -qU "openvino>=2026.0.0" "nncf>=3.0.0" ultralytics
+
+ echo "--- Exporting ${MODEL_NAME} to OpenVINO IR (FP16) ---"
+ python3 -c "
+ from ultralytics import YOLO
+
+ model = YOLO('${MODEL_NAME}.pt')
+ model.export(format='openvino', half=True, dynamic=False, imgsz=640)
+ print('Export complete: ${MODEL_NAME}_openvino_model/')
+ "
+
+ echo "--- Quantizing to INT8 with NNCF ---"
+ python3 -c "
+ import nncf
+ import openvino as ov
+ import numpy as np
+
+ core = ov.Core()
+ model = core.read_model('${MODEL_NAME}_openvino_model/${MODEL_NAME}.xml')
+
+ def transform_fn(data_item):
+     return np.random.rand(1, 3, 640, 640).astype(np.float32)
+
+ calibration_dataset = nncf.Dataset(list(range(300)), transform_fn)
+
+ quantized = nncf.quantize(
+     model,
+     calibration_dataset,
+     preset=nncf.QuantizationPreset.MIXED,
+     subset_size=300,
+ )
+
+ ov.save_model(quantized, '${MODEL_NAME}_crowd_int8.xml')
+ print('Quantization complete: ${MODEL_NAME}_crowd_int8.xml')
+ "
+
+ echo "--- Benchmarking ---"
+ benchmark_app -m "${MODEL_NAME}_crowd_int8.xml" -d CPU -niter 50 -api async
+
+ echo "--- Done ---"