Sync crowd-detection from metro-analytics-catalog

Files changed (3)

LICENSE +45 -0
README.md +250 -5
export_and_quantize.sh +53 -0

LICENSE CHANGED

	@@ -0,0 +1,45 @@

+This directory contains two categories of content under different licenses.
+Scripts and Documentation
+-------------------------
+The scripts (export_and_quantize.sh) and documentation (README.md) in this
+directory are original works by Intel Corporation, licensed under the
+MIT License.
+    Copyright (C) Intel Corporation
+    Permission is hereby granted, free of charge, to any person obtaining a copy
+    of this software and associated documentation files (the "Software"), to deal
+    in the Software without restriction, including without limitation the rights
+    to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+    copies of the Software, and to permit persons to whom the Software is
+    furnished to do so, subject to the following conditions:
+    The above copyright notice and this permission notice shall be included in
+    all copies or substantial portions of the Software.
+    THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+    IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+    FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+    AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+    LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+    OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+    THE SOFTWARE.
+YOLO11 Model
+------------
+The YOLO11 model weights and the Ultralytics framework are developed by
+Ultralytics and licensed under the GNU Affero General Public License v3.0
+(AGPL-3.0).
+    Source:  https://github.com/ultralytics/ultralytics
+    License: https://github.com/ultralytics/ultralytics/blob/main/LICENSE
+    Docs:    https://docs.ultralytics.com/models/yolo11/
+Users must comply with the AGPL-3.0 license terms when using, modifying,
+or distributing the YOLO11 model weights or Ultralytics software.
+For commercial licensing options, see https://www.ultralytics.com/license.

README.md CHANGED

@@ -1,5 +1,250 @@
----
-license: other
-license_name: other
-license_link: LICENSE
----

+# Crowd Detection -- Person Counting on Intel Hardware
+> **Reference notebook:** [yolov11-object-detection.ipynb](https://github.com/openvinotoolkit/openvino_notebooks/blob/latest/notebooks/yolov11-optimization/yolov11-object-detection.ipynb)
+>
+> **Validated with:** OpenVINO 2026.0.0, NNCF 3.0.0, Ultralytics 8.3.0, Python 3.11+
+| Property | Value |
+|---|---|
+| **Category** | Object Detection (Crowd / Person Counting) |
+| **Source Framework** | PyTorch (Ultralytics) |
+| **Supported Precisions** | FP16, FP16-INT8 |
+| **Inference Engine** | OpenVINO |
+| **Hardware** | CPU, GPU, NPU |
+| **Detected Class** | `person` (COCO class 0) |
+---
+## Overview
+Crowd Detection is a Metro Analytics use case that detects and counts people in video streams to estimate occupancy and identify crowd build-up.
+It is built on [YOLO11](https://docs.ultralytics.com/models/yolo11/), a real-time object detector trained on the COCO dataset, with detections filtered at runtime to the `person` class.
+Typical Metro deployments include:
+- **Platform Occupancy** -- count waiting passengers on station platforms.
+- **Entry / Exit Flow** -- monitor pedestrian throughput at gates and turnstiles.
+- **Crowd Build-up Alerts** -- trigger notifications when person counts cross a threshold.
+- **Public Safety Analytics** -- support situational awareness in transit hubs and venues.
+Available variants: `yolo11n`, `yolo11s`, `yolo11m`, `yolo11l`, `yolo11x`.
+Smaller variants (`yolo11n`, `yolo11s`) are recommended for high-FPS edge deployment; larger variants improve recall in dense crowds at the cost of throughput.
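The crowd build-up alert use case listed above can be sketched as a small state machine over per-frame counts. This is a minimal illustration, not part of the catalog: the class name `CrowdAlert` and the parameters `window` and `clear_margin` are hypothetical. It smooths counts with a rolling median so a single noisy frame does not trigger an alert, and it uses a lower clearing level (hysteresis) so the alert does not flicker around the threshold.

```python
from collections import deque
from statistics import median


class CrowdAlert:
    """Hypothetical helper: raise an alert when the smoothed person count is high."""

    def __init__(self, threshold: int, window: int = 5, clear_margin: int = 2):
        self.threshold = threshold                   # raise the alert at or above this
        self.clear_level = threshold - clear_margin  # clear the alert at or below this
        self.counts = deque(maxlen=window)           # most recent per-frame counts
        self.active = False

    def update(self, count: int) -> bool:
        """Feed one per-frame count and return the current alert state."""
        self.counts.append(count)
        if len(self.counts) < self.counts.maxlen:
            return self.active  # wait until the window is full
        smoothed = median(self.counts)
        if not self.active and smoothed >= self.threshold:
            self.active = True
        elif self.active and smoothed <= self.clear_level:
            self.active = False
        return self.active
```

In a deployment, `update()` would be called once per frame with the count produced by either sample in this README, and a notification sent whenever the returned state changes.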
+---
+## Prerequisites
+- [Install OpenVINO 2026.0.0](https://docs.openvino.ai/2026/get-started/install-openvino.html)
+- [Install Intel DLStreamer](https://dlstreamer.github.io/get_started/install/install-guide-ubuntu.html)
+---
+## Getting Started
+### Download and Quantize Model
+Run the provided script to download the model, export it to OpenVINO IR (FP16), and quantize it to INT8:
+```bash
+chmod +x export_and_quantize.sh
+./export_and_quantize.sh yolo11n
+```
+Replace `yolo11n` with any variant (`yolo11s`, `yolo11m`, `yolo11l`, `yolo11x`).
+The script performs the following steps:
+1. Installs dependencies (`openvino`, `nncf`, `ultralytics`).
+2. Downloads the PyTorch weights and exports to OpenVINO IR with `half=True`.
+3. Quantizes the model to INT8 using NNCF post-training quantization.
+4. Runs `benchmark_app` to validate throughput.
+Output files:
+- `yolo11n_openvino_model/` -- FP16 OpenVINO IR model directory.
+- `yolo11n_crowd_int8.xml` / `yolo11n_crowd_int8.bin` -- INT8 quantized model.
+> **Note:** The bundled script calibrates with random tensors, which do not
+> reflect real image statistics. For production accuracy, replace them in
+> `export_and_quantize.sh` with a representative sample of frames from the
+> target deployment site.
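One way to follow the note above is to build the calibration dataset from a folder of site frames. The sketch below is an assumption-laden example rather than the script's actual code: the helper names (`to_model_input`, `load_frame`, `build_calibration_dataset`), the `*.jpg` glob, and the folder layout are all hypothetical. The preprocessing mirrors the plain-resize path used by the OpenVINO sample in this README; `cv2` and `nncf` are imported lazily so the tensor conversion can be checked with NumPy alone.

```python
from pathlib import Path

import numpy as np

INPUT_SIZE = 640  # matches imgsz=640 in export_and_quantize.sh


def to_model_input(frame_bgr: np.ndarray) -> np.ndarray:
    """Turn a resized BGR uint8 frame into an NCHW float32 tensor in [0, 1]."""
    if frame_bgr.shape != (INPUT_SIZE, INPUT_SIZE, 3):
        raise ValueError(f"expected a {INPUT_SIZE}x{INPUT_SIZE} BGR frame, got {frame_bgr.shape}")
    rgb = frame_bgr[:, :, ::-1]  # BGR -> RGB
    chw = rgb.transpose(2, 0, 1).astype(np.float32) / 255.0
    return np.ascontiguousarray(chw[np.newaxis, ...])


def load_frame(path: Path) -> np.ndarray:
    """Read one image and resize it the same way the inference sample does."""
    import cv2  # lazy import: only needed when reading real frames

    frame = cv2.imread(str(path))
    if frame is None:
        raise FileNotFoundError(f"could not read {path}")
    return cv2.resize(frame, (INPUT_SIZE, INPUT_SIZE))


def build_calibration_dataset(frame_dir: str, limit: int = 300):
    """Wrap up to `limit` JPEG frames from `frame_dir` in an nncf.Dataset."""
    import nncf  # lazy import: only needed when quantizing

    paths = sorted(Path(frame_dir).glob("*.jpg"))[:limit]
    if not paths:
        raise ValueError(f"no .jpg frames found in {frame_dir}")
    return nncf.Dataset(paths, lambda p: to_model_input(load_frame(p)))
```

The returned object would take the place of `calibration_dataset` in the script's `nncf.quantize(...)` call; it is probably sensible to set `subset_size` to no more than the number of frames supplied.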
+### OpenVINO Sample
+The sample below runs YOLO11 inference, filters to the `person` class, applies
+non-maximum suppression, and reports the crowd count for a single image.
+```python
+import cv2
+import numpy as np
+import openvino as ov
+PERSON_CLASS_ID = 0
+CONF_THRESHOLD = 0.4
+IOU_THRESHOLD = 0.5
+INPUT_SIZE = 640
+core = ov.Core()
+model = core.read_model("yolo11n_crowd_int8.xml")
+compiled = core.compile_model(model, "CPU")  # or "GPU", "NPU"
+image = cv2.imread("test.jpg")
+h0, w0 = image.shape[:2]
+# Preprocess: plain resize without letterboxing, for simplicity (aspect ratio is not preserved).
+blob = cv2.resize(image, (INPUT_SIZE, INPUT_SIZE))
+blob = cv2.cvtColor(blob, cv2.COLOR_BGR2RGB).astype(np.float32) / 255.0
+blob = blob.transpose(2, 0, 1)[np.newaxis, ...]  # NCHW
+# Infer. YOLO11 raw output shape: [1, 84, 8400] (xywh + 80 class scores).
+output = compiled([blob])[compiled.output(0)]
+preds = output[0].T  # [8400, 84]
+boxes_xywh = preds[:, :4]
+class_scores = preds[:, 4:]
+class_ids = class_scores.argmax(axis=1)
+confidences = class_scores.max(axis=1)
+mask = (class_ids == PERSON_CLASS_ID) & (confidences >= CONF_THRESHOLD)
+boxes_xywh = boxes_xywh[mask]
+confidences = confidences[mask]
+# Convert xywh (center) to xyxy in original image coordinates.
+sx, sy = w0 / INPUT_SIZE, h0 / INPUT_SIZE
+xyxy = np.empty_like(boxes_xywh)
+xyxy[:, 0] = (boxes_xywh[:, 0] - boxes_xywh[:, 2] / 2) * sx
+xyxy[:, 1] = (boxes_xywh[:, 1] - boxes_xywh[:, 3] / 2) * sy
+xyxy[:, 2] = (boxes_xywh[:, 0] + boxes_xywh[:, 2] / 2) * sx
+xyxy[:, 3] = (boxes_xywh[:, 1] + boxes_xywh[:, 3] / 2) * sy
+# Apply NMS to deduplicate overlapping detections.
+keep = cv2.dnn.NMSBoxes(
+    bboxes=[[float(x1), float(y1), float(x2 - x1), float(y2 - y1)]
+            for x1, y1, x2, y2 in xyxy],
+    scores=confidences.tolist(),
+    score_threshold=CONF_THRESHOLD,
+    nms_threshold=IOU_THRESHOLD,
+)
+crowd_count = len(keep)
+print(f"Detected persons: {crowd_count}")
+for i in np.array(keep).flatten():
+    x1, y1, x2, y2 = xyxy[i].astype(int)
+    cv2.rectangle(image, (x1, y1), (x2, y2), (0, 255, 0), 2)
+cv2.putText(
+    image, f"Crowd count: {crowd_count}", (10, 30),
+    cv2.FONT_HERSHEY_SIMPLEX, 1.0, (0, 255, 0), 2,
+)
+cv2.imwrite("crowd_output.jpg", image)
+```
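The sample above stretches the image to 640x640, which distorts people in wide or tall frames. A letterboxed variant keeps the aspect ratio and pads the remainder. The sketch below shows the geometry under stated assumptions: the function names are hypothetical, and the gray fill value of 114 follows the usual Ultralytics convention rather than anything stated in this README.

```python
import numpy as np

INPUT_SIZE = 640


def letterbox_params(w0: int, h0: int, size: int = INPUT_SIZE):
    """Return (scale, pad_x, pad_y) for centering a w0 x h0 image on a size x size canvas."""
    scale = min(size / w0, size / h0)
    new_w, new_h = round(w0 * scale), round(h0 * scale)
    return scale, (size - new_w) // 2, (size - new_h) // 2


def letterbox(image: np.ndarray, size: int = INPUT_SIZE, fill: int = 114) -> np.ndarray:
    """Resize with preserved aspect ratio and pad to a square canvas."""
    import cv2  # lazy import: only needed for the actual resize

    h0, w0 = image.shape[:2]
    scale, pad_x, pad_y = letterbox_params(w0, h0, size)
    new_w, new_h = round(w0 * scale), round(h0 * scale)
    canvas = np.full((size, size, 3), fill, dtype=image.dtype)
    canvas[pad_y:pad_y + new_h, pad_x:pad_x + new_w] = cv2.resize(image, (new_w, new_h))
    return canvas


def boxes_to_original(xyxy: np.ndarray, scale: float, pad_x: int, pad_y: int) -> np.ndarray:
    """Map xyxy boxes from letterboxed model coordinates back to the source image."""
    out = xyxy.astype(np.float32).copy()
    out[:, [0, 2]] = (out[:, [0, 2]] - pad_x) / scale
    out[:, [1, 3]] = (out[:, [1, 3]] - pad_y) / scale
    return out
```

To use it with the sample, `letterbox(image)` would replace the `cv2.resize` call, the center-format boxes would be converted to corners without the `sx`/`sy` factors, and `boxes_to_original` would then map them back to the source frame.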
+### Try It on a Sample Image
+Download a public sample image that contains several people:
+```bash
+wget -O test.jpg https://ultralytics.com/images/bus.jpg
+```
+Re-run the OpenVINO sample above.
+The script reads `test.jpg`, prints the crowd count to the console, and writes the annotated frame to `crowd_output.jpg`.
+Expected console output (when running against the INT8 model produced by the script with the default random calibration tensors):
+```text
+Detected persons: 3
+```
+`crowd_output.jpg` is the same image with a green bounding box drawn around each detected person and the text `Crowd count: 3` overlaid in the top-left corner.
+The reference image actually contains four people; the FP16 IR (`yolo11n_openvino_model/yolo11n.xml`) detects all four.
+The INT8 model produced with random calibration data in `export_and_quantize.sh` typically detects three.
+Replace the random calibration tensors with representative frames from your deployment site to recover the missing detection.
+### DLStreamer Sample
+The pipeline below runs the FP16 YOLO11 detector on a video file via
+`gvadetect`, counts `person` detections in a buffer probe using
+the DLStreamer Python bindings (`gstgva.VideoFrame`), overlays bounding boxes
+with `gvawatermark`, and prints the crowd count for each frame that contains
+at least one person.
+> **Notes on running this sample:**
+>
+> - Use the FP16 IR (`yolo11n_openvino_model/yolo11n.xml`).
+>   On DLStreamer 2026.0.0, `gvadetect` cannot auto-derive a YOLO post-processor
+>   from the INT8 model produced by the bundled script.
+>   To use the INT8 model, supply a matching `model-proc` JSON.
+> - `gvadetect` requires `labels-file=` to map class indices to names.
+> - Filtering with `object-class=person` directly on `gvadetect` is rejected
+>   when `inference-region` is `full-frame` (the default), so the sample
+>   filters by `region.label()` in the buffer probe instead.
+> - Source the OpenVINO and DLStreamer environment scripts, then export
+>   `PYTHONPATH` so the DLStreamer Python module (`gstgva`) is importable:
+>
+>   ```bash
+>   source /opt/intel/openvino_2026/setupvars.sh
+>   source /opt/intel/dlstreamer/scripts/setup_dls_env.sh
+>   export PYTHONPATH=/opt/intel/dlstreamer/python:\
+>   /opt/intel/dlstreamer/gstreamer/lib/python3/dist-packages:${PYTHONPATH:-}
+>   ```
+>
+> Create `coco.txt` once with the 80 COCO class names in COCO order, one per
+> line (see the loitering-detection README for a ready-to-paste snippet).
+```python
+import gi
+gi.require_version("Gst", "1.0")
+gi.require_version("GstVideo", "1.0")
+from gi.repository import Gst
+from gstgva import VideoFrame
+Gst.init(None)
+pipeline_str = (
+    "filesrc location=test_video.mp4 ! decodebin ! videoconvert ! "
+    "video/x-raw,format=BGR ! "
+    "gvadetect model=yolo11n_openvino_model/yolo11n.xml "
+    "labels-file=coco.txt device=CPU threshold=0.4 ! queue ! "
+    "gvawatermark ! videoconvert ! autovideosink name=sink sync=false"
+)
+pipeline = Gst.parse_launch(pipeline_str)
+def on_buffer(pad, info):
+    buf = info.get_buffer()
+    caps = pad.get_current_caps()
+    frame = VideoFrame(buf, caps=caps)
+    crowd_count = sum(1 for r in frame.regions() if r.label() == "person")
+    if crowd_count:
+        print(f"Crowd count (frame): {crowd_count}", flush=True)
+    return Gst.PadProbeReturn.OK
+sink = pipeline.get_by_name("sink")
+sink_pad = sink.get_static_pad("sink")
+sink_pad.add_probe(Gst.PadProbeType.BUFFER, on_buffer)
+pipeline.set_state(Gst.State.PLAYING)
+bus = pipeline.get_bus()
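+msg = bus.timed_pop_filtered(
+    Gst.CLOCK_TIME_NONE,
+    Gst.MessageType.EOS | Gst.MessageType.ERROR,
+)
+# Report pipeline errors instead of exiting silently.
+if msg is not None and msg.type == Gst.MessageType.ERROR:
+    err, debug = msg.parse_error()
+    print(f"Pipeline error: {err.message} ({debug})", flush=True)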
+pipeline.set_state(Gst.State.NULL)
+```
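The `coco.txt` labels file that the pipeline above expects can be generated with a few lines of Python. This is a sketch, not the snippet from the loitering-detection README: the helper name `write_labels_file` is hypothetical, and the class names assume the spelling Ultralytics uses for the 80 COCO classes.

```python
from pathlib import Path

# The 80 COCO class names in index order (index 0 is "person").
COCO_CLASSES = [
    "person", "bicycle", "car", "motorcycle", "airplane", "bus", "train", "truck",
    "boat", "traffic light", "fire hydrant", "stop sign", "parking meter", "bench", "bird", "cat",
    "dog", "horse", "sheep", "cow", "elephant", "bear", "zebra", "giraffe",
    "backpack", "umbrella", "handbag", "tie", "suitcase", "frisbee", "skis", "snowboard",
    "sports ball", "kite", "baseball bat", "baseball glove", "skateboard", "surfboard", "tennis racket", "bottle",
    "wine glass", "cup", "fork", "knife", "spoon", "bowl", "banana", "apple",
    "sandwich", "orange", "broccoli", "carrot", "hot dog", "pizza", "donut", "cake",
    "chair", "couch", "potted plant", "bed", "dining table", "toilet", "tv", "laptop",
    "mouse", "remote", "keyboard", "cell phone", "microwave", "oven", "toaster", "sink",
    "refrigerator", "book", "clock", "vase", "scissors", "teddy bear", "hair drier", "toothbrush",
]


def write_labels_file(path: str = "coco.txt") -> int:
    """Write one class name per line and return the number of labels written."""
    Path(path).write_text("\n".join(COCO_CLASSES) + "\n", encoding="utf-8")
    return len(COCO_CLASSES)


if __name__ == "__main__":
    print(f"Wrote {write_labels_file()} labels to coco.txt")
```

Because `person` is on the first line, it maps to class index 0, which is what the `r.label() == "person"` check in the buffer probe relies on.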
+---
+## License
+Copyright (C) Intel Corporation. All rights reserved.
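+The scripts and documentation in this directory are licensed under the MIT License; the YOLO11 model weights and the Ultralytics framework are licensed under AGPL-3.0. See [LICENSE](LICENSE) for details.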
+## References
+- [YOLO11 Documentation](https://docs.ultralytics.com/models/yolo11/)
+- [OpenVINO YOLO11 Notebook](https://github.com/openvinotoolkit/openvino_notebooks/blob/latest/notebooks/yolov11-optimization/yolov11-object-detection.ipynb)
+- [COCO Dataset](https://cocodataset.org/)
+- [OpenVINO Documentation](https://docs.openvino.ai/)
+- [NNCF Post-Training Quantization](https://docs.openvino.ai/latest/nncf_ptq_introduction.html)
+- [Intel DLStreamer](https://dlstreamer.github.io/)

export_and_quantize.sh ADDED

	@@ -0,0 +1,53 @@

+#!/usr/bin/env bash
+# SPDX-License-Identifier: MIT
+# Copyright (C) Intel Corporation
+#
+# Export a YOLO11 person detector for crowd detection and quantize to INT8.
+# Usage: ./export_and_quantize.sh [MODEL_VARIANT]
+# Example: ./export_and_quantize.sh yolo11n
+set -euo pipefail
+MODEL_NAME="${1:-yolo11n}"
+echo "--- Installing dependencies ---"
+pip install -qU "openvino>=2026.0.0" "nncf>=3.0.0" ultralytics
+echo "--- Exporting ${MODEL_NAME} to OpenVINO IR (FP16) ---"
+python3 -c "
+from ultralytics import YOLO
+model = YOLO('${MODEL_NAME}.pt')
+model.export(format='openvino', half=True, dynamic=False, imgsz=640)
+print('Export complete: ${MODEL_NAME}_openvino_model/')
+"
+echo "--- Quantizing to INT8 with NNCF ---"
+python3 -c "
+import nncf
+import openvino as ov
+import numpy as np
+core = ov.Core()
+model = core.read_model('${MODEL_NAME}_openvino_model/${MODEL_NAME}.xml')
+def transform_fn(data_item):
+    return np.random.rand(1, 3, 640, 640).astype(np.float32)
+calibration_dataset = nncf.Dataset(list(range(300)), transform_fn)
+quantized = nncf.quantize(
+    model,
+    calibration_dataset,
+    preset=nncf.QuantizationPreset.MIXED,
+    subset_size=300,
+)
+ov.save_model(quantized, '${MODEL_NAME}_crowd_int8.xml')
+print('Quantization complete: ${MODEL_NAME}_crowd_int8.xml')
+"
+echo "--- Benchmarking ---"
+benchmark_app -m "${MODEL_NAME}_crowd_int8.xml" -d CPU -niter 50 -api async
+echo "--- Done ---"