---
license: other
license_name: intel-custom
license_link: LICENSE
library_name: openvino
pipeline_tag: object-detection
tags:
- openvino
- intel
- yolo
- yolo26
- crowd-detection
- person-counting
- edge-ai
- metro
- dlstreamer
datasets:
- detection-datasets/coco
language:
- en
---

# Crowd Detection

| Property | Value |
|---|---|
| **Category** | Object Detection (Crowd / Person Counting) |
| **Base Model** | [YOLO26](https://docs.ultralytics.com/models/yolo26/) (Ultralytics) |
| **Source Framework** | PyTorch (Ultralytics) |
| **Supported Precisions** | FP32, FP16, INT8 (mixed-precision) |
| **Inference Engine** | OpenVINO |
| **Hardware** | CPU, GPU, NPU |
| **Detected Class** | `person` (COCO class 0) |

---

## Overview

Crowd Detection is a Metro Analytics use case that detects and counts people in video streams to estimate occupancy and identify crowd build-up.
It is built on [YOLO26](https://docs.ultralytics.com/models/yolo26/), a real-time object detector trained on the COCO dataset. The model is exported to OpenVINO IR (FP16 by default, optionally quantized to INT8), and its detections are filtered at runtime to the `person` class.

Typical Metro deployments include:

- **Platform Occupancy** -- count waiting passengers on station platforms.
- **Entry / Exit Flow** -- monitor pedestrian throughput at gates and turnstiles.
- **Crowd Build-up Alerts** -- trigger notifications when person counts cross a threshold.
- **Public Safety Analytics** -- support situational awareness in transit hubs and venues.

Available variants: `yolo26n`, `yolo26s`, `yolo26m`, `yolo26l`, `yolo26x`.
Smaller variants (`yolo26n`, `yolo26s`) are recommended for high-FPS edge deployment; larger variants improve recall in dense crowds.

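As a rough illustration of the Crowd Build-up Alerts use case, per-frame person counts can be debounced before a notification is raised, so that a single noisy frame does not trigger an alert. The sketch below is not part of the provided scripts: `CrowdAlert`, `threshold`, and `min_frames` are hypothetical names and values chosen for illustration only.

```python
from dataclasses import dataclass


@dataclass
class CrowdAlert:
    """Hypothetical helper: flags crowd build-up from per-frame person counts."""

    threshold: int = 20   # assumed person count that qualifies as a crowd
    min_frames: int = 5   # assumed number of consecutive frames required
    _streak: int = 0      # consecutive frames at or above the threshold
    active: bool = False  # True while the alert is raised

    def update(self, count: int) -> bool:
        """Feed one frame's count; return True only on the frame the alert starts."""
        if count >= self.threshold:
            self._streak += 1
        else:
            # Count dropped below the threshold: reset and clear the alert.
            self._streak = 0
            self.active = False
        if self._streak >= self.min_frames and not self.active:
            self.active = True
            return True
        return False


alert = CrowdAlert(threshold=3, min_frames=2)
events = [alert.update(c) for c in [1, 3, 4, 5, 2, 3, 3]]
print(events)  # [False, False, True, False, False, False, True]
```

In a deployment, `update()` would be fed the crowd count produced for each frame, and a `True` return value would be forwarded to whatever notification channel the site uses.
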
---

## Prerequisites

- Python 3.11+
- [Install OpenVINO](https://docs.openvino.ai/2026/get-started/install-openvino.html) (latest version)
- [Install Intel DLStreamer](https://docs.openedgeplatform.intel.com/2026.0/edge-ai-libraries/dlstreamer/get_started/install/install_guide_ubuntu.html)

Create and activate a Python virtual environment before running the scripts:

```bash
python3 -m venv .venv --system-site-packages
source .venv/bin/activate
```

> **Note:** The `--system-site-packages` flag is required so the virtual
> environment can access the system-installed OpenVINO and DLStreamer Python
> packages.

---

## Getting Started

### Download and Quantize Model

Run the provided script to download, export to OpenVINO IR, and optionally quantize:

```bash
chmod +x export_and_quantize.sh
./export_and_quantize.sh
```

This exports the default **yolo26n** model in **FP16** precision.

#### Optional: Select a Different Variant or Precision

```bash
./export_and_quantize.sh yolo26n FP32    # full-precision
./export_and_quantize.sh yolo26n INT8    # quantized
./export_and_quantize.sh yolo26s         # larger variant, default FP16
```

Replace `yolo26n` with any variant (`yolo26s`, `yolo26m`, `yolo26l`, `yolo26x`).
The second argument selects the precision (`FP32`, `FP16`, `INT8`); the default is **FP16**.

The script performs the following steps:

1. Installs dependencies (`openvino`, `ultralytics`; adds `nncf` for INT8).
2. Downloads a sample test image (`test.jpg`) and a sample test video (`test_video.mp4`).
3. Downloads the PyTorch weights and exports to OpenVINO IR.
4. *(INT8 only)* Quantizes the model using NNCF post-training quantization.

Output files:

- `yolo26n_openvino_model/` -- FP32 or FP16 OpenVINO IR model directory.
- `yolo26n_crowd_int8.xml` / `yolo26n_crowd_int8.bin` -- INT8 quantized model *(only when `INT8` is selected)*.

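Downstream code needs to know which `.xml` file to load for a given variant and precision. The helper below is a minimal sketch of that mapping, not part of the provided script. `model_xml_path` is a hypothetical name, and the file names for variants other than `yolo26n` are an assumption extrapolated from the `yolo26n` output files listed above.

```python
from pathlib import Path

VARIANTS = ("yolo26n", "yolo26s", "yolo26m", "yolo26l", "yolo26x")
PRECISIONS = ("FP32", "FP16", "INT8")


def model_xml_path(variant: str = "yolo26n", precision: str = "FP16") -> Path:
    """Return the expected IR .xml path for a variant/precision pair.

    Assumption: other variants follow the same naming pattern as yolo26n.
    """
    if variant not in VARIANTS:
        raise ValueError(f"unknown variant: {variant!r}")
    precision = precision.upper()
    if precision not in PRECISIONS:
        raise ValueError(f"unknown precision: {precision!r}")
    if precision == "INT8":
        # The INT8 model is written next to the script, not in the IR directory.
        return Path(f"{variant}_crowd_int8.xml")
    # FP32 and FP16 exports share the same directory and file name.
    return Path(f"{variant}_openvino_model") / f"{variant}.xml"


print(model_xml_path().as_posix())
print(model_xml_path("yolo26n", "INT8").as_posix())
```

Because FP32 and FP16 exports map to the same directory, exporting one after the other for the same variant would presumably overwrite the earlier files.
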
#### Precision / Device Compatibility

| Precision | CPU | GPU | NPU |
|---|---|---|---|
| FP32 | Yes | Yes | No |
| FP16 | Yes | Yes | Yes |
| INT8 | Yes | Yes | Yes |

> **Note:** The INT8 calibration uses the bundled sample image.
> For production accuracy, replace it with a representative set of frames from
> the target deployment site.

### OpenVINO Sample

The sample below runs YOLO26 inference, filters the detections to the `person` class by confidence and class ID, and reports the crowd count for a single image. No separate non-maximum suppression step is applied because the YOLO26 output is end-to-end.

```python
import cv2
import numpy as np
import openvino as ov

PERSON_CLASS_ID = 0
CONF_THRESHOLD = 0.4
INPUT_SIZE = 640

core = ov.Core()
model = core.read_model("yolo26n_openvino_model/yolo26n.xml")
compiled = core.compile_model(model, "CPU")  # or "GPU", "NPU"

image = cv2.imread("test.jpg")
if image is None:
    # cv2.imread returns None instead of raising when the file cannot be read.
    raise FileNotFoundError("test.jpg not found or unreadable")
h0, w0 = image.shape[:2]

# Preprocess: letterbox-free resize for simplicity.
blob = cv2.resize(image, (INPUT_SIZE, INPUT_SIZE))
blob = cv2.cvtColor(blob, cv2.COLOR_BGR2RGB).astype(np.float32) / 255.0
blob = blob.transpose(2, 0, 1)[np.newaxis, ...]  # NCHW

# YOLO26 end-to-end output: [1, 300, 6] = [x1, y1, x2, y2, confidence, class_id]
# No NMS is needed -- YOLO26 is natively end-to-end.
output = compiled([blob])[compiled.output(0)][0]

# Keep only confident detections of the person class.
mask = (output[:, 4] >= CONF_THRESHOLD) & (output[:, 5].astype(int) == PERSON_CLASS_ID)
dets = output[mask]

# Scale boxes from the 640x640 network input back to the original image size.
sx, sy = w0 / INPUT_SIZE, h0 / INPUT_SIZE
crowd_count = len(dets)
print(f"Detected persons: {crowd_count}")

for det in dets:
    x1 = int(det[0] * sx)
    y1 = int(det[1] * sy)
    x2 = int(det[2] * sx)
    y2 = int(det[3] * sy)
    cv2.rectangle(image, (x1, y1), (x2, y2), (0, 255, 0), 2)

cv2.putText(
    image,
    f"Crowd count: {crowd_count}",
    (10, 30),
    cv2.FONT_HERSHEY_SIMPLEX,
    1.0,
    (0, 255, 0),
    2,
)
cv2.imwrite("output_openvino.jpg", image)
```

### Try It on a Sample Image

The `export_and_quantize.sh` script downloads `test.jpg` automatically. Run the OpenVINO sample above.
The sample reads `test.jpg`, prints the crowd count to the console, and writes the annotated frame to `output_openvino.jpg`.

Expected console output:

```text
Detected persons: 4
```

`output_openvino.jpg` is the same image with a green bounding box drawn around each detected person and the text `Crowd count: 4` overlaid in the top-left corner.

> **Tip:** For production testing, replace the bundled `test.jpg` with an image
> from your target deployment site showing a representative crowd density.

#### Expected Output

![OpenVINO expected output](expected_output_openvino.jpg)

### DLStreamer Sample

The pipeline below runs the FP16 YOLO26 detector on the sample video via `gvadetect`, filters detections to the `person` class in a buffer probe using the DLStreamer Python bindings (`gstgva.VideoFrame`), overlays bounding boxes, and saves the annotated result to `output_dlstreamer.mp4`. It prints the crowd count for each frame in which at least one person is detected.

> **Notes on running this sample:**
>
> - Use the FP16 IR (`yolo26n_openvino_model/yolo26n.xml`).
>   On DLStreamer 2026.0.0, `gvadetect` cannot auto-derive a YOLO post-processor
>   from the INT8 model produced by the bundled script.
>   To use the INT8 model, supply a matching `model-proc` JSON.
> - Class names are read automatically from the model's embedded
>   `metadata.yaml` by DLStreamer 2026.0+ -- no external `labels-file` is
>   required.
> - Filtering with `object-class=person` directly on `gvadetect` is rejected
>   when `inference-region` is `full-frame` (the default), so the sample
>   filters by `region.label()` in the buffer probe instead.
> - Export `PYTHONPATH` so the DLStreamer Python module is importable:
>
> ```bash
> source /opt/intel/openvino_2026/setupvars.sh
> source /opt/intel/dlstreamer/scripts/setup_dls_env.sh
> export PYTHONPATH=/opt/intel/dlstreamer/python:\
> /opt/intel/dlstreamer/gstreamer/lib/python3/dist-packages:${PYTHONPATH:-}
> ```

```python
import gi

gi.require_version("Gst", "1.0")
gi.require_version("GstVideo", "1.0")
from gi.repository import Gst
from gstgva import VideoFrame

Gst.init(None)

INPUT_VIDEO = "test_video.mp4"

# For CPU: change device=GPU to device=CPU.
# For NPU: change device=GPU to device=NPU (batch-size=1, nireq=4 recommended).
pipeline_str = (
    f"filesrc location={INPUT_VIDEO} ! decodebin3 ! "
    "videoconvert ! "
    "gvadetect model=yolo26n_openvino_model/yolo26n.xml "
    "device=GPU "
    "threshold=0.4 ! queue ! "
    "gvawatermark ! videoconvert ! video/x-raw,format=I420 ! "
    "openh264enc ! h264parse ! "
    "mp4mux ! filesink name=sink location=output_dlstreamer.mp4"
)

pipeline = Gst.parse_launch(pipeline_str)
sink = pipeline.get_by_name("sink")
sink_pad = sink.get_static_pad("sink")


def on_buffer(pad, info):
    buf = info.get_buffer()
    caps = pad.get_current_caps()
    frame = VideoFrame(buf, caps=caps)
    crowd_count = sum(1 for r in frame.regions() if r.label() == "person")
    if crowd_count:
        print(f"Crowd count: {crowd_count}", flush=True)
    return Gst.PadProbeReturn.OK


sink_pad.add_probe(Gst.PadProbeType.BUFFER, on_buffer)

pipeline.set_state(Gst.State.PLAYING)
bus = pipeline.get_bus()
bus.timed_pop_filtered(
    Gst.CLOCK_TIME_NONE,
    Gst.MessageType.EOS | Gst.MessageType.ERROR,
)
pipeline.set_state(Gst.State.NULL)
```

#### Expected Output

![DLStreamer expected output](expected_output_dlstreamer.gif)

**Device targets:**

- `device=GPU` -- default in the sample code.
- `device=CPU` -- change `device=GPU` to `device=CPU`.
- `device=NPU` -- change `device=GPU` to `device=NPU`; use `batch-size=1` and `nireq=4` for best NPU utilization.

---

## License

Copyright (C) Intel Corporation.
All rights reserved.
Licensed under the MIT License. See [LICENSE](LICENSE) for details.

## References

- [YOLO26 Documentation](https://docs.ultralytics.com/models/yolo26/)
- [OpenVINO YOLO26 Notebook](https://github.com/openvinotoolkit/openvino_notebooks/blob/latest/notebooks/yolov26-optimization/yolov26-object-detection.ipynb)
- [COCO Dataset](https://cocodataset.org/)
- [OpenVINO Documentation](https://docs.openvino.ai/)
- [NNCF Post-Training Quantization](https://docs.openvino.ai/latest/nncf_ptq_introduction.html)
- [Intel DLStreamer](https://docs.openedgeplatform.intel.com/2026.0/edge-ai-libraries/dlstreamer/index.html)