Crowd Detection

Property	Value
Category	Object Detection (Crowd / Person Counting)
Base Model	YOLO26 (Ultralytics)
Source Framework	PyTorch (Ultralytics)
Supported Precisions	FP32, FP16, INT8 (mixed-precision)
Inference Engine	OpenVINO
Hardware	CPU, GPU, NPU
Detected Class	`person` (COCO class 0)

Overview

Crowd Detection is a Metro Analytics use case that detects and counts people in video streams to estimate occupancy and identify crowd build-up. It is built on YOLO26, a real-time object detector trained on the COCO dataset. The model is exported to OpenVINO IR (FP16 by default, optionally quantized to INT8), and its detections are filtered at runtime to the person class. Typical Metro deployments include:

Platform Occupancy -- count waiting passengers on station platforms.
Entry / Exit Flow -- monitor pedestrian throughput at gates and turnstiles.
Crowd Build-up Alerts -- trigger notifications when person counts cross a threshold.
Public Safety Analytics -- support situational awareness in transit hubs and venues.
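
As an illustration of the crowd build-up alert idea, the sketch below turns per-frame person counts into alert events. It uses two thresholds (hysteresis) so that a count hovering around the limit does not fire repeatedly. The class name `CrowdAlert` and the threshold values are hypothetical and are not part of this model or its scripts.

```python
from dataclasses import dataclass


@dataclass
class CrowdAlert:
    """Raise an alert when the count reaches `high`; clear it at or below `low`."""

    high: int
    low: int
    active: bool = False

    def update(self, count: int) -> bool:
        """Feed one per-frame count; return True only on the frame the alert starts."""
        if not self.active and count >= self.high:
            self.active = True
            return True
        if self.active and count <= self.low:
            self.active = False
        return False


# Hypothetical thresholds: alert at 20 people, re-arm once the count drops to 15.
alert = CrowdAlert(high=20, low=15)
events = [alert.update(c) for c in [10, 19, 21, 22, 18, 21, 14, 25]]
print(events)
```

The alert fires once when the count first reaches 20, stays silent while the count remains above 15, and can fire again only after the count has dropped to 15 or lower.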

Available variants: yolo26n, yolo26s, yolo26m, yolo26l, yolo26x. Smaller variants (yolo26n, yolo26s) are recommended for high-FPS edge deployment; larger variants improve recall in dense crowds.

Prerequisites

Python 3.11+
Install OpenVINO (latest version)
Install Intel DLStreamer

Create and activate a Python virtual environment before running the scripts:

python3 -m venv .venv --system-site-packages
source .venv/bin/activate

Note: The --system-site-packages flag is required so the virtual environment can access the system-installed OpenVINO and DLStreamer Python packages.
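
A quick way to confirm that the virtual environment can see the system packages is to check whether the modules used by the samples can be located. This is a minimal sketch; the helper name `missing_modules` is hypothetical, and `gstgva` is typically found only after PYTHONPATH has been exported as described in the DLStreamer Sample section.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found on the current path."""
    return [n for n in names if importlib.util.find_spec(n) is None]


# "openvino" and "gstgva" are the top-level modules imported by the samples.
missing = missing_modules(["openvino", "gstgva"])
print("Missing:", ", ".join(missing) if missing else "none")
```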

Getting Started

Download and Quantize Model

Run the provided script to download, export to OpenVINO IR, and optionally quantize:

chmod +x export_and_quantize.sh
./export_and_quantize.sh

This exports the default yolo26n model in FP16 precision.

Optional: Select a Different Variant or Precision

./export_and_quantize.sh yolo26n FP32   # full-precision
./export_and_quantize.sh yolo26n INT8   # quantized
./export_and_quantize.sh yolo26s        # larger variant, default FP16

Replace yolo26n with any variant (yolo26s, yolo26m, yolo26l, yolo26x). The second argument selects the precision (FP32, FP16, INT8); the default is FP16.

The script performs the following steps:

Installs dependencies (openvino, ultralytics; adds nncf for INT8).
Downloads a sample test image (test.jpg) and a sample test video (test_video.mp4).
Downloads the PyTorch weights and exports to OpenVINO IR.
(INT8 only) Quantizes the model using NNCF post-training quantization.
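
The internals of export_and_quantize.sh are not shown here, so the following is only a sketch of how the download and export step could look with the Ultralytics Python API. The weights file name (`<variant>.pt`) and both helper names are assumptions; INT8 is excluded on purpose because the script handles it in a separate NNCF step.

```python
def export_kwargs(precision: str = "FP16") -> dict:
    """Map a precision name to keyword arguments for Ultralytics `YOLO.export`."""
    precision = precision.upper()
    if precision not in ("FP32", "FP16"):
        raise ValueError(f"unsupported precision for direct export: {precision}")
    # half=True asks the exporter for FP16 weights; False keeps FP32.
    return {"format": "openvino", "half": precision == "FP16"}


def export_variant(variant: str = "yolo26n", precision: str = "FP16") -> str:
    """Download the weights if needed and export them to an OpenVINO IR directory."""
    from ultralytics import YOLO  # imported lazily; needs `pip install ultralytics`

    # The weights file name is an assumption based on the variant names above.
    return YOLO(f"{variant}.pt").export(**export_kwargs(precision))


print(export_kwargs("FP16"))
```

Calling `export_variant("yolo26s", "FP32")` would then download the weights and write the IR directory; it is left out of the block because it needs network access.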

Output files:

yolo26n_openvino_model/ -- FP32 or FP16 OpenVINO IR model directory.
yolo26n_crowd_int8.xml / yolo26n_crowd_int8.bin -- INT8 quantized model (only when INT8 is selected).

Precision / Device Compatibility

Precision	CPU	GPU	NPU
FP32	Yes	Yes	No
FP16	Yes	Yes	Yes
INT8	Yes	Yes	Yes

Note: The INT8 calibration uses the bundled sample image. For production accuracy, replace it with a representative set of frames from the target deployment site.
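
To act on the note above, calibration can be repeated with frames from the deployment site. The sketch below uses NNCF post-training quantization with the same preprocessing as the OpenVINO sample. The `calibration_frames/` directory and both function names are hypothetical, and the bundled script may use different settings.

```python
import glob


def calibration_paths(pattern: str, limit: int = 300) -> list:
    """Return up to `limit` sorted image paths matching a glob pattern."""
    return sorted(glob.glob(pattern))[:limit]


def quantize_with_frames(fp_xml: str, int8_xml: str, image_paths: list, size: int = 640):
    """Quantize a floating-point IR to INT8 using site-specific calibration frames."""
    if not image_paths:
        raise ValueError("no calibration images found")

    # Heavy dependencies are imported lazily so the helpers above stay lightweight.
    import cv2
    import nncf
    import numpy as np
    import openvino as ov

    def preprocess(path):
        # Same preprocessing as the inference sample: resize, RGB, [0, 1], NCHW.
        img = cv2.imread(path)
        if img is None:
            raise FileNotFoundError(path)
        img = cv2.resize(img, (size, size))
        img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB).astype(np.float32) / 255.0
        return img.transpose(2, 0, 1)[np.newaxis, ...]

    model = ov.Core().read_model(fp_xml)
    dataset = nncf.Dataset(image_paths, preprocess)
    quantized = nncf.quantize(model, dataset, subset_size=len(image_paths))
    ov.save_model(quantized, int8_xml)


# Example call (hypothetical directory of frames captured on site):
# quantize_with_frames("yolo26n_openvino_model/yolo26n.xml",
#                      "yolo26n_crowd_int8.xml",
#                      calibration_paths("calibration_frames/*.jpg"))
```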

OpenVINO Sample

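The sample below runs YOLO26 inference on a single image, keeps detections of the person class above a confidence threshold, and reports the crowd count. No separate non-maximum suppression step is applied because the exported YOLO26 model is end-to-end.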

import cv2
import numpy as np
import openvino as ov

PERSON_CLASS_ID = 0
CONF_THRESHOLD = 0.4
INPUT_SIZE = 640

core = ov.Core()
model = core.read_model("yolo26n_openvino_model/yolo26n.xml")
compiled = core.compile_model(model, "CPU")  # or "GPU", "NPU"

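image = cv2.imread("test.jpg")
if image is None:
    raise FileNotFoundError("test.jpg not found; run export_and_quantize.sh first")
h0, w0 = image.shape[:2]

# Preprocess: plain resize (no letterboxing) for simplicity; aspect ratio is not preserved.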
blob = cv2.resize(image, (INPUT_SIZE, INPUT_SIZE))
blob = cv2.cvtColor(blob, cv2.COLOR_BGR2RGB).astype(np.float32) / 255.0
blob = blob.transpose(2, 0, 1)[np.newaxis, ...]  # NCHW

# YOLO26 end-to-end output: [1, 300, 6] = [x1, y1, x2, y2, confidence, class_id]
# No NMS is needed -- YOLO26 is natively end-to-end.
output = compiled([blob])[compiled.output(0)][0]
mask = (output[:, 4] >= CONF_THRESHOLD) & (output[:, 5].astype(int) == PERSON_CLASS_ID)
dets = output[mask]

sx, sy = w0 / INPUT_SIZE, h0 / INPUT_SIZE
crowd_count = len(dets)
print(f"Detected persons: {crowd_count}")

for det in dets:
    x1 = int(det[0] * sx)
    y1 = int(det[1] * sy)
    x2 = int(det[2] * sx)
    y2 = int(det[3] * sy)
    cv2.rectangle(image, (x1, y1), (x2, y2), (0, 255, 0), 2)

cv2.putText(
    image, f"Crowd count: {crowd_count}", (10, 30),
    cv2.FONT_HERSHEY_SIMPLEX, 1.0, (0, 255, 0), 2,
)
cv2.imwrite("output_openvino.jpg", image)
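
The sample above resizes the image without letterboxing, which stretches frames that are not square. If aspect ratio matters, one option is to letterbox instead: scale the image to fit, pad the remainder, and undo both steps on the predicted boxes. The sketch below shows only the coordinate arithmetic; the resize and padding of the image itself are omitted, and both function names are hypothetical.

```python
def letterbox_params(w0: int, h0: int, size: int = 640):
    """Return (scale, pad_x, pad_y) that fit a w0 x h0 image into a size x size canvas."""
    scale = min(size / w0, size / h0)
    new_w, new_h = round(w0 * scale), round(h0 * scale)
    # Padding is split evenly between the two sides of each axis.
    pad_x = (size - new_w) / 2
    pad_y = (size - new_h) / 2
    return scale, pad_x, pad_y


def to_original(box, scale, pad_x, pad_y):
    """Map an (x1, y1, x2, y2) box from letterboxed input pixels to original pixels."""
    x1, y1, x2, y2 = box
    return (
        (x1 - pad_x) / scale,
        (y1 - pad_y) / scale,
        (x2 - pad_x) / scale,
        (y2 - pad_y) / scale,
    )


scale, pad_x, pad_y = letterbox_params(1280, 720)
full_frame = to_original((0, 140, 640, 500), scale, pad_x, pad_y)
print(scale, pad_x, pad_y, full_frame)
```

For a 1280x720 frame the scale is 0.5 with 140 pixels of padding above and below, so a box covering the unpadded area of the 640x640 input maps back to the full original frame.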

Try It on a Sample Image

The export_and_quantize.sh script downloads test.jpg automatically. Run the OpenVINO sample above: it reads test.jpg, prints the crowd count to the console, and writes the annotated frame to output_openvino.jpg.

Expected console output:

Detected persons: 4

output_openvino.jpg is the same image with a green bounding box drawn around each detected person and the text Crowd count: 4 overlaid in the top-left corner.

Tip: For production testing, replace the bundled test.jpg with an image from your target deployment site showing a representative crowd density.

DLStreamer Sample

The pipeline below runs the FP16 YOLO26 detector on the sample video via gvadetect, filters detections to the person class in a buffer probe using the DLStreamer Python bindings (gstgva.VideoFrame), overlays bounding boxes, saves the annotated result to output_dlstreamer.mp4, and prints the crowd count per frame.

Notes on running this sample:

Use the FP16 IR (yolo26n_openvino_model/yolo26n.xml). On DLStreamer 2026.0.0, gvadetect cannot auto-derive a YOLO post-processor from the INT8 model produced by the bundled script. To use the INT8 model, supply a matching model-proc JSON.

Class names are read automatically from the model's embedded metadata.yaml by DLStreamer 2026.0+ -- no external labels-file is required.

Filtering with object-class=person directly on gvadetect is rejected when inference-region is full-frame (the default), so the sample filters by region.label() in the buffer probe instead.

Set up the environment and export PYTHONPATH so the DLStreamer Python module (gstgva) is importable:

source /opt/intel/openvino_2026/setupvars.sh
source /opt/intel/dlstreamer/scripts/setup_dls_env.sh
export PYTHONPATH=/opt/intel/dlstreamer/python:\
/opt/intel/dlstreamer/gstreamer/lib/python3/dist-packages:${PYTHONPATH:-}

import gi

gi.require_version("Gst", "1.0")
gi.require_version("GstVideo", "1.0")
from gi.repository import Gst
from gstgva import VideoFrame

Gst.init(None)

INPUT_VIDEO = "test_video.mp4"

# For CPU: change device=GPU to device=CPU.
# For NPU: change device=GPU to device=NPU (batch-size=1, nireq=4 recommended).
pipeline_str = (
    f"filesrc location={INPUT_VIDEO} ! decodebin3 ! "
    "videoconvert ! "
    "gvadetect model=yolo26n_openvino_model/yolo26n.xml "
    "device=GPU "
    "threshold=0.4 ! queue ! "
    "gvawatermark ! videoconvert ! video/x-raw,format=I420 ! "
    "openh264enc ! h264parse ! "
    "mp4mux ! filesink name=sink location=output_dlstreamer.mp4"
)
pipeline = Gst.parse_launch(pipeline_str)

sink = pipeline.get_by_name("sink")
sink_pad = sink.get_static_pad("sink")


def on_buffer(pad, info):
    buf = info.get_buffer()
    caps = pad.get_current_caps()
    frame = VideoFrame(buf, caps=caps)
    crowd_count = sum(1 for r in frame.regions() if r.label() == "person")
    if crowd_count:
        print(f"Crowd count: {crowd_count}", flush=True)
    return Gst.PadProbeReturn.OK


sink_pad.add_probe(Gst.PadProbeType.BUFFER, on_buffer)

pipeline.set_state(Gst.State.PLAYING)
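bus = pipeline.get_bus()
# Block until the stream ends or an error is posted on the bus.
msg = bus.timed_pop_filtered(
    Gst.CLOCK_TIME_NONE,
    Gst.MessageType.EOS | Gst.MessageType.ERROR,
)
if msg is not None and msg.type == Gst.MessageType.ERROR:
    err, debug = msg.parse_error()
    print(f"Pipeline error: {err.message} ({debug})", flush=True)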
pipeline.set_state(Gst.State.NULL)
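
Per-frame counts can flicker when a detection drops out for a frame or two. One possible way to stabilize the number before using it for occupancy estimates is a moving median over recent frames, as sketched below. The class `CountSmoother` and the window size are hypothetical; in the sample above it would be fed from `on_buffer` with every frame's count, including zeros.

```python
from collections import deque
from statistics import median


class CountSmoother:
    """Median of the most recent `window` per-frame counts."""

    def __init__(self, window: int = 15):
        self.counts = deque(maxlen=window)

    def update(self, count: int) -> float:
        """Add one per-frame count and return the median of the current window."""
        self.counts.append(count)
        return median(self.counts)


# A single dropped frame (count 0) does not change the smoothed value.
smoother = CountSmoother(window=3)
smoothed = [smoother.update(c) for c in [4, 4, 0, 5, 4]]
print(smoothed)
```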

Device targets:

device=GPU -- default in the sample code.
device=CPU -- change device=GPU to device=CPU.
device=NPU -- change device=GPU to device=NPU; use batch-size=1 and nireq=4 for best NPU utilization.
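
The device switch above can be wrapped in a small helper that builds the gvadetect fragment of the pipeline string and adds the NPU settings only when needed. This is a sketch; the function name `gvadetect_element` is hypothetical, while the element properties are the ones already named in this section.

```python
def gvadetect_element(model_xml: str, device: str = "GPU", threshold: float = 0.4) -> str:
    """Build the gvadetect fragment of the pipeline string for a target device."""
    device = device.upper()
    if device not in ("CPU", "GPU", "NPU"):
        raise ValueError(f"unknown device: {device}")
    parts = [f"gvadetect model={model_xml}", f"device={device}", f"threshold={threshold}"]
    if device == "NPU":
        # Settings suggested above for NPU utilization.
        parts += ["batch-size=1", "nireq=4"]
    return " ".join(parts)


element = gvadetect_element("yolo26n_openvino_model/yolo26n.xml", "NPU")
print(element)
```

The returned string can replace the hard-coded gvadetect lines in `pipeline_str`, followed by `" ! queue ! "` as in the sample.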

License

References
