Sync object-detection from metro-analytics-catalog

Files changed (3)

LICENSE +45 -0
README.md +272 -5
export_and_quantize.sh +90 -0

LICENSE CHANGED

	@@ -0,0 +1,45 @@

+This directory contains two categories of content under different licenses.
+Scripts and Documentation
+-------------------------
+The scripts (export_and_quantize.sh) and documentation (README.md) in this
+directory are original works by Intel Corporation, licensed under the
+MIT License.
+    Copyright (C) Intel Corporation
+    Permission is hereby granted, free of charge, to any person obtaining a copy
+    of this software and associated documentation files (the "Software"), to deal
+    in the Software without restriction, including without limitation the rights
+    to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+    copies of the Software, and to permit persons to whom the Software is
+    furnished to do so, subject to the following conditions:
+    The above copyright notice and this permission notice shall be included in
+    all copies or substantial portions of the Software.
+    THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+    IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+    FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+    AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+    LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+    OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+    THE SOFTWARE.
+YOLO26 Model
+------------
+The YOLO26 model weights and the Ultralytics framework are developed by
+Ultralytics and licensed under the GNU Affero General Public License v3.0
+(AGPL-3.0).
+    Source:  https://github.com/ultralytics/ultralytics
+    License: https://github.com/ultralytics/ultralytics/blob/main/LICENSE
+    Docs:    https://docs.ultralytics.com/models/yolo26/
+Users must comply with the AGPL-3.0 license terms when using, modifying,
+or distributing the YOLO26 model weights or Ultralytics software.
+For commercial licensing options, see https://www.ultralytics.com/license.

README.md CHANGED

@@ -1,5 +1,272 @@
----
-license: other
-license_name: other
-license_link: LICENSE
----

+# Object Detection
+> **Validated with:** OpenVINO 2026.1.0, NNCF 3.0.0, DLStreamer 2026.0, Ultralytics 8.3.0, Python 3.11+
+| Property | Value |
+|---|---|
+| **Category** | General Object Detection (80-class COCO) |
+| **Base Model** | [YOLO26](https://docs.ultralytics.com/models/yolo26/) (Ultralytics) |
+| **Source Framework** | PyTorch (Ultralytics) |
+| **Supported Precisions** | FP32, FP16, FP16-INT8 |
+| **Inference Engine** | OpenVINO |
+| **Hardware** | CPU, GPU, NPU |
+| **Detected Class(es)** | All 80 COCO classes |
+---
+## Overview
+Object Detection is a Metro Analytics use case that detects and classifies objects across the full 80-class COCO taxonomy (person, vehicle, animal, everyday objects, etc.).
+It is built on [YOLO26](https://docs.ultralytics.com/models/yolo26/), a state-of-the-art real-time object detector, exported to OpenVINO IR (FP16 by default) and optionally quantized to INT8 for efficient inference on Intel hardware.
+Unlike the specialized person or vehicle detectors, this model keeps all 80 classes active, making it suitable for general-purpose scene understanding.
+Typical Metro deployments include:
+- **Scene Understanding** -- identify and classify all objects visible in a camera feed.
+- **Inventory Monitoring** -- detect specific items (bags, suitcases, bottles) on platforms.
+- **Anomaly Detection** -- flag unexpected objects in restricted areas.
+- **Multi-Class Analytics** -- gather statistics across people, vehicles, and other categories.
+Available variants: `yolo26n`, `yolo26s`, `yolo26m`, `yolo26l`, `yolo26x`.
+Smaller variants (`yolo26n`, `yolo26s`) are recommended for high-FPS edge deployment; larger variants improve recall for small objects.
+---
+## Prerequisites
+- Python 3.11+
+- [Install OpenVINO](https://docs.openvino.ai/2026/get-started/install-openvino.html) (latest version)
+- [Install Intel DLStreamer](https://docs.openedgeplatform.intel.com/2026.0/edge-ai-libraries/dlstreamer/get_started/install/install_guide_ubuntu.html) (latest version)
+Create and activate a Python virtual environment before running the scripts:
+```bash
+python3 -m venv .venv --system-site-packages
+source .venv/bin/activate
+```
+---
+## Getting Started
+### Download and Quantize Model
+Run the provided script to download, export to OpenVINO IR, and optionally quantize:
+```bash
+chmod +x export_and_quantize.sh
+./export_and_quantize.sh yolo26n        # default: FP16
+./export_and_quantize.sh yolo26n FP32   # full-precision
+./export_and_quantize.sh yolo26n INT8   # quantized
+```
+Replace `yolo26n` with any variant (`yolo26s`, `yolo26m`, `yolo26l`, `yolo26x`).
+The second argument selects the precision (`FP32`, `FP16`, `INT8`); the default is **FP16**.
+The script performs the following steps:
+1. Installs dependencies (`openvino`, `ultralytics`; adds `nncf` for INT8).
+2. Downloads a sample test image (`test.jpg`).
+3. Downloads the PyTorch weights and exports to OpenVINO IR.
+4. *(INT8 only)* Quantizes the model using NNCF post-training quantization.
+Output files:
+- `yolo26n_openvino_model/` -- FP32 or FP16 OpenVINO IR model directory.
+- `yolo26n_objdet_int8.xml` / `yolo26n_objdet_int8.bin` -- INT8 quantized model *(only when `INT8` is selected)*.
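A quick way to confirm that an export finished is to check for the files listed above. The sketch below is not part of `export_and_quantize.sh`; `expected_outputs` and `missing_outputs` are hypothetical helper names, and the layout is assumed from the output list (with the `.bin` weights file sitting next to `<variant>.xml` inside the IR directory).

```python
from pathlib import Path


def expected_outputs(variant="yolo26n", precision="FP16"):
    """Return the files the export script is expected to leave behind."""
    ir_dir = Path(f"{variant}_openvino_model")
    files = [ir_dir / f"{variant}.xml", ir_dir / f"{variant}.bin"]
    if precision.upper() == "INT8":
        # The INT8 model is saved beside the IR directory, not inside it.
        files += [
            Path(f"{variant}_objdet_int8.xml"),
            Path(f"{variant}_objdet_int8.bin"),
        ]
    return files


def missing_outputs(variant="yolo26n", precision="FP16", root="."):
    """List the expected files that are absent under root."""
    return [
        path
        for path in expected_outputs(variant, precision)
        if not (Path(root) / path).is_file()
    ]
```

An empty list from `missing_outputs("yolo26n", "INT8")` suggests that both the IR export and the quantization step produced their files.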
+#### Precision / Device Compatibility
+| Precision | CPU | GPU | NPU |
+|---|---|---|---|
+| FP32 | Yes | Yes | No |
+| FP16 | Yes | Yes | Yes |
+| INT8 | Yes | Yes | Yes |
+> **Note:** For production accuracy, replace the random calibration tensors in
+> `export_and_quantize.sh` with a representative sample of frames from the
+> target deployment site.
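One way to follow the note above is to feed NNCF real frames instead of random tensors. The sketch below is a minimal example under stated assumptions: the directory name `calibration_frames/` and the helper names are hypothetical, and the preprocessing simply mirrors the OpenVINO sample in this README (plain resize to 640x640, RGB, scaled to 0..1, NCHW). `nncf.Dataset(data_source, transform_func)` is the same call the script already uses.

```python
from pathlib import Path

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def list_calibration_images(frames_dir, limit=300):
    """Collect up to `limit` image paths, sorted for reproducibility."""
    paths = sorted(
        p for p in Path(frames_dir).iterdir()
        if p.suffix.lower() in IMAGE_SUFFIXES
    )
    return paths[:limit]


def transform_fn(image_path):
    """Turn one image path into a 1x3x640x640 float32 tensor."""
    # Imported lazily so the path helper works without OpenCV installed.
    import cv2
    import numpy as np

    image = cv2.imread(str(image_path))
    if image is None:
        raise ValueError(f"Could not read {image_path}")
    blob = cv2.resize(image, (640, 640))
    blob = cv2.cvtColor(blob, cv2.COLOR_BGR2RGB).astype(np.float32) / 255.0
    return blob.transpose(2, 0, 1)[np.newaxis, ...]  # NCHW


def build_calibration_dataset(frames_dir="calibration_frames", limit=300):
    """Wrap the frames in an nncf.Dataset for nncf.quantize()."""
    import nncf

    return nncf.Dataset(list_calibration_images(frames_dir, limit), transform_fn)
```

The returned dataset would take the place of `calibration_dataset` in the quantization step; `subset_size` should not exceed the number of frames actually collected.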
+### OpenVINO Sample
+The sample below runs YOLO26 inference on all 80 COCO classes and prints every detected object with its class name and confidence.
+YOLO26 is end-to-end (NMS-free), so no manual non-maximum suppression is needed.
+To run on GPU or NPU, change the device name passed to `core.compile_model` (`"CPU"` in the sample).
+```python
+import cv2
+import numpy as np
+import openvino as ov
+COCO_NAMES = [
+    "person","bicycle","car","motorcycle","airplane","bus","train","truck",
+    "boat","traffic light","fire hydrant","stop sign","parking meter","bench",
+    "bird","cat","dog","horse","sheep","cow","elephant","bear","zebra",
+    "giraffe","backpack","umbrella","handbag","tie","suitcase","frisbee",
+    "skis","snowboard","sports ball","kite","baseball bat","baseball glove",
+    "skateboard","surfboard","tennis racket","bottle","wine glass","cup",
+    "fork","knife","spoon","bowl","banana","apple","sandwich","orange",
+    "broccoli","carrot","hot dog","pizza","donut","cake","chair","couch",
+    "potted plant","bed","dining table","toilet","tv","laptop","mouse",
+    "remote","keyboard","cell phone","microwave","oven","toaster","sink",
+    "refrigerator","book","clock","vase","scissors","teddy bear","hair drier",
+    "toothbrush",
+]
+CONF_THRESHOLD = 0.4
+INPUT_SIZE = 640
+core = ov.Core()
+model = core.read_model("yolo26n_openvino_model/yolo26n.xml")
+# Change device to "GPU" or "NPU" to run on integrated GPU or NPU.
+compiled = core.compile_model(model, "CPU")
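+image = cv2.imread("test.jpg")
+if image is None:
+    raise FileNotFoundError("test.jpg not found; run export_and_quantize.sh first")
+h0, w0 = image.shape[:2]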
+blob = cv2.resize(image, (INPUT_SIZE, INPUT_SIZE))
+blob = cv2.cvtColor(blob, cv2.COLOR_BGR2RGB).astype(np.float32) / 255.0
+blob = blob.transpose(2, 0, 1)[np.newaxis, ...]  # NCHW
+# YOLO26 end-to-end output: [1, 300, 6] = [x1, y1, x2, y2, confidence, class_id]
+output = compiled([blob])[compiled.output(0)][0]
+mask = output[:, 4] >= CONF_THRESHOLD
+dets = output[mask]
+sx, sy = w0 / INPUT_SIZE, h0 / INPUT_SIZE
+print(f"Total detections: {len(dets)}")
+colors = np.random.RandomState(42).randint(0, 255, (80, 3)).tolist()
+for det in dets:
+    x1 = int(det[0] * sx)
+    y1 = int(det[1] * sy)
+    x2 = int(det[2] * sx)
+    y2 = int(det[3] * sy)
+    cid = int(det[5])
+    conf = float(det[4])
+    label = f"{COCO_NAMES[cid]} {conf:.2f}"
+    color = colors[cid]
+    cv2.rectangle(image, (x1, y1), (x2, y2), color, 2)
+    # Keep the label inside the frame when the box touches the top edge.
+    cv2.putText(image, label, (x1, max(y1 - 5, 15)),
+                cv2.FONT_HERSHEY_SIMPLEX, 0.6, color, 2)
+    print(f"  {label} at ({x1},{y1})-({x2},{y2})")
+cv2.imwrite("output.jpg", image)
+```
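The sample maps box corners from the 640x640 model input back to the source image by multiplying with `w0 / INPUT_SIZE` and `h0 / INPUT_SIZE`. The sketch below isolates that step and additionally clamps the result to the image bounds, which the sample does not do. `scale_box` is a hypothetical helper name, and the plain-resize (non-letterboxed) mapping is assumed from the sample.

```python
def scale_box(det, src_w, src_h, input_size=640):
    """Map an (x1, y1, x2, y2, ...) row from model space to source pixels."""
    sx, sy = src_w / input_size, src_h / input_size
    x1, y1, x2, y2 = det[:4]

    def clamp(value, upper):
        # Truncate to int, then keep the coordinate inside [0, upper].
        return max(0, min(int(value), upper))

    return (
        clamp(x1 * sx, src_w - 1),
        clamp(y1 * sy, src_h - 1),
        clamp(x2 * sx, src_w - 1),
        clamp(y2 * sy, src_h - 1),
    )
```

For an 810x1080 source image, a detection that spans the whole model input, `scale_box((0, 0, 640, 640, 0.9, 5), 810, 1080)`, maps to `(0, 0, 809, 1079)`.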
+**Device targets:**
+- `"CPU"` -- default, works on all Intel platforms.
+- `"GPU"` -- Intel integrated or discrete GPU.
+- `"NPU"` -- Intel NPU (validate with `benchmark_app -d NPU`).
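The device targets and the precision/device compatibility table can be combined into a small selection helper. This is a sketch, not part of the sample: `pick_device` is a hypothetical name, the NPU > GPU > CPU preference order is an arbitrary assumption, and the support matrix is copied from the table in this README.

```python
# Precision/device support as listed in the compatibility table.
SUPPORTED = {
    "FP32": {"CPU", "GPU"},
    "FP16": {"CPU", "GPU", "NPU"},
    "INT8": {"CPU", "GPU", "NPU"},
}


def pick_device(available, precision="FP16", preferred=("NPU", "GPU", "CPU")):
    """Return the first preferred device that is present and supports precision."""
    # Several GPUs may be reported as "GPU.0", "GPU.1"; compare the base name.
    present = {name.split(".")[0] for name in available}
    for device in preferred:
        if device in present and device in SUPPORTED[precision.upper()]:
            return device
    raise RuntimeError(f"No device for {precision}; available: {sorted(present)}")
```

With OpenVINO installed, the list of devices can come from `ov.Core().available_devices`, for example `core.compile_model(model, pick_device(core.available_devices, "FP16"))`.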
+### Try It on a Sample Image
+The `export_and_quantize.sh` script downloads `test.jpg` automatically.
+Run the OpenVINO sample above.
+It reads `test.jpg`, prints each detected object to the console, and writes the annotated frame to `output.jpg`.
+Expected console output (representative):
+```text
+Total detections: 5
+  person 0.92 at (49,396)-(236,904)
+  bus 0.92 at (0,229)-(804,744)
+  person 0.91 at (670,393)-(809,880)
+  person 0.90 at (223,403)-(345,862)
+  person 0.50 at (0,553)-(68,869)
+```
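For the multi-class analytics use case, the console lines above can be tallied per class. The sketch below parses the exact line format printed by the OpenVINO sample (`<label> <confidence> at (x1,y1)-(x2,y2)`); `count_classes` is a hypothetical helper, and `SAMPLE` repeats the representative output shown above.

```python
import re
from collections import Counter

# Matches lines such as "  person 0.92 at (49,396)-(236,904)".
LINE = re.compile(r"^\s*(?P<label>.+?) (?P<conf>\d\.\d{2}) at \(")

SAMPLE = """Total detections: 5
  person 0.92 at (49,396)-(236,904)
  bus 0.92 at (0,229)-(804,744)
  person 0.91 at (670,393)-(809,880)
  person 0.90 at (223,403)-(345,862)
  person 0.50 at (0,553)-(68,869)
"""


def count_classes(console_text, min_conf=0.0):
    """Count detections per class label, ignoring non-detection lines."""
    counts = Counter()
    for line in console_text.splitlines():
        match = LINE.match(line)
        if match and float(match["conf"]) >= min_conf:
            counts[match["label"]] += 1
    return counts
```

On the sample text this yields four `person` entries and one `bus`; raising `min_conf` to `0.9` drops the low-confidence person at the left edge of the frame.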
+### DLStreamer Sample
+The pipeline below runs the FP16 YOLO26 detector on a single image via
+`gvadetect`, overlays bounding boxes, and prints all detections.
+> **Notes on running this sample:**
+>
+> - Use the FP16 IR (`yolo26n_openvino_model/yolo26n.xml`). Class names are
+>   read automatically from the model's embedded `metadata.yaml` by
+>   DLStreamer 2026.0+ -- no external `labels-file` is required.
+> - Export `PYTHONPATH` so the DLStreamer Python module is importable:
+>
+>   ```bash
+>   source /opt/intel/openvino_2026/setupvars.sh
+>   source /opt/intel/dlstreamer/scripts/setup_dls_env.sh
+>   export PYTHONPATH=/opt/intel/dlstreamer/python:\
+>   /opt/intel/dlstreamer/gstreamer/lib/python3/dist-packages:${PYTHONPATH:-}
+>   ```
+**Image-based quick test** (uses `filesrc` with a single JPEG):
+```python
+import gi
+gi.require_version("Gst", "1.0")
+gi.require_version("GstVideo", "1.0")
+from gi.repository import Gst
+from gstgva import VideoFrame
+Gst.init(None)
+# For GPU: change device=CPU to device=GPU; in decodebin-based video pipelines, add vapostproc after decodebin.
+# For NPU: change device=CPU to device=NPU (batch-size=1 recommended).
+pipeline_str = (
+    "filesrc location=test.jpg ! jpegdec ! videoconvert ! "
+    "video/x-raw,format=BGR ! "
+    "gvadetect model=yolo26n_openvino_model/yolo26n.xml "
+    "device=CPU threshold=0.4 ! queue ! "
+    "gvawatermark ! videoconvert ! jpegenc ! filesink name=sink location=output.jpg"
+)
+pipeline = Gst.parse_launch(pipeline_str)
+def on_buffer(pad, info):
+    buf = info.get_buffer()
+    caps = pad.get_current_caps()
+    frame = VideoFrame(buf, caps=caps)
+    for region in frame.regions():
+        print(f"  {region.label()} at ({region.rect().x},{region.rect().y})",
+              flush=True)
+    return Gst.PadProbeReturn.OK
+it = pipeline.iterate_elements()
+while True:
+    result, elem = it.next()
+    # Stop on DONE, ERROR, or RESYNC; only OK carries a valid element.
+    if result != Gst.IteratorResult.OK:
+        break
+    if elem.get_factory() and elem.get_factory().get_name() == "gvawatermark":
+        pad = elem.get_static_pad("src")
+        pad.add_probe(Gst.PadProbeType.BUFFER, on_buffer)
+        break
+pipeline.set_state(Gst.State.PLAYING)
+bus = pipeline.get_bus()
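+msg = bus.timed_pop_filtered(
+    Gst.CLOCK_TIME_NONE,
+    Gst.MessageType.EOS | Gst.MessageType.ERROR,
+)
+# Report pipeline failures instead of exiting silently.
+if msg is not None and msg.type == Gst.MessageType.ERROR:
+    err, debug = msg.parse_error()
+    print(f"Pipeline error: {err.message} ({debug})")
+pipeline.set_state(Gst.State.NULL)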
+```
+**Device targets:**
+- `device=CPU` -- default in the sample code.
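+- `device=GPU` -- in `decodebin`-based video pipelines, add `vapostproc` after `decodebin` for zero-copy color conversion (the image sample above uses `jpegdec` and has no `decodebin`).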
+- `device=NPU` -- use `batch-size=1` and `nireq=4` for best NPU utilization.
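The device-specific settings above can be applied when assembling the pipeline description. The sketch below rebuilds the sample's pipeline string for a given device; `build_pipeline` is a hypothetical helper, and it only adds the `batch-size=1` and `nireq=4` settings listed for NPU. It does not insert `vapostproc`, since the image sample has no `decodebin` element to place it after.

```python
def build_pipeline(device="CPU", model="yolo26n_openvino_model/yolo26n.xml",
                   image="test.jpg", output="output.jpg", threshold=0.4):
    """Assemble the gst-launch style description used in the image sample."""
    detect = f"gvadetect model={model} device={device} threshold={threshold}"
    if device == "NPU":
        # Settings suggested in the device-target list for NPU.
        detect += " batch-size=1 nireq=4"
    return (
        f"filesrc location={image} ! jpegdec ! videoconvert ! "
        "video/x-raw,format=BGR ! "
        f"{detect} ! queue ! "
        f"gvawatermark ! videoconvert ! jpegenc ! "
        f"filesink name=sink location={output}"
    )
```

With the defaults, `build_pipeline()` reproduces the `pipeline_str` from the sample, so `Gst.parse_launch(build_pipeline("NPU"))` could replace the hard-coded string.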
+---
+## License
+Copyright (C) Intel Corporation. All rights reserved.
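+The scripts and documentation are licensed under the MIT License; the YOLO26 model weights and Ultralytics software are licensed under AGPL-3.0. See [LICENSE](LICENSE) for details.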
+## References
+- [YOLO26 Documentation](https://docs.ultralytics.com/models/yolo26/)
+- [OpenVINO YOLO26 Notebook](https://github.com/openvinotoolkit/openvino_notebooks/blob/latest/notebooks/yolov26-optimization/yolov26-object-detection.ipynb)
+- [COCO Dataset](https://cocodataset.org/)
+- [OpenVINO Documentation](https://docs.openvino.ai/)
+- [NNCF Post-Training Quantization](https://docs.openvino.ai/latest/nncf_ptq_introduction.html)
+- [Intel DLStreamer](https://docs.openedgeplatform.intel.com/2026.0/edge-ai-libraries/dlstreamer/index.html)

export_and_quantize.sh ADDED

	@@ -0,0 +1,90 @@

+#!/usr/bin/env bash
+# SPDX-License-Identifier: MIT
+# Copyright (C) Intel Corporation
+#
+# Export a YOLO26 general-purpose object detector to OpenVINO IR and
+# optionally quantize it to INT8 with NNCF.
+# Usage: ./export_and_quantize.sh [MODEL_VARIANT] [PRECISION]
+# Example: ./export_and_quantize.sh yolo26n FP16
+#
+# Supported precisions:
+#   FP32  -- Full-precision floating-point weights
+#   FP16  -- Half-precision floating-point weights (default)
+#   INT8  -- Quantized 8-bit integer weights (requires NNCF)
+#
+# Precision / device compatibility:
+#   | Precision | CPU | GPU | NPU |
+#   |-----------|-----|-----|-----|
+#   | FP32      | Yes | Yes | No  |
+#   | FP16      | Yes | Yes | Yes |
+#   | INT8      | Yes | Yes | Yes |
+set -euo pipefail
+MODEL_NAME="${1:-yolo26n}"
+PRECISION="${2:-FP16}"
+PRECISION="$(echo "${PRECISION}" | tr '[:lower:]' '[:upper:]')"
+if [[ "${PRECISION}" != "FP32" && "${PRECISION}" != "FP16" && "${PRECISION}" != "INT8" ]]; then
+    echo "ERROR: unsupported precision '${PRECISION}'. Choose FP32, FP16, or INT8." >&2
+    exit 1
+fi
+echo "--- Installing dependencies ---"
+if [[ "${PRECISION}" == "INT8" ]]; then
+    pip install -qU "openvino>=2026.0.0" "nncf>=3.0.0" ultralytics
+else
+    pip install -qU "openvino>=2026.0.0" ultralytics
+fi
+echo "--- Downloading sample test image ---"
+if [[ ! -f test.jpg ]]; then
+    wget -q -O test.jpg https://ultralytics.com/images/bus.jpg
+    echo "Downloaded: test.jpg"
+else
+    echo "Already present: test.jpg"
+fi
+if [[ "${PRECISION}" == "FP32" ]]; then
+    HALF_FLAG="False"
+    EXPORT_LABEL="FP32"
+else
+    HALF_FLAG="True"
+    EXPORT_LABEL="FP16"
+fi
+echo "--- Exporting ${MODEL_NAME} to OpenVINO IR (${EXPORT_LABEL}) ---"
+python3 -c "
+from ultralytics import YOLO
+model = YOLO('${MODEL_NAME}.pt')
+model.export(format='openvino', half=${HALF_FLAG}, dynamic=False, imgsz=640)
+print('Export complete: ${MODEL_NAME}_openvino_model/')
+"
+if [[ "${PRECISION}" == "INT8" ]]; then
+    echo "--- Quantizing to INT8 with NNCF ---"
+    python3 -c "
+import nncf
+import openvino as ov
+import numpy as np
+core = ov.Core()
+model = core.read_model('${MODEL_NAME}_openvino_model/${MODEL_NAME}.xml')
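+# Placeholder calibration data: random tensors in the model input shape.
+# Replace with representative frames for production accuracy.
+def transform_fn(data_item):
+    return np.random.rand(1, 3, 640, 640).astype(np.float32)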
+calibration_dataset = nncf.Dataset(list(range(300)), transform_fn)
+quantized = nncf.quantize(
+    model,
+    calibration_dataset,
+    preset=nncf.QuantizationPreset.MIXED,
+    subset_size=300,
+)
+ov.save_model(quantized, '${MODEL_NAME}_objdet_int8.xml')
+print('Quantization complete: ${MODEL_NAME}_objdet_int8.xml')
+"
+fi
+echo "--- Done ---"