---
license: agpl-3.0
library_name: onnx
tags:
- yolo
- yolov11
- object-detection
- instance-segmentation
- onnx
- tensorrt
pipeline_tag: image-segmentation
---

# occurra/object_detection_segmentation

ONNX exports of [Ultralytics YOLOv11-seg](https://github.com/ultralytics/ultralytics) (instance segmentation) in the configurations the occurra `object_detection_segmentation` agent ships with.

Companion to [`occurra/object_detection`](https://huggingface.co/occurra/object_detection): same class set (person + bicycle + 4 vehicle subtypes), same naming convention, same hardware-selection logic, with per-object pixel masks on top of bounding boxes.

Nano size only (no small variant yet). Four precision variants. All files are self-contained (no external-data sidecars).

## Filename convention

```
yolo11n-seg_{apple,fp16,fp8,int8}_640x640.onnx
```

| Token | Meaning |
| ----- | ------- |
| `n-seg` | YOLOv11 nano segmentation variant |
| `apple` | FP16, NMS-free, batch=1, static; CoreML / Apple ANE friendly. uint8 input. |
| `fp16` | FP16 weights, NMS embedded. Default for NVIDIA `TensorRT` EP. |
| `fp8` | FP8 quantized via TensorRT QDQ. Smallest VRAM footprint on Blackwell / Hopper. |
| `int8` | INT8 quantized with QDQ nodes embedded in the graph. No sidecar calibration cache needed. |
| `640x640` | Square input, the same shape used by the upstream Ultralytics export. |

The `object_detection_segmentation` agent reads the input shape directly from the loaded ONNX (`graph.input[0].type`). There is no sidecar config; the file name is informational.

## Which file to pick

| Hardware | Recommended |
| -------- | ----------- |
| Apple Silicon (CoreML / ANE) | `yolo11n-seg_apple_640x640.onnx` |
| NVIDIA RTX 4000+ / Blackwell | `yolo11n-seg_fp8_640x640.onnx` |
| NVIDIA older (no FP8) | `yolo11n-seg_int8_640x640.onnx` |
| CPU fallback | `yolo11n-seg_fp16_640x640.onnx` |

The agent's `_resolve_model_filename` picks automatically based on platform + GPU compute capability.
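
The table above can be read as a small decision function. The following is a minimal sketch of that mapping, not the agent's actual `_resolve_model_filename`: the function name `pick_variant`, its parameters, and the compute-capability cutoff of 8.9 (the Ada generation, which is how "RTX 4000+" is interpreted here) are assumptions made for illustration.

```python
# Filenames taken from the "Which file to pick" table.
VARIANTS = {
    "apple": "yolo11n-seg_apple_640x640.onnx",
    "fp16": "yolo11n-seg_fp16_640x640.onnx",
    "fp8": "yolo11n-seg_fp8_640x640.onnx",
    "int8": "yolo11n-seg_int8_640x640.onnx",
}

# ASSUMPTION: FP8 is treated as available from compute capability 8.9 upward.
FP8_MIN_COMPUTE_CAPABILITY = (8, 9)


def pick_variant(system, machine, cuda_capability=None):
    """Hypothetical stand-in for the agent's selection logic.

    system / machine: values such as platform.system() and platform.machine().
    cuda_capability: (major, minor) tuple for an NVIDIA GPU, or None if absent.
    """
    if system == "Darwin" and machine == "arm64":
        return VARIANTS["apple"]   # Apple Silicon (CoreML / ANE)
    if cuda_capability is None:
        return VARIANTS["fp16"]    # CPU fallback
    if tuple(cuda_capability) >= FP8_MIN_COMPUTE_CAPABILITY:
        return VARIANTS["fp8"]     # FP8-capable NVIDIA GPU
    return VARIANTS["int8"]        # older NVIDIA GPU without FP8
```

For example, `pick_variant("Linux", "x86_64", (8, 6))` falls through to the INT8 file, while the same call with no capability tuple returns the FP16 file.
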
Set `OBJECT_DETECTION_SEGMENTATION_MODEL=` to force a specific variant.

## Outputs

Each ONNX file has two outputs (Ultralytics-seg standard):

| Output | Shape | Contents |
| ------ | ----- | -------- |
| `output0` | `(batch, 4+80+32, N)` | `[cx, cy, w, h]` + 80 class scores + 32 mask coefficients per anchor |
| `output1` | `(batch, 32, proto_h, proto_w)` | Prototype masks; `coeffs @ protos` reconstructs the per-detection mask. |

The agent runs NMS in Python after filtering to the curated class set (COCO 0/1/2/3/5/7 → person, bicycle, car, motorcycle, bus, truck) and decodes masks in `YoloSegOnnx`. Bitplane bytes are passed to the C++ toolbox for denoising + RLE encoding.

## Source

The Ultralytics `yolo11n-seg.pt` checkpoints were downloaded from Ultralytics' release feed and re-exported via the occurra toolbox's `ai_agent_toolbox/agents/python/object_detection_segmentation/scripts/main.py` (NMS-free for Apple, with-NMS for NVIDIA; FP8/INT8 use TensorRT QDQ).

## License

The model weights inherit Ultralytics YOLOv11's [AGPL-3.0](https://github.com/ultralytics/ultralytics/blob/main/LICENSE) license. Commercial use requires a separate enterprise license from Ultralytics; the ONNX export does not change that.
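
## Example: reading one anchor from the outputs

To make the layout in the Outputs section concrete, here is a minimal, dependency-free sketch of splitting one anchor column of `output0` and combining its 32 coefficients with the prototypes from `output1`. It is not the agent's `YoloSegOnnx` code. The helper names, the 0.25 score threshold, choosing the best class among the curated ids only, and thresholding the mask at logit 0 (equivalent to sigmoid > 0.5) are all assumptions. NMS and cropping the mask to the box are omitted.

```python
# COCO class ids kept by the agent, per the Outputs section.
CURATED = {0: "person", 1: "bicycle", 2: "car", 3: "motorcycle", 5: "bus", 7: "truck"}
NUM_CLASSES = 80
NUM_COEFFS = 32


def split_anchor(column):
    """Split one anchor's 4+80+32 values into box, class scores, mask coefficients."""
    if len(column) != 4 + NUM_CLASSES + NUM_COEFFS:
        raise ValueError("expected 116 values per anchor")
    box = column[:4]                          # [cx, cy, w, h]
    scores = column[4:4 + NUM_CLASSES]        # 80 class scores
    coeffs = column[4 + NUM_CLASSES:]         # 32 mask coefficients
    return box, scores, coeffs


def best_curated_class(scores, threshold=0.25):
    """Return (label, score) for the best curated class, or None below threshold.

    ASSUMPTION: the threshold value and curated-only argmax are illustrative.
    """
    class_id = max(CURATED, key=lambda i: scores[i])
    if scores[class_id] < threshold:
        return None
    return CURATED[class_id], scores[class_id]


def decode_mask(coeffs, protos):
    """Combine coefficients with prototypes: the `coeffs @ protos` step.

    protos is a nested list shaped [32][proto_h][proto_w].
    Returns a binary mask shaped [proto_h][proto_w].
    """
    height, width = len(protos[0]), len(protos[0][0])
    mask = []
    for y in range(height):
        row = []
        for x in range(width):
            logit = sum(c * p[y][x] for c, p in zip(coeffs, protos))
            # ASSUMPTION: logit > 0 is the same cut as sigmoid(logit) > 0.5.
            row.append(1 if logit > 0 else 0)
        mask.append(row)
    return mask
```

In practice the same arithmetic would run as a single matrix product over all kept detections; the loops here only spell out what each output element contributes.
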