YOLO26l-OptiQ-6bit

Mixed-precision quantized YOLO26l for Apple Silicon via optiq

This is a mixed-precision quantized version of YOLO26l in MLX format, optimized with mlx-optiq for Apple Silicon inference via yolo-mlx.

Quantization Details

Property	Value
Target BPW	6.0
Achieved BPW	6.00
Layers at 4-bit	16
Layers at 8-bit	174
Original size	100.7 MB
Quantized size	22.9 MB
Compression	4.4x
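
The compression figure follows directly from the two file sizes, and the achieved BPW is a parameter-weighted average of the per-layer bit-widths. A minimal sketch of that arithmetic is below; the helper average_bpw and the two-layer example are hypothetical illustrations, not part of mlx-optiq, and the example ignores any per-group scale or bias overhead a real quantized format carries.

```python
# File sizes taken from the table above.
original_mb = 100.7
quantized_mb = 22.9
compression = original_mb / quantized_mb
print(f"Compression: {compression:.1f}x")


def average_bpw(layers):
    """Parameter-weighted mean bit-width.

    layers: iterable of (num_params, bits) pairs. Hypothetical helper.
    """
    total_params = sum(n for n, _ in layers)
    total_bits = sum(n * b for n, b in layers)
    return total_bits / total_params


# Hypothetical example: equal parameter mass at 4-bit and 8-bit averages 6 BPW.
example_bpw = average_bpw([(1000, 4), (1000, 8)])
print(f"Average BPW: {example_bpw:.2f}")
```

Because the average is weighted by parameter count, a small number of large 4-bit layers can balance many smaller 8-bit layers.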

Benchmark Results (COCO128)

Model	Total Detections	Avg/Image
optiq 6-bit	766	6.0
Original (FP32)	766	6.0

Detection delta vs. the FP32 baseline: +0 detections (+0.0%) at 4.4x compression.
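
For reference, the delta and per-image average can be reproduced from the totals in the table. This is only a sketch of the arithmetic: the function name detection_delta is hypothetical, and the image count of 128 is the size of the COCO128 subset.

```python
def detection_delta(quantized_total, baseline_total):
    """Return (absolute delta, percent delta) relative to the baseline."""
    delta = quantized_total - baseline_total
    pct = 100.0 * delta / baseline_total
    return delta, pct


NUM_IMAGES = 128  # COCO128 contains 128 images
quantized_total, baseline_total = 766, 766  # totals from the table above

delta, pct = detection_delta(quantized_total, baseline_total)
avg_per_image = quantized_total / NUM_IMAGES
print(f"{delta:+d} ({pct:+.1f}%), {avg_per_image:.1f} detections per image")
```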

Usage

Requires mlx-optiq and yolo-mlx:

pip install mlx-optiq yolo-mlx

from optiq.models.yolo import load_quantized_yolo

model = load_quantized_yolo("mlx-community/YOLO26l-OptiQ-6bit")
results = model.predict("image.jpg")

How optiq Works

optiq measures each convolutional layer's sensitivity to quantization via KL divergence on detection outputs, then assigns per-layer bit-widths using greedy knapsack optimization under the target bits-per-weight (BPW) budget. Sensitive layers (detection head, feature pyramid) get 8-bit precision, while robust backbone layers get 4-bit.
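
The idea can be illustrated with a small sketch. This is not the mlx-optiq implementation: the function names, the layer names, the parameter counts, and the sensitivity values are all hypothetical, and the greedy rule shown here (demote the layer with the lowest sensitivity per bit saved until the budget is met) is one plausible reading of "greedy knapsack".

```python
import math


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions given as probability lists."""
    return sum(
        pi * math.log((pi + eps) / (qi + eps))
        for pi, qi in zip(p, q)
        if pi > 0
    )


def allocate_bits(layers, target_bpw, low=4, high=8):
    """Greedy bit allocation (hypothetical sketch).

    layers: list of dicts with 'name', 'params', and 'sensitivity', where
    sensitivity is the KL divergence measured when that layer alone is
    quantized to `low` bits. Returns (bits per layer, achieved average BPW).
    """
    total_params = sum(layer["params"] for layer in layers)
    bits = {layer["name"]: high for layer in layers}  # start everything at 8-bit
    total_bits = high * total_params
    budget = target_bpw * total_params

    # Demote the layers that cost the least sensitivity per bit saved first.
    order = sorted(
        layers,
        key=lambda layer: layer["sensitivity"] / (layer["params"] * (high - low)),
    )
    for layer in order:
        if total_bits <= budget:
            break
        bits[layer["name"]] = low
        total_bits -= layer["params"] * (high - low)
    return bits, total_bits / total_params


# Hypothetical layers: backbone layers are robust, neck and head are sensitive.
layers = [
    {"name": "backbone.0", "params": 4000, "sensitivity": 0.001},
    {"name": "backbone.1", "params": 4000, "sensitivity": 0.002},
    {"name": "neck.0", "params": 4000, "sensitivity": 0.05},
    {"name": "head.0", "params": 4000, "sensitivity": 0.2},
]
bits, achieved_bpw = allocate_bits(layers, target_bpw=6.0)
print(bits, achieved_bpw)
```

In this toy setup the two backbone layers are demoted to 4-bit while the neck and head stay at 8-bit, which matches the pattern described above.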

Article

For more details on the methodology and results, see: Not All Layers Are Equal

Credits

Quantization: mlx-optiq
Base model: YOLO26 by Ultralytics
MLX runtime: yolo-mlx
Framework: MLX by Apple
