# YOLO26n-OptiQ-6bit

Mixed-precision quantized YOLO26n for Apple Silicon via OptiQ

This is a mixed-precision quantized version of YOLO26n in MLX format, optimized with `mlx-optiq` for Apple Silicon inference via `yolo-mlx`.

## Quantization Details

| Property | Value |
|---|---|
| Target BPW (bits per weight) | 6.0 |
| Achieved BPW | 5.96 |
| Layers at 4-bit | 12 |
| Layers at 8-bit | 114 |
| Original size | 9.9 MB |
| Quantized size | 2.5 MB |
| Compression | 3.9x |
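
The achieved BPW is the parameter-weighted average of per-layer bit-widths, not a simple mean over layers, which is why 12 layers at 4-bit can pull a mostly 8-bit model down near the 6.0 target. A minimal sketch of that arithmetic, with hypothetical layer sizes (the card does not list per-layer parameter counts):

```python
# Achieved BPW = total quantized bits / total parameters,
# i.e. the parameter-weighted average of per-layer bit-widths.

def achieved_bpw(layers):
    """layers: list of (num_params, bits) pairs for each quantized layer."""
    total_bits = sum(n * b for n, b in layers)
    total_params = sum(n for n, _ in layers)
    return total_bits / total_params

# Hypothetical split: a few large 4-bit backbone layers and many smaller
# 8-bit layers can average out near the 6.0 target.
example = [(110_000, 4)] * 12 + [(11_600, 8)] * 114
print(f"{achieved_bpw(example):.2f} bits per weight")  # -> 6.00
```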

## Benchmark Results (COCO128)

| Model | Total Detections | Avg/Image |
|---|---|---|
| OptiQ 6-bit | 548 | 4.3 |
| Original (FP32) | 557 | 4.4 |

Detection delta: -9 (-1.6%) at 3.9x compression.
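
A rough sketch of how this comparison could be reproduced, assuming a local copy of COCO128 and that `model.predict()` returns a sized collection of detections per image (neither is confirmed by this card):

```python
# Count detections across COCO128 for one model. The image directory
# layout and the shape of predict()'s return value are assumptions.
from pathlib import Path

from optiq.models.yolo import load_quantized_yolo

model = load_quantized_yolo("mlx-community/YOLO26n-OptiQ-6bit")

total = 0
images = sorted(Path("coco128/images/train2017").glob("*.jpg"))
for image in images:
    detections = model.predict(str(image))
    total += len(detections)

print(f"{total} detections over {len(images)} images "
      f"({total / len(images):.1f} avg/image)")
```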

## Usage

Requires `mlx-optiq` and `yolo-mlx`:

```bash
pip install mlx-optiq yolo-mlx
```

```python
from optiq.models.yolo import load_quantized_yolo

model = load_quantized_yolo("mlx-community/YOLO26n-OptiQ-6bit")
results = model.predict("image.jpg")
```

## How OptiQ Works

OptiQ measures each conv layer's quantization sensitivity via the KL divergence of the detection outputs, then uses a greedy knapsack optimization to assign per-layer bit-widths under the target-BPW budget: sensitive layers (detection head, feature pyramid) keep 8-bit precision, while robust backbone layers drop to 4-bit.
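
A minimal sketch of the assignment step under stated assumptions: every layer starts at 8-bit, candidates are ranked by measured KL divergence per parameter, and the cheapest layers are demoted to 4-bit until the parameter-weighted average reaches the target. All names, layer sizes, and sensitivities below are illustrative, not OptiQ's actual implementation:

```python
# Illustrative greedy bit-width assignment under a target-BPW budget.
# "sensitivity" stands in for the measured KL divergence between the
# full-precision and quantized detection outputs for that layer alone.
from dataclasses import dataclass

@dataclass
class Layer:
    name: str
    num_params: int
    sensitivity: float  # KL cost of quantizing this layer to 4-bit
    bits: int = 8       # start everything at the safe 8-bit setting

def assign_bits(layers: list[Layer], target_bpw: float) -> None:
    """Greedily demote the least sensitive layers to 4-bit until the
    parameter-weighted average bit-width reaches the target."""
    def avg_bpw() -> float:
        total = sum(l.num_params for l in layers)
        return sum(l.bits * l.num_params for l in layers) / total

    # Cheapest demotions first: lowest KL cost per parameter saved.
    for layer in sorted(layers, key=lambda l: l.sensitivity / l.num_params):
        if avg_bpw() <= target_bpw:
            break
        layer.bits = 4

layers = [
    Layer("backbone.conv1", 120_000, sensitivity=0.002),
    Layer("backbone.conv2", 120_000, sensitivity=0.004),
    Layer("fpn.lateral3", 60_000, sensitivity=0.19),
    Layer("head.cls_conv", 80_000, sensitivity=0.41),
]
assign_bits(layers, target_bpw=6.0)
for l in layers:
    print(f"{l.name}: {l.bits}-bit")  # backbone at 4-bit, FPN/head at 8-bit
```

Because the budget check runs before each demotion, the loop typically lands slightly below the target, which is consistent with the achieved 5.96 BPW sitting just under 6.0.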

## Credits

Quantized with `mlx-optiq` from the original YOLO26n weights; published under the `mlx-community` organization.
