docling-layout-heron-101 - MLX (bfloat16)

MLX-converted weights of docling-project/docling-layout-heron-101, the larger (ResNet-101-vd backbone) variant of the Docling layout model. Apache-2.0, same as upstream.

The architecture is RT-DETRv2 with a ResNet-101-vd backbone (depths=[3, 4, 23, 3]), 300 queries, and 17 layout classes. The smaller ResNet-50 sibling is at mlx-community/docling-layout-heron-mlx-bf16.
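
As a rough illustration of what the depths list encodes, the sketch below recovers the nominal layer count from the per-stage block counts. It assumes the classic ResNet counting convention (three convolutions per bottleneck block, plus one stem convolution and one classifier layer). The "vd" variant modifies the stem and downsampling paths, so the result should be read as the naming convention rather than an exact layer count. The ResNet-50 depths shown are the standard configuration and are not taken from this card.

```python
# Bottleneck blocks per stage. The ResNet-101 values come from the model
# description above; the ResNet-50 values are the standard configuration
# (an assumption here, not stated on this card).
DEPTHS = {
    "resnet101": [3, 4, 23, 3],
    "resnet50": [3, 4, 6, 3],
}

CONVS_PER_BOTTLENECK = 3  # 1x1, 3x3, 1x1 convolutions in a classic bottleneck


def nominal_layers(depths):
    """Classic ResNet count: bottleneck convs + stem conv + classifier layer."""
    return sum(depths) * CONVS_PER_BOTTLENECK + 2


layer_counts = {name: nominal_layers(d) for name, d in DEPTHS.items()}
for name, count in layer_counts.items():
    print(f"{name}: {sum(DEPTHS[name])} blocks, {count} nominal layers")
```

The third stage (23 blocks versus 6) accounts for the entire difference between the two backbones.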

Inference

Requires the RT-DETRv2 MLX port in mlx-vlm.

from pathlib import Path
from PIL import Image
from huggingface_hub import snapshot_download
from transformers import AutoProcessor
from mlx_vlm.utils import load_model
from mlx_vlm.models.rt_detr_v2.generate import RTDetrV2Predictor
import mlx_vlm.models.rt_detr_v2  # registers the processor with AutoProcessor

path = Path(snapshot_download("mlx-community/docling-layout-heron-101-mlx-bf16"))
model = load_model(path)
processor = AutoProcessor.from_pretrained(path)
predictor = RTDetrV2Predictor(model, processor, threshold=0.3)

result = predictor.predict(Image.open("page.png"))
for name, score, box in zip(result.class_names, result.scores, result.boxes):
    print(f"{name:20s} {score:.3f} {box.tolist()}")

result is a DetectionResult with vectorized fields: boxes (N, 4) xyxy in original-image pixels, scores (N,), labels (N,) integer class ids, and class_names.
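
Because the fields are vectorized, post-processing can be done with array operations rather than per-detection loops. The following is a minimal sketch, not part of mlx-vlm: filter_and_convert is a hypothetical helper, the arrays are synthetic stand-ins for result.boxes, result.scores, and result.labels, and it assumes those fields can be converted to NumPy arrays (casting to float32 first if needed).

```python
import numpy as np


def filter_and_convert(boxes, scores, labels, min_score=0.5):
    """Keep detections scoring at least min_score; convert xyxy boxes to xywh.

    Hypothetical helper for illustration only.
    """
    boxes = np.asarray(boxes, dtype=np.float32).reshape(-1, 4)
    scores = np.asarray(scores, dtype=np.float32)
    labels = np.asarray(labels)

    keep = scores >= min_score              # boolean mask of shape (N,)
    kept = boxes[keep]
    top_left = kept[:, :2]                  # (x_min, y_min)
    size = kept[:, 2:] - kept[:, :2]        # (width, height)
    xywh = np.concatenate([top_left, size], axis=1)
    return xywh, scores[keep], labels[keep]


# Synthetic stand-in data with the documented shapes: (N, 4), (N,), (N,).
boxes = np.array(
    [[10, 20, 110, 70], [0, 0, 50, 50], [5, 5, 15, 25]], dtype=np.float32
)
scores = np.array([0.92, 0.35, 0.61], dtype=np.float32)
labels = np.array([9, 3, 7])  # arbitrary class ids, for illustration only

xywh, kept_scores, kept_labels = filter_and_convert(boxes, scores, labels)
print(xywh.tolist())
print(kept_labels.tolist())
```

The xywh layout is convenient for cropping regions or for tools that expect a width and height rather than a second corner.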

Conversion

Produced with:

python -m mlx_vlm.models.rt_detr_v2.convert \
    --hf-path docling-project/docling-layout-heron-101 \
    --output ./docling-layout-heron-101-mlx-bf16 \
    --dtype bfloat16

Validated numerically against transformers.RTDetrV2ForObjectDetection on real document inputs: the maximum absolute error is about 2e-5 on logits, and bounding boxes differ by less than one pixel.
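
The comparison metric itself is simple to reproduce. The sketch below shows one way to compute a maximum absolute error between two sets of outputs. It is not the validation script used for this conversion: max_abs_error is a hypothetical helper, and the arrays are synthetic stand-ins shaped (1, 300, 17) to mirror one image, 300 queries, and 17 classes.

```python
import numpy as np


def max_abs_error(reference, candidate):
    """Largest element-wise absolute difference between two same-shape arrays."""
    ref = np.asarray(reference, dtype=np.float64)
    cand = np.asarray(candidate, dtype=np.float64)
    if ref.shape != cand.shape:
        raise ValueError(f"shape mismatch: {ref.shape} vs {cand.shape}")
    return float(np.max(np.abs(ref - cand)))


# Synthetic data only: a reference tensor plus small bounded noise standing
# in for the outputs of a second implementation.
rng = np.random.default_rng(0)
ref_logits = rng.standard_normal((1, 300, 17))
other_logits = ref_logits + rng.uniform(-1e-5, 1e-5, size=ref_logits.shape)

err = max_abs_error(ref_logits, other_logits)
print(f"max abs error on logits: {err:.2e}")
```

In a real check, the two inputs would be the logits (or boxes) produced by the reference and converted models for the same preprocessed image.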

License and citation

Apache-2.0. The original work is described in "Advanced Layout Analysis Models for Docling" by Livathinos et al. (arXiv:2509.11720); please cite the upstream paper if you use this model.
