rb-docling-onnx
Pre-converted ONNX weights of Docling models for use with rb_docling (Ruby port).
These are third-party conversions of the official Docling PyTorch checkpoints.
For the canonical PyTorch weights, see ds4sd/docling-models
and ds4sd/docling-layout-heron-101.
Files
| File | Source | Description |
|---|---|---|
| layout.onnx | ds4sd/docling-layout-heron-101 | RT-DETR layout detector (DocLayNet, 11 classes) |
| tableformer_encoder.onnx | ds4sd/docling-models (tableformer/accurate) | TableFormer encoder, image → memory |
| tableformer_decoder.onnx | ds4sd/docling-models (tableformer/accurate) | TableFormer autoregressive decoder, (tokens, memory) → (next_token, bbox) |
| tableformer_vocab.json | derived | OTSL vocabulary for the decoder |
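The encoder/decoder split in the table implies a loop on the caller's side: run the encoder once to get `memory`, then call the decoder repeatedly, feeding back the tokens produced so far. The sketch below illustrates that greedy loop in plain Ruby. It is not rb_docling's actual implementation: `decoder_step` is a hypothetical callable standing in for a run of `tableformer_decoder.onnx`, the start, end, and new-line token ids are made up for the example (the real ids would come from `tableformer_vocab.json`), and it assumes one bbox per emitted token, which may not match the real decoder.

```ruby
# Greedy autoregressive decoding loop (illustrative sketch only).
# `decoder_step` is a hypothetical callable: (tokens, memory) -> [next_token, bbox].
# In practice it would wrap an inference call on tableformer_decoder.onnx.
def greedy_decode(memory, decoder_step, start_id:, end_id:, max_steps: 512)
  tokens = [start_id]
  bboxes = []
  max_steps.times do
    next_token, bbox = decoder_step.call(tokens, memory)
    break if next_token == end_id # stop as soon as the end token is predicted

    tokens << next_token
    bboxes << bbox
  end
  [tokens.drop(1), bboxes] # drop the start token from the result
end

# Group a flat OTSL token sequence into table rows, assuming a dedicated
# new-line token closes each row.
def split_rows(tokens, newline_id)
  tokens.slice_after(newline_id).map { |row| row - [newline_id] }
end

# Stub decoder that replays a fixed script so the loop runs without ONNX.
# Hypothetical ids: 0 = start, 1 = end, 2 = cell, 3 = new line.
script = [2, 2, 3, 2, 2, 3, 1]
stub = lambda do |tokens, _memory|
  [script[tokens.length - 1], [0.0, 0.0, 1.0, 1.0]]
end

tokens, bboxes = greedy_decode(:memory_placeholder, stub, start_id: 0, end_id: 1)
rows = split_rows(tokens, 3)
puts "decoded #{tokens.length} tokens, #{bboxes.length} boxes, #{rows.length} rows"
```

The `max_steps` cap guards against a decoder that never emits the end token; the value 512 is an arbitrary choice for the sketch.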
Usage from Ruby
# Gemfile
gem "rb_docling", "~> 0.1"
# Download the Docling ONNX models
bundle exec rake models:fetch
require "rb_docling"
tree = RbDocling.parse("doc.pdf",
                       layout: :onnx, table: :onnx,
                       models_dir: "./models")
puts tree.to_md
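Because `models_dir` has to contain every file listed in the Files table, it can help to check the directory before parsing. The following is a minimal sketch using only the Ruby standard library; `missing_model_files` is a hypothetical helper written for this example, not part of rb_docling, and how the gem itself reacts to a missing file is not covered here.

```ruby
# Report which of the expected model files are absent from a directory.
# The file names are taken from the Files table; the helper is hypothetical.
def missing_model_files(models_dir,
                        required = %w[layout.onnx
                                      tableformer_encoder.onnx
                                      tableformer_decoder.onnx
                                      tableformer_vocab.json])
  required.reject { |name| File.file?(File.join(models_dir, name)) }
end

missing = missing_model_files("./models")
if missing.empty?
  puts "All model files present"
else
  warn "Missing from ./models: #{missing.join(', ')}"
end
```

If anything is reported missing, re-running `bundle exec rake models:fetch` is the obvious first step.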
Conversion details
| Setting | Value |
|---|---|
| Source PyTorch version | torch 2.x |
| ONNX opset | 17 |
| Quantization | none (FP32) |
| Conversion scripts | tools/ in the rb_docling repo |
To re-build from the official PyTorch weights:
git clone https://github.com/scinoky/rb_docling
cd rb_docling
pip install -r tools/requirements.txt
python tools/export_layout.py
python tools/export_tableformer.py --variant accurate --mode split
License
Apache-2.0, inherited from the upstream Docling models.
Citation
If you use these conversions, please cite the original Docling work:
@article{Docling,
title = {Docling Technical Report},
author = {Auer, Christoph and Lysak, Maksym and others (IBM Research)},
journal = {arXiv preprint},
year = {2024}
}