	---
	license: apache-2.0
	base_model: SBB/sbb_binarization
	tags:
	- document-binarization
	- onnx
	- tensorrt
	- ocr
	library_name: onnxruntime
	---

	# SBB Binarization — ONNX Model

	ONNX conversion of [SBB/sbb_binarization](https://huggingface.co/SBB/sbb_binarization) by the Berlin State Library (Staatsbibliothek zu Berlin), developed as part of the [QURATOR](https://qurator.ai/) project.

	The original model is a UNet + Vision Transformer hybrid that converts scanned document images to black and white for OCR. It works on 448x448 patches.
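Because the model sees one 448x448 patch at a time, a full page has to be cut into patches before inference and stitched back together afterwards. The following is a minimal sketch of that step, not the code from the scripts in this repository. The function names, the edge padding, and the non-overlapping grid are all assumptions made for illustration; a real pipeline may overlap patches or pad differently.

```python
import numpy as np

PATCH = 448  # patch size stated in this README


def split_into_patches(image, patch=PATCH):
    """Pad an (H, W, C) image to a multiple of `patch` and cut it into tiles.

    Returns (tiles, grid): tiles has shape (N, patch, patch, C) and
    grid is (rows, cols). Hypothetical helper, not part of the repo.
    """
    h, w, c = image.shape
    rows = -(-h // patch)  # ceiling division
    cols = -(-w // patch)
    # Assumption: replicate the border pixels to fill the last row/column of tiles.
    padded = np.pad(
        image,
        ((0, rows * patch - h), (0, cols * patch - w), (0, 0)),
        mode="edge",
    )
    tiles = (
        padded.reshape(rows, patch, cols, patch, c)
        .transpose(0, 2, 1, 3, 4)
        .reshape(rows * cols, patch, patch, c)
    )
    return tiles, (rows, cols)


def merge_patches(tiles, grid, height, width):
    """Stitch (N, patch, patch) per-pixel results back into a (height, width) image."""
    rows, cols = grid
    patch = tiles.shape[1]
    full = (
        tiles.reshape(rows, cols, patch, patch)
        .transpose(0, 2, 1, 3)
        .reshape(rows * patch, cols * patch)
    )
    # Crop away the padding added by split_into_patches.
    return full[:height, :width]
```

A 500x900 image, for example, becomes a 2x3 grid of six patches, and `merge_patches` crops the result back to 500x900.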

	## What's here

	- `model_convtranspose.onnx`: the model, ready to use with ONNX Runtime
	- `fix_onnx.py`: the script that converts SBB's TensorFlow SavedModel to this ONNX file
	- `sample_workflow.py`: minimal example showing how to binarize an image
	- `example_gpu_pipeline.py`: full GPU pipeline using CuPy for production use

	## Quick start

	```bash
	pip install onnxruntime-gpu numpy Pillow
	python3 sample_workflow.py input.jpg output.tif
	```
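For readers who want to call the model directly, here is a rough sketch of running a single patch through ONNX Runtime. It is not the contents of `sample_workflow.py`. The input layout (NHWC, float32, scaled to [0, 1]) and the output layout (two per-pixel class scores, with class 1 meaning ink) are assumptions; check `session.get_inputs()` and `session.get_outputs()` against the actual file before relying on them. `load_session` and `binarize_patch` are hypothetical names.

```python
import numpy as np


def load_session(model_path="model_convtranspose.onnx"):
    """Open the model, preferring CUDA when that provider is available."""
    # Imported lazily so binarize_patch can be exercised without onnxruntime installed.
    import onnxruntime as ort

    wanted = ["CUDAExecutionProvider", "CPUExecutionProvider"]
    providers = [p for p in wanted if p in ort.get_available_providers()]
    return ort.InferenceSession(model_path, providers=providers)


def binarize_patch(session, patch_rgb):
    """Binarize one uint8 (448, 448, 3) patch; returns uint8 (448, 448) of 0/255."""
    input_name = session.get_inputs()[0].name
    # Assumption: the model expects a float32 NHWC batch scaled to [0, 1].
    batch = (patch_rgb.astype(np.float32) / 255.0)[np.newaxis]
    scores = session.run(None, {input_name: batch})[0]
    # Assumption: output is (1, H, W, 2) class scores and class 1 is foreground ink.
    labels = np.argmax(scores, axis=-1)[0]
    return np.where(labels == 1, 0, 255).astype(np.uint8)
```

Typical use would be `binarize_patch(load_session(), patch)`, with `patch` cut from the page at the 448x448 size the model works on.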

	## What was changed from the original

	The original TF model doesn't convert cleanly to ONNX for TensorRT. Three things needed fixing:

	1. Reshape batch dimension — tf2onnx hardcoded -2048 instead of -1 for dynamic batching
	2. Resize node attributes — TF-specific modes that TensorRT doesn't support, swapped to standard ONNX equivalents
	3. Resize to ConvTranspose — replaced nearest-neighbor upsampling with equivalent depthwise ConvTranspose ops so TensorRT can compile the model as a single subgraph instead of splitting it into 8+ pieces

	All of this is in `fix_onnx.py`. To reproduce from scratch:

	```bash
	pip install tf2onnx onnx tensorflow
	python3 -m tf2onnx.convert --saved-model path/to/saved_model/2022-08-16 --output model.onnx --opset 17
	python3 fix_onnx.py model.onnx model_convtranspose.onnx
	```
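The third fix in the list above relies on nearest-neighbor upsampling being expressible as a depthwise transposed convolution. The NumPy sketch below checks that equivalence for one case; it is an illustration, not an excerpt from `fix_onnx.py`. The 2x scale factor, the 2x2 all-ones kernel, and both function names are assumptions chosen for the demonstration.

```python
import numpy as np


def nearest_upsample_2x(x):
    """Nearest-neighbor 2x upsampling of a (C, H, W) array."""
    return x.repeat(2, axis=1).repeat(2, axis=2)


def depthwise_conv_transpose_2x(x):
    """Depthwise transposed convolution with stride 2 and a 2x2 all-ones kernel.

    Each input pixel x[c, i, j] is scattered to the four output positions
    (2i + a, 2j + b) for a, b in {0, 1}, weighted by kernel[c, a, b].
    """
    c, h, w = x.shape
    kernel = np.ones((c, 2, 2), dtype=x.dtype)  # one 2x2 filter per channel
    out = np.zeros((c, 2 * h, 2 * w), dtype=x.dtype)
    for a in range(2):
        for b in range(2):
            # Stride 2 equals kernel size, so the scattered blocks never overlap.
            out[:, a::2, b::2] += x * kernel[:, a, b][:, None, None]
    return out


x = np.random.default_rng(0).standard_normal((3, 4, 5)).astype(np.float32)
print(np.array_equal(depthwise_conv_transpose_2x(x), nearest_upsample_2x(x)))
```

Because the stride matches the kernel size, every output pixel receives exactly one contribution with weight 1, so the two results match exactly and no rounding is involved.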

	The converted model's output differs from the original TF model's in fewer than 0.01% of pixels.
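One way to check a figure like this on your own images is to count the pixels where two binarized outputs disagree. This README does not say how the number was measured, so the sketch below is only one plausible definition, and `pixel_difference_percent` is a hypothetical helper.

```python
import numpy as np


def pixel_difference_percent(a, b):
    """Percentage of pixels whose values differ between two same-shaped images."""
    a = np.asarray(a)
    b = np.asarray(b)
    if a.shape != b.shape:
        raise ValueError(f"shape mismatch: {a.shape} vs {b.shape}")
    return 100.0 * np.count_nonzero(a != b) / a.size
```

For two 100x100 outputs that disagree in a single pixel, the function returns 0.01, which is the threshold quoted above.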

	## License

	Same as the original — [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0). All credit to the SBB team and the QURATOR project for the model itself.