---
license: apache-2.0
base_model: SBB/sbb_binarization
tags:
- document-binarization
- onnx
- tensorrt
- ocr
library_name: onnxruntime
---
# SBB Binarization: ONNX Model
ONNX conversion of [SBB/sbb_binarization](https://huggingface.co/SBB/sbb_binarization) by the Berlin State Library (Staatsbibliothek zu Berlin), developed as part of the [QURATOR](https://qurator.ai/) project.
The original model is a hybrid UNet/Vision Transformer that binarizes scanned document images (converting them to pure black and white) as a preprocessing step for OCR. It operates on 448×448 pixel patches.
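Because the model expects fixed-size 448×448 inputs, larger pages have to be tiled into patches and the results reassembled. A minimal numpy sketch of that tiling (the patch size comes from the model; the helper names here are illustrative, not the repo's actual API):

```python
import numpy as np

PATCH = 448  # the model's fixed input size

def tile(image: np.ndarray, patch: int = PATCH):
    """Pad an H x W x C image so both sides are multiples of `patch`,
    then split it into (row, col, patch_array) tuples."""
    h, w = image.shape[:2]
    pad_h = (-h) % patch
    pad_w = (-w) % patch
    padded = np.pad(image, ((0, pad_h), (0, pad_w), (0, 0)), mode="edge")
    patches = []
    for r in range(0, padded.shape[0], patch):
        for c in range(0, padded.shape[1], patch):
            patches.append((r, c, padded[r:r + patch, c:c + patch]))
    return patches, padded.shape[:2]

def untile(patches, padded_hw, out_h, out_w):
    """Reassemble patches onto a canvas and crop back to the original size."""
    channels = patches[0][2].shape[2]
    canvas = np.zeros((*padded_hw, channels), dtype=patches[0][2].dtype)
    for r, c, p in patches:
        canvas[r:r + p.shape[0], c:c + p.shape[1]] = p
    return canvas[:out_h, :out_w]
```

Edge-mode padding avoids introducing artificial black borders that the binarizer would otherwise pick up at page margins.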
## What's here
- **model_convtranspose.onnx** – the converted model, ready to use with ONNX Runtime
- **fix_onnx.py** – the script that converts SBB's TensorFlow SavedModel to this ONNX file
- **sample_workflow.py** – a minimal example showing how to binarize an image
- **example_gpu_pipeline.py** – a full GPU pipeline using CuPy, intended for production use
## Quick start
```bash
pip install onnxruntime-gpu numpy Pillow
python3 sample_workflow.py input.jpg output.tif
```
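Under the hood, a patch-level inference step looks roughly like this. This is a sketch, not the exact contents of `sample_workflow.py`; the NHWC layout, the [0, 1] input scaling, and the 0.5 threshold are assumptions:

```python
import numpy as np
# The full pipeline additionally needs:
#   import onnxruntime as ort
#   from PIL import Image

def preprocess(patch: np.ndarray) -> np.ndarray:
    """Scale a uint8 RGB patch to float32 in [0, 1] and add a batch axis."""
    return (patch.astype(np.float32) / 255.0)[np.newaxis, ...]

def postprocess(probs: np.ndarray, threshold: float = 0.5) -> np.ndarray:
    """Turn per-pixel probabilities into a black/white uint8 image.
    Assumes output shape (1, H, W, C) with the foreground score in channel 0."""
    return np.where(probs[0, ..., 0] > threshold, 255, 0).astype(np.uint8)

# Sketch of the actual inference call (requires onnxruntime-gpu):
# sess = ort.InferenceSession("model_convtranspose.onnx",
#                             providers=["CUDAExecutionProvider"])
# out = sess.run(None, {sess.get_inputs()[0].name: preprocess(patch)})[0]
# bw = postprocess(out)
```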
## What was changed from the original
The original TF model doesn't convert cleanly to ONNX for TensorRT. Three things needed fixing:
1. **Reshape batch dimension** – tf2onnx hardcoded -2048 instead of -1 in a Reshape shape tensor, which broke dynamic batching
2. **Resize node attributes** – the exported Resize nodes used TF-specific modes that TensorRT doesn't support; they were swapped for standard ONNX equivalents
3. **Resize to ConvTranspose** – nearest-neighbor upsampling was replaced with equivalent depthwise ConvTranspose ops, so TensorRT can compile the model as a single subgraph instead of splitting it into 8+ pieces
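The third fix relies on the fact that 2× nearest-neighbor upsampling is mathematically identical to a stride-2 transposed convolution with a 2×2 kernel of ones, applied per channel. A numpy sketch demonstrating the equivalence on a single-channel feature map (the 2× factor is illustrative; the scale factors actually used in the graph are handled by `fix_onnx.py`):

```python
import numpy as np

def nearest_upsample_2x(x: np.ndarray) -> np.ndarray:
    """2x nearest-neighbor upsampling of an H x W feature map."""
    return np.repeat(np.repeat(x, 2, axis=0), 2, axis=1)

def convtranspose_upsample_2x(x: np.ndarray) -> np.ndarray:
    """The same operation expressed as a stride-2 transposed convolution
    with a 2x2 kernel of ones."""
    h, w = x.shape
    out = np.zeros((h * 2, w * 2), dtype=x.dtype)
    kernel = np.ones((2, 2), dtype=x.dtype)
    for i in range(h):
        for j in range(w):
            # each input pixel is "stamped" onto a 2x2 output block
            out[2 * i:2 * i + 2, 2 * j:2 * j + 2] += x[i, j] * kernel
    return out

x = np.random.default_rng(0).random((4, 4)).astype(np.float32)
assert np.array_equal(nearest_upsample_2x(x), convtranspose_upsample_2x(x))
```

Because the replacement is exact, the swap changes how TensorRT schedules the graph without changing the numbers the model produces.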
All of this is in `fix_onnx.py`. To reproduce from scratch:
```bash
pip install tf2onnx onnx tensorflow
python3 -m tf2onnx.convert --saved-model path/to/saved_model/2022-08-16 --output model.onnx --opset 17
python3 fix_onnx.py model.onnx model_convtranspose.onnx
```
The converted model's output differs from the original TF model on fewer than 0.01% of pixels.
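A check of that kind can be reproduced by running both models on the same inputs and counting disagreeing pixels. A sketch of the comparison, assuming two binarized uint8 arrays of the same shape:

```python
import numpy as np

def pixel_diff_percent(a: np.ndarray, b: np.ndarray) -> float:
    """Percentage of pixels where two binarized outputs disagree."""
    assert a.shape == b.shape
    return 100.0 * np.count_nonzero(a != b) / a.size

a = np.zeros((100, 100), dtype=np.uint8)
b = a.copy()
b[0, 0] = 255                      # flip a single pixel
print(pixel_diff_percent(a, b))    # 0.01 (one pixel in 10,000)
```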
## License
Same as the original: [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0). All credit to the SBB team and the QURATOR project for the model itself.