Handwritten & printed text recognition for Ukrainian
Upload an image → Get recognized text
Table of Contents
- Highlights
- Quickstart
- Model Description
- Recognition Examples
- Tools & Scripts
- Evaluation
- Attribution & Citation
Highlights
| Feature | Description |
|---|---|
| Language | Ukrainian (handwritten + printed) |
| Architecture | HTR-ConvText (ResNet-18 + MobileViT), CTC decoding |
| Input | 64×3072 px, grayscale line images |
| Training | 1.7M samples, SAM, EMA, scan simulation |
| Formats | PyTorch, ONNX, Hugging Face AutoModel |
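The Input row implies that every line image ends up on a fixed 64×3072 canvas. How the bundled processor does this is not documented here, so the sketch below is only an illustration. It assumes 64 is the height and 3072 the width, and that the line is resized to the target height with its aspect ratio preserved, then right-padded (or squeezed if too long). `fit_line_to_canvas` is a hypothetical helper, not part of this repository.

```python
def fit_line_to_canvas(width, height, target_h=64, target_w=3072):
    """Return (new_w, new_h, pad_right) for an aspect-preserving resize.

    Assumption (not confirmed by the model card): the line is scaled to the
    target height, then right-padded up to the target width. Lines that
    would be wider than the canvas are squeezed to fit.
    """
    if width <= 0 or height <= 0:
        raise ValueError("image dimensions must be positive")
    scale = target_h / height
    new_w = max(1, round(width * scale))
    new_w = min(new_w, target_w)  # overly long lines are squeezed horizontally
    return new_w, target_h, target_w - new_w


# A 1200x100 px crop scales to 768x64 and needs 2304 px of right padding.
print(fit_line_to_canvas(1200, 100))
# A 9000x120 px crop would scale to 4800 px wide, so it is clamped to 3072.
print(fit_line_to_canvas(9000, 120))
```

In practice, `AutoProcessor` handles this step; the sketch only helps estimate how much of the canvas a given crop would occupy.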
Quickstart
from transformers import AutoModel, AutoProcessor
processor = AutoProcessor.from_pretrained("Valerii02/ukr-htr-convtext", trust_remote_code=True)
model = AutoModel.from_pretrained("Valerii02/ukr-htr-convtext", trust_remote_code=True)
inputs = processor(images="sample.png", return_tensors="pt")
logits = model(**inputs).logits
text = processor.batch_decode(logits)[0]
print(text)
Try it now: open the Gradio demo, no code required!
Model Description
This repository packages a Ukrainian OCR/ICR model for handwritten and partially printed text with a Hugging Face-native API (AutoModel + AutoProcessor).
Architecture
- Backbone: ResNet-18 + MobileViT (MVP), hierarchical ConvText encoder (U-Net-like down/upsampling)
- Decoding: CTC greedy
- Vocabulary: 151 characters (Ukrainian + symbols)
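For readers unfamiliar with CTC greedy decoding, here is a minimal sketch of the idea: take the best label per time step, merge consecutive repeats, and drop the blank. The blank index (0), the toy four-symbol alphabet, and the `ctc_greedy_decode` helper are assumptions for illustration. The real model uses its own 151-character vocabulary, and decoding is done by `processor.batch_decode`.

```python
def ctc_greedy_decode(frame_scores, alphabet, blank=0):
    """Greedy CTC: per-frame argmax, merge repeats, remove blanks.

    frame_scores: list of per-time-step score lists (one score per label).
    alphabet:     list mapping label index -> character; alphabet[blank]
                  is a placeholder and is never emitted.
    """
    # Best label index at every time step.
    best_path = [max(range(len(f)), key=f.__getitem__) for f in frame_scores]
    chars = []
    prev = None
    for idx in best_path:
        # Emit only when the label changes and is not the blank symbol.
        if idx != prev and idx != blank:
            chars.append(alphabet[idx])
        prev = idx
    return "".join(chars)


def one_hot(i, n=4):
    """Toy score vector that puts all mass on label i."""
    return [1.0 if j == i else 0.0 for j in range(n)]


alphabet = ["<blank>", "т", "а", "к"]  # hypothetical toy vocabulary
frames = [one_hot(i) for i in [1, 1, 0, 2, 2, 0, 3]]
print(ctc_greedy_decode(frames, alphabet))
```

A blank between two identical labels is what lets the decoder output a doubled letter; without it, the repeats collapse into one character.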
Training Data
| Source | Samples / description |
|---|---|
| ukrainian-handwriting-synth | Synthetic handwritten lines |
| Ukrainian Handwritten Text | ~37k segmented lines |
| Total | 1,696,499 (Train 90% / Val 5% / Test 5%) |
Training
- 500k iterations, batch size 16 with 4-step gradient accumulation
- SAM optimizer, EMA (decay 0.9999), TCM warmup 40k iters
- Scan simulation & detector-error augmentations
- Hardware: NVIDIA B200 (180GB VRAM)
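Two of the numbers above are easier to judge with a little arithmetic. The sketch below shows a generic EMA weight update with the stated decay and the batch size implied by gradient accumulation. It assumes a single device and plain Python lists in place of tensors; `ema_update` is a hypothetical helper, not the training code used for this model.

```python
def ema_update(ema_weights, model_weights, decay=0.9999):
    """One exponential-moving-average step over a flat list of weights."""
    return [decay * e + (1.0 - decay) * w
            for e, w in zip(ema_weights, model_weights)]


# Samples contributing to one optimizer step (assuming one device).
batch_size, grad_accum_steps = 16, 4
effective_batch = batch_size * grad_accum_steps
print("effective batch size:", effective_batch)

# Rough averaging horizon of the EMA, in iterations: 1 / (1 - decay).
decay = 0.9999
horizon = 1.0 / (1.0 - decay)
print("approx. EMA horizon (iterations):", round(horizon))

# Three updates toward a constant weight of 1.0, starting from 0.0.
ema = [0.0]
for _ in range(3):
    ema = ema_update(ema, [1.0], decay)
print("EMA after 3 steps:", ema[0])
```

With a decay this close to 1, the averaged weights trail the raw weights by roughly ten thousand iterations, which is small next to a 500k-iteration run.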
Recognition Examples
Real-world inference on scanned Ukrainian documents. GT = ground truth.
Tools & Scripts
| File | Purpose |
|---|---|
| prepare_hf_artifacts.py | Convert .pth checkpoint → HF artifacts |
| export_onnx.py | Export to ONNX |
| validate_parity.py | OpenCV vs PIL, PyTorch vs ONNX parity checks |
| predict.py | Single-image CLI inference |
Conversion
python prepare_hf_artifacts.py \
--checkpoint-path /path/to/best_CER.pth \
--alphabet-path /path/to/alphabet.json \
--output-dir ./release
ONNX Export
python export_onnx.py --hf-model-dir ./release --output-dir ./onnx
Evaluation
| Split | CER | WER | Notes |
|---|---|---|---|
| real-world (124) | 0.176 | 0.440 | Scanned docs, handwritten + printed |
Scores are micro-averaged over the split and use format_string_for_wer normalization.
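As a reminder of what micro-averaging means for CER and WER, the sketch below pools edit operations and reference lengths over all samples before dividing, so longer lines weigh more. It is a generic illustration, not the evaluation script used for the numbers above: `edit_distance` and `micro_error_rate` are hypothetical helpers, words are split on whitespace, and the format_string_for_wer normalization is not reproduced.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (chars or words)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def micro_error_rate(pairs, tokenize):
    """Total edits over total reference length, pooled across all pairs."""
    edits = sum(edit_distance(tokenize(ref), tokenize(hyp)) for ref, hyp in pairs)
    total = sum(len(tokenize(ref)) for ref, _ in pairs)
    return edits / total


# (ground truth, prediction) pairs; toy data for illustration only.
pairs = [("добрий день", "добрий день"), ("так", "тик")]
cer = micro_error_rate(pairs, list)        # character-level tokens
wer = micro_error_rate(pairs, str.split)   # whitespace-separated words
print(f"CER = {cer:.3f}, WER = {wer:.3f}")
```

Because insertions count as errors while the denominator is the reference length, both rates can exceed 100% when a system emits far more text than the ground truth contains.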
Comparison with other systems
On the same 124 real-world samples, the finetuned Ukrainian HTR-ConvText model (ukr-htr-convtext) was compared against several vision-language and HTR baselines.
| Model | Samples | CER (%) | WER (%) |
|---|---|---|---|
| mamay | 124 | 40.15 | 75.28 |
| finetuned-cyrillic-trocr | 124 | 46.45 | 78.96 |
| cyrillic-trocr | 124 | 51.92 | 97.93 |
| gpt-4o-mini | 124 | 56.19 | 88.75 |
| hunyuan | 124 | 124.80 | 180.78 |
| ukr-htr-convtext (Ours) | 124 | 17.63 | 44.04 |
On this evaluation set, ukr-htr-convtext more than halves the character error rate of the next-best system, mamay (17.63% vs. 40.15% CER), and also has the lowest word error rate (44.04% vs. 75.28%). The remaining VLM and HTR baselines, both generic and Cyrillic-adapted, score above 46% CER.
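The gap to the next-best system can be checked directly from the table. The snippet below computes the relative error reduction against mamay using the reported CER and WER values; `relative_reduction` is a hypothetical helper introduced here for the arithmetic.

```python
def relative_reduction(baseline, ours):
    """Fraction of the baseline error that is removed."""
    return (baseline - ours) / baseline


# (CER %, WER %) as reported in the comparison table.
mamay = (40.15, 75.28)
ours = (17.63, 44.04)

cer_reduction = relative_reduction(mamay[0], ours[0])
wer_reduction = relative_reduction(mamay[1], ours[1])

print(f"Relative CER reduction vs. mamay: {cer_reduction:.1%}")
print(f"Relative WER reduction vs. mamay: {wer_reduction:.1%}")
```

The CER reduction comes out at about 56%, consistent with "more than halves"; the WER reduction is smaller, at about 41%.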
Limitations
- Sensitive to severe blur, low contrast, non-standard page artifacts
- Performance may drop on long lines far from training distribution
- CTC decoding can fail on highly ambiguous character boundaries
Attribution & Citation
This implementation adapts ideas from DAIR-Group/HTR-ConvText. See NOTICE and CITATION.cff for details.
Upstream (HTR-ConvText):
@misc{truc2025htrconvtext,
title={HTR-ConvText: Leveraging Convolution and Textual Information for Handwritten Text Recognition},
author={Pham Thach Thanh Truc and Dang Hoai Nam and Huynh Tong Dang Khoa and Vo Nguyen Le Duy},
year={2025},
eprint={2512.05021},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2512.05021},
}
This model: See CITATION.cff for full attribution.
License
Apache-2.0. See LICENSE.
