# 🇺🇦 Ukrainian OCR / ICR (HTR-ConvText)

Handwritten & printed text recognition for Ukrainian

Live Demo

Upload an image → Get recognized text

English · Українська

✨ Highlights

| Feature | Description |
|---|---|
| Language | Ukrainian (handwritten + printed) |
| Architecture | HTR-ConvText (ResNet-18 + MobileViT), CTC decoding |
| Input | 64×3072 px, grayscale line images |
| Training | 1.7M samples, SAM, EMA, scan simulation |
| Formats | PyTorch, ONNX, Hugging Face AutoModel |
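
The fixed 64×3072 input suggests that each line image is scaled to a fixed height and placed on a fixed-width canvas. The sketch below computes only that geometry. It assumes 64 is the height and 3072 the width, and that the processor resizes with a preserved aspect ratio and pads on the right; neither policy is confirmed by this card, and `fit_line_geometry` is a hypothetical helper, not part of the released processor.

```python
def fit_line_geometry(width, height, target_h=64, max_w=3072):
    """Return (new_width, right_padding) for a line image.

    Assumption: aspect-preserving resize to height `target_h`, then
    right-padding up to `max_w`. The released processor may differ.
    """
    if width <= 0 or height <= 0:
        raise ValueError("image dimensions must be positive")
    # Scale the width by the same factor that maps height -> target_h.
    new_w = max(1, round(width * target_h / height))
    # Lines that would exceed the canvas are squeezed to fit (assumed policy).
    new_w = min(new_w, max_w)
    return new_w, max_w - new_w


# A 1500x100 px scan line scales to 960x64 and receives 2112 px of padding.
print(fit_line_geometry(1500, 100))
```

In practice the `AutoProcessor` from the Quickstart handles this step, so a helper like this is useful mainly for reasoning about how much of the canvas a given scan occupies.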

🚀 Quickstart

from transformers import AutoModel, AutoProcessor

# trust_remote_code=True lets transformers load the custom model and processor code from the repository.
processor = AutoProcessor.from_pretrained("Valerii02/ukr-htr-convtext", trust_remote_code=True)
model = AutoModel.from_pretrained("Valerii02/ukr-htr-convtext", trust_remote_code=True)
# Preprocess one line image, run the model, then decode the logits to text.
inputs = processor(images="sample.png", return_tensors="pt")
logits = model(**inputs).logits
text = processor.batch_decode(logits)[0]
print(text)

💡 Try it now: open the Gradio demo, no code required!

📖 Model Description

This repository packages a Ukrainian OCR/ICR model for handwritten and partially printed text with a Hugging Face-native API (AutoModel + AutoProcessor).

Architecture

  • Backbone: ResNet-18 + MobileViT (MVP), hierarchical ConvText encoder (U-Net-like down/upsampling)
  • Decoding: CTC greedy
  • Vocabulary: 151 characters (Ukrainian + symbols)
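
Greedy CTC decoding can be sketched in a few lines: take the best class per frame, merge consecutive repeats, then drop blanks. The example below is a minimal illustration, not the repository's decoder. The blank index of 0 and the three-symbol toy alphabet are assumptions made for the example; the real model uses the 151-character vocabulary listed above.

```python
def ctc_greedy_decode(frame_scores, alphabet, blank=0):
    """Greedy CTC decoding: argmax per frame, merge repeats, drop blanks.

    frame_scores: list of per-frame score lists (one score per class).
    alphabet: maps class index -> character; index `blank` is the CTC blank.
    Assumption: blank index 0; the released model may place it elsewhere.
    """
    best_path = [max(range(len(f)), key=f.__getitem__) for f in frame_scores]
    chars = []
    previous = None
    for idx in best_path:
        # Emit a character only when the class changes and is not blank.
        if idx != previous and idx != blank:
            chars.append(alphabet[idx])
        previous = idx
    return "".join(chars)


def one_hot(indices, num_classes):
    """Build toy frame scores whose argmax follows `indices`."""
    return [[1.0 if c == i else 0.0 for c in range(num_classes)] for i in indices]


alphabet = ["<blank>", "н", "я"]  # toy alphabet, not the real vocabulary
# н н <blank> н я я  ->  "ння": the blank keeps the double letter.
print(ctc_greedy_decode(one_hot([1, 1, 0, 1, 2, 2], 3), alphabet))
# н н н я  ->  "ня": without a blank the repeated н collapses.
print(ctc_greedy_decode(one_hot([1, 1, 1, 2], 3), alphabet))
```

The second call shows why doubled letters are fragile under CTC: if the model does not predict a blank frame between them, the pair collapses into one character.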

Training Data

Source Samples
ukrainian-handwriting-synth Synthetic handwritten lines
Ukrainian Handwritten Text ~37k segmented lines
Total 1,696,499 (Train 90% / Val 5% / Test 5%)

Training

  • 500k iterations, batch size 16 with 4 gradient-accumulation steps
  • SAM optimizer, EMA (decay 0.9999), TCM warmup 40k iters
  • Scan simulation & detector-error augmentations
  • Hardware: NVIDIA B200 (180GB VRAM)
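
Two of the numbers above are easier to interpret with a little arithmetic. The sketch below works out the effective batch size and shows what an EMA decay of 0.9999 means in practice. It is a toy illustration on a single scalar weight, assuming the usual meaning of gradient accumulation (4 micro-batches per optimizer step) and one EMA update per step; the actual training loop is not shown in this card.

```python
def ema_update(ema_weights, weights, decay=0.9999):
    """One EMA step: ema <- decay * ema + (1 - decay) * weights."""
    return [decay * e + (1.0 - decay) * w for e, w in zip(ema_weights, weights)]


batch_size, grad_accum_steps = 16, 4
effective_batch = batch_size * grad_accum_steps  # samples per optimizer step

decay = 0.9999
# Rough averaging horizon of an EMA: about 1 / (1 - decay) steps.
horizon_steps = 1.0 / (1.0 - decay)

# Toy run: the raw weight jumps from 0 to 1 and stays there.
ema = [0.0]
for _ in range(10_000):
    ema = ema_update(ema, [1.0], decay)

print(effective_batch)       # 64
print(round(horizon_steps))  # 10000
print(round(ema[0], 3))      # about 0.632, i.e. 1 - 1/e after one horizon
```

With this decay the averaged weights lag the raw weights by roughly 10k steps, which is short compared with the 500k-iteration schedule.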

🖼️ Recognition Examples

| Example | Image | GT | Prediction | CER | WER |
|---|---|---|---|---|---|
| 1 | example_1 | Департаменту патрульної поліції | Департаменту нагрульної поліції | 0.065 | 0.33 |
| 2 | example_2 | за порушення правил дорожнього руху | за порушення правил дорожнього Дуку | 0.057 | 0.20 |

Real-world inference on scanned Ukrainian documents. GT = ground truth.
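
The CER and WER values in the examples can be reproduced with a plain Levenshtein distance over characters and over whitespace-separated words. The sketch below is a minimal reference implementation, not the evaluation script used for this model. It applies no text normalization and assumes a non-empty ground truth.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or word lists)."""
    previous = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        current = [i]
        for j, h in enumerate(hyp, start=1):
            current.append(min(
                previous[j] + 1,             # deletion
                current[j - 1] + 1,          # insertion
                previous[j - 1] + (r != h),  # substitution or match
            ))
        previous = current
    return previous[-1]


def cer(ref, hyp):
    """Character error rate: character edits / reference length."""
    return edit_distance(ref, hyp) / len(ref)


def wer(ref, hyp):
    """Word error rate: word edits / number of reference words."""
    ref_words = ref.split()
    return edit_distance(ref_words, hyp.split()) / len(ref_words)


examples = [
    ("Департаменту патрульної поліції", "Департаменту нагрульної поліції"),
    ("за порушення правил дорожнього руху", "за порушення правил дорожнього Дуку"),
]
for gt, pred in examples:
    print(f"CER={cer(gt, pred):.3f}  WER={wer(gt, pred):.2f}")
```

Both predictions differ from the ground truth by two character substitutions inside a single word, which gives 2/31 and 2/35 for CER and 1/3 and 1/5 for WER.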

🛠️ Tools & Scripts

| File | Purpose |
|---|---|
| prepare_hf_artifacts.py | Convert .pth checkpoint → HF artifacts |
| export_onnx.py | Export to ONNX |
| validate_parity.py | OpenCV vs PIL, PyTorch vs ONNX parity checks |
| predict.py | Single-image CLI inference |
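
A parity check between two backends usually boils down to comparing their outputs element by element against a tolerance. The sketch below shows that idea on flat lists of numbers. It is not the logic of `validate_parity.py`, whose internals are not described here; the helper names, the sample values, and the `1e-4` tolerance are all assumptions.

```python
def max_abs_diff(a, b):
    """Largest element-wise absolute difference between two flat sequences."""
    if len(a) != len(b):
        raise ValueError(f"shape mismatch: {len(a)} vs {len(b)}")
    return max((abs(x - y) for x, y in zip(a, b)), default=0.0)


def outputs_match(reference, candidate, atol=1e-4):
    """True when every element agrees within `atol` (an assumed tolerance)."""
    return max_abs_diff(reference, candidate) <= atol


# Hypothetical flattened logits from two backends for the same input.
pytorch_logits = [0.1200, -1.5000, 3.2500]
onnx_logits = [0.12003, -1.49998, 3.25001]
print(outputs_match(pytorch_logits, onnx_logits))        # True
print(outputs_match(pytorch_logits, [0.12, -1.5, 3.3]))  # False
```

With real tensors, `numpy.testing.assert_allclose` performs the same comparison and reports where the mismatch occurs.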

Conversion

python prepare_hf_artifacts.py \
  --checkpoint-path /path/to/best_CER.pth \
  --alphabet-path /path/to/alphabet.json \
  --output-dir ./release

ONNX Export

python export_onnx.py --hf-model-dir ./release --output-dir ./onnx

📊 Evaluation

| Split | CER | WER | Notes |
|---|---|---|---|
| real-world (124) | 0.176 | 0.440 | Scanned docs, handwritten + printed |

Metrics are micro-averaged over the split; text is normalized with format_string_for_wer.
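
Micro-averaging pools errors across all samples before dividing, so long lines weigh more than short ones. The sketch below contrasts it with a per-sample (macro) mean on two hypothetical samples; the `(edit_errors, reference_length)` pairs are invented for illustration and are not taken from the evaluation set.

```python
def micro_average(per_sample):
    """Sum of errors over sum of reference lengths."""
    total_errors = sum(errors for errors, _ in per_sample)
    total_length = sum(length for _, length in per_sample)
    return total_errors / total_length


def macro_average(per_sample):
    """Mean of per-sample error rates (shown only for contrast)."""
    return sum(errors / length for errors, length in per_sample) / len(per_sample)


# Hypothetical (edit_errors, reference_length) pairs: a long clean line
# and a short noisy one.
samples = [(2, 100), (5, 10)]
print(round(micro_average(samples), 4))  # 0.0636
print(round(macro_average(samples), 4))  # 0.26
```

The gap between the two numbers shows why the averaging scheme should be stated alongside any reported CER or WER.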

Comparison with other systems

On the same 124 real-world samples, the fine-tuned Ukrainian HTR-ConvText model (ukr-htr-convtext) was compared against several vision-language and HTR baselines.

| Model | Samples | CER (%) | WER (%) |
|---|---|---|---|
| mamay | 124 | 40.15 | 75.28 |
| finetuned-cyrillic-trocr | 124 | 46.45 | 78.96 |
| cyrillic-trocr | 124 | 51.92 | 97.93 |
| gpt-4o-mini | 124 | 56.19 | 88.75 |
| hunyuan | 124 | 124.80 | 180.78 |
| ukr-htr-convtext (Ours) | 124 | 17.63 | 44.04 |

On this evaluation set, ukr-htr-convtext cuts the character error rate by more than half relative to the next best system, mamay (17.63% vs 40.15% CER), and lowers WER from 75.28% to 44.04%. It also outperforms the remaining TrOCR and general-purpose VLM baselines on both metrics.
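
The size of the gap can be checked directly from the table. The snippet below computes the relative CER reduction of ukr-htr-convtext against each baseline, using only the CER values reported above; `relative_reduction` is a helper introduced here for illustration.

```python
cer_percent = {
    "mamay": 40.15,
    "finetuned-cyrillic-trocr": 46.45,
    "cyrillic-trocr": 51.92,
    "gpt-4o-mini": 56.19,
    "hunyuan": 124.80,
}
ours = 17.63  # CER (%) of ukr-htr-convtext on the same 124 samples


def relative_reduction(baseline, ours):
    """Fraction of the baseline error rate that is removed."""
    return (baseline - ours) / baseline


for name, baseline in cer_percent.items():
    print(f"{name}: {relative_reduction(baseline, ours):.1%}")
```

Against mamay the reduction is about 56%. Error rates above 100%, as for hunyuan, are possible because inserted characters count as errors, so the number of edits can exceed the reference length.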

⚠️ Limitations

  • Sensitive to severe blur, low contrast, non-standard page artifacts
  • Performance may drop on long lines far from training distribution
  • CTC decoding can fail on highly ambiguous character boundaries

🙏 Attribution & Citation

This implementation adapts ideas from DAIR-Group/HTR-ConvText. See NOTICE and CITATION.cff for details. Upstream (HTR-ConvText):

@misc{truc2025htrconvtext,
  title={HTR-ConvText: Leveraging Convolution and Textual Information for Handwritten Text Recognition},
  author={Pham Thach Thanh Truc and Dang Hoai Nam and Huynh Tong Dang Khoa and Vo Nguyen Le Duy},
  year={2025},
  eprint={2512.05021},
  archivePrefix={arXiv},
  primaryClass={cs.CV},
  url={https://arxiv.org/abs/2512.05021},
}

This model: See CITATION.cff for full attribution.

📄 License

Apache-2.0. See LICENSE.

⭐ Star this repo if you find it useful! · Report issues · Contributions welcome

Model size: 66M params (Safetensors, F32)