UniRec-0.1B: Unified Text and Formula Recognition with 0.1B Parameters

[Paper] [Code] [ModelScope Demo] [Hugging Face Demo] [Local Demo]

Introduction

UniRec-0.1B is a unified recognition model with only 0.1B parameters, designed for accurate and efficient recognition of plain text (words, lines, paragraphs), mathematical formulas (single-line, multi-line), and mixed content in both Chinese and English.

It addresses structural variability and semantic entanglement by using a hierarchical supervision training strategy and a semantic-decoupled tokenizer. Despite its small size, it achieves performance comparable to or better than much larger vision-language models.

Get Started with UniRec

Dependencies:

PyTorch version >= 1.13.0
Python version >= 3.7

conda create -n openocr python==3.9
conda activate openocr
# Install the GPU build of PyTorch (>= 1.13.0)
conda install pytorch==2.2.0 torchvision==0.17.0 torchaudio==2.2.0 pytorch-cuda=11.8 -c pytorch -c nvidia
# Or install the CPU-only build
conda install pytorch torchvision torchaudio cpuonly -c pytorch
git clone https://github.com/Topdu/OpenOCR.git
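
Once the environment is set up, the version requirements listed above (Python >= 3.7, PyTorch >= 1.13.0) can be checked with a short script. The following is a minimal sketch, not part of OpenOCR: the helper names `parse_version`, `meets_minimum`, and `check_environment` are hypothetical, and the version parsing only looks at the leading major.minor.patch numbers.

```python
import re
import sys


def parse_version(text):
    """Turn a string such as '2.2.0+cu118' into (2, 2, 0).

    Only the first three dot-separated fields are used; local build tags
    (after '+') and non-numeric suffixes are ignored.
    """
    parts = []
    for piece in str(text).split("+")[0].split(".")[:3]:
        match = re.match(r"\d+", piece)
        parts.append(int(match.group()) if match else 0)
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def meets_minimum(installed, minimum):
    """Return True if the installed version is at least the minimum."""
    return parse_version(installed) >= parse_version(minimum)


def check_environment():
    """Report (version, ok) for Python and PyTorch against the README minimums."""
    python_version = ".".join(str(n) for n in sys.version_info[:3])
    report = {"python": (python_version, meets_minimum(python_version, "3.7"))}
    try:
        import torch  # imported lazily so the check also runs without PyTorch
        torch_version = str(torch.__version__)
        report["torch"] = (torch_version, meets_minimum(torch_version, "1.13.0"))
    except ImportError:
        report["torch"] = (None, False)
    return report
```

Calling `check_environment()` inside the activated `openocr` environment returns a dictionary with one entry per component; an entry of `(None, False)` for `torch` means PyTorch could not be imported.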

Downloading the UniRec Model

cd OpenOCR
pip install -r requirements.txt
# Make sure git-lfs is installed (https://git-lfs.com)
git lfs install
git clone https://huggingface.co/topdu/unirec-0.1b
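
If git-lfs was not active when the model repository was cloned, large files are left as small text stubs (Git LFS pointer files) rather than the real weights. A minimal sketch for detecting this is shown below; it relies on the fact that LFS pointer files begin with the line `version https://git-lfs.github.com/spec/v1`. The function name `find_lfs_pointers` is hypothetical, and the directory name `unirec-0.1b` in the usage note is assumed from the clone command above.

```python
from pathlib import Path

# Every Git LFS pointer file starts with this line.
LFS_POINTER_PREFIX = b"version https://git-lfs.github.com/spec/v1"


def find_lfs_pointers(model_dir):
    """Return files under model_dir that are still Git LFS pointer stubs."""
    root = Path(model_dir)
    pointers = []
    for path in sorted(root.rglob("*")):
        # Skip directories and Git's own metadata.
        if not path.is_file() or ".git" in path.relative_to(root).parts:
            continue
        with path.open("rb") as handle:
            if handle.read(len(LFS_POINTER_PREFIX)) == LFS_POINTER_PREFIX:
                pointers.append(path)
    return pointers
```

For example, `find_lfs_pointers("unirec-0.1b")` returning an empty list suggests the weights were downloaded in full; any listed file can typically be fetched by running `git lfs pull` inside the cloned model directory.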

Inference

# Global.infer_img accepts either an image folder or a single image file
python tools/infer_rec.py --c ./configs/rec/unirec/focalsvtr_ardecoder_unirec.yml --o Global.infer_img=/path/to/img_folder
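
The inference command can also be launched from a Python script, for example to process several inputs in a loop. The sketch below only wraps the command line shown above with `subprocess`; it does not call any OpenOCR Python API. The helper names `build_infer_command` and `run_inference` are hypothetical, and the default `repo_root="OpenOCR"` assumes the repository was cloned into the current directory.

```python
import subprocess
import sys

DEFAULT_CONFIG = "./configs/rec/unirec/focalsvtr_ardecoder_unirec.yml"


def build_infer_command(image_path, config=DEFAULT_CONFIG):
    """Build the argument list for tools/infer_rec.py.

    image_path may point to an image folder or to a single image file.
    """
    return [
        sys.executable,
        "tools/infer_rec.py",
        "--c", config,
        "--o", f"Global.infer_img={image_path}",
    ]


def run_inference(image_path, repo_root="OpenOCR", config=DEFAULT_CONFIG):
    """Run the inference script from the repository root.

    Because the working directory changes to repo_root, image_path should be
    an absolute path. Raises CalledProcessError if the script exits non-zero.
    """
    command = build_infer_command(image_path, config)
    return subprocess.run(command, cwd=repo_root, check=True)
```

A call such as `run_inference("/path/to/img_folder")` runs the same command as above; passing the arguments as a list avoids shell quoting problems when the image path contains spaces.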

Local Demo

pip install gradio==4.20.0
python demo_unirec.py

Citation

If you find our method useful for your research, please cite:

@article{du2025unirec,
  title={UniRec-0.1B: Unified Text and Formula Recognition with 0.1B Parameters},
  author={Yongkun Du and Zhineng Chen and Yazhen Xie and Weikang Bai and Hao Feng and Wei Shi and Yuchen Su and Can Huang and Yu-Gang Jiang},
  journal={arXiv preprint arXiv:2512.21095},
  year={2025}
}
