UniRec-0.1B: Unified Text and Formula Recognition with 0.1B Parameters
[Paper] [Code] [ModelScope Demo] [Hugging Face Demo] [Local Demo]
Introduction
UniRec-0.1B is a unified recognition model with only 0.1B parameters, designed for accurate and efficient recognition of plain text (words, lines, paragraphs), mathematical formulas (single-line, multi-line), and mixed text-formula content in both Chinese and English.
It addresses structural variability and semantic entanglement by using a hierarchical supervision training strategy and a semantic-decoupled tokenizer. Despite its small size, it achieves performance comparable to or better than much larger vision-language models.
Get Started with UniRec
Dependencies:
- PyTorch version >= 1.13.0
- Python version >= 3.7
conda create -n openocr python=3.9
conda activate openocr
# install gpu version torch >=1.13.0
conda install pytorch==2.2.0 torchvision==0.17.0 torchaudio==2.2.0 pytorch-cuda=11.8 -c pytorch -c nvidia
# or cpu version
conda install pytorch torchvision torchaudio cpuonly -c pytorch
git clone https://github.com/Topdu/OpenOCR.git
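Once the environment is set up, the two version requirements listed under Dependencies can be checked with a short script. This is a minimal sketch, not part of the OpenOCR repository: the helper names (`parse_version`, `meets_minimum`, `check_environment`) are hypothetical, and the only library attribute it relies on is `torch.__version__`.

```python
import sys

# Minimum versions taken from the Dependencies list above.
MIN_PYTHON = (3, 7)
MIN_TORCH = (1, 13, 0)


def parse_version(text):
    """Turn a version string such as '2.2.0+cu118' into a tuple of ints."""
    parts = []
    # Drop any local build suffix ('+cu118') before splitting on dots.
    for piece in text.split("+")[0].split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def meets_minimum(installed, minimum):
    """Return True if the installed version string is at least `minimum`."""
    found = parse_version(installed)
    width = max(len(found), len(minimum))
    # Pad with zeros so that '1.13' compares equal to (1, 13, 0).
    found = found + (0,) * (width - len(found))
    minimum = tuple(minimum) + (0,) * (width - len(minimum))
    return found >= minimum


def check_environment():
    """Report whether Python and PyTorch satisfy the stated minimums."""
    try:
        import torch  # imported lazily so the script still runs without it
        torch_version = str(torch.__version__)
    except ImportError:
        torch_version = None
    return {
        "python_ok": sys.version_info[:2] >= MIN_PYTHON,
        "torch_version": torch_version,
        "torch_ok": torch_version is not None
        and meets_minimum(torch_version, MIN_TORCH),
    }


if __name__ == "__main__":
    print(check_environment())
```

Running the script inside the activated `openocr` environment prints a small dictionary; a `torch_ok` value of `False` suggests that the PyTorch install step did not complete or installed an older release.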
Downloading the UniRec Model
cd OpenOCR
pip install -r requirements.txt
# Make sure git-lfs is installed (https://git-lfs.com)
git lfs install
git clone https://huggingface.co/topdu/unirec-0.1b
Inference
# Global.infer_img accepts either an image folder or a single image file
python tools/infer_rec.py --c ./configs/rec/unirec/focalsvtr_ardecoder_unirec.yml --o Global.infer_img=/path/to/img_folder
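When inference needs to be triggered from another Python program, the same command can be assembled and launched with `subprocess`. The sketch below is an illustration under stated assumptions: `build_infer_command` and `run_inference` are hypothetical helpers, and `repo_dir` is assumed to be the cloned `OpenOCR` directory. It passes only the `--c` and `--o` options shown in the command above.

```python
import subprocess
import sys
from pathlib import Path

# Config path copied from the inference command above (relative to the repo root).
CONFIG = "./configs/rec/unirec/focalsvtr_ardecoder_unirec.yml"


def build_infer_command(image_path, config=CONFIG):
    """Build the argument list for tools/infer_rec.py.

    `image_path` may point to an image folder or to a single image file.
    """
    return [
        sys.executable,
        "tools/infer_rec.py",
        "--c",
        config,
        "--o",
        f"Global.infer_img={image_path}",
    ]


def run_inference(image_path, repo_dir="OpenOCR"):
    """Run the inference script from inside the cloned repository.

    `repo_dir` is an assumption: adjust it to wherever OpenOCR was cloned.
    """
    # Resolve first, because the subprocess runs with a different working directory.
    absolute_path = Path(image_path).resolve()
    command = build_infer_command(absolute_path)
    # check=True raises CalledProcessError if the script exits with an error.
    return subprocess.run(command, cwd=repo_dir, check=True)


if __name__ == "__main__":
    print(" ".join(build_infer_command("/path/to/img_folder")))
```

Passing the arguments as a list avoids shell quoting problems when the image path contains spaces.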
Local Demo
pip install gradio==4.20.0
python demo_unirec.py
Citation
If you find our method useful for your research, please cite:
@article{du2025unirec,
title={UniRec-0.1B: Unified Text and Formula Recognition with 0.1B Parameters},
author={Yongkun Du and Zhineng Chen and Yazhen Xie and Weikang Bai and Hao Feng and Wei Shi and Yuchen Su and Can Huang and Yu-Gang Jiang},
journal={arXiv preprint arXiv:2512.21095},
year={2025}
}