Paper: RepViT-SAM: Towards Real-Time Segmenting Anything (arXiv:2312.05760)
This model is a lightweight OCR model built for speed and optimized for mobile/edge devices.
It achieves high-accuracy text recognition while maintaining a much smaller footprint than standard models.
Check out the technical docs for more details. Source code will soon be available at the GitHub repo
PaddleOCR-VL-For-Manga, which has a ~10% CER (character error rate) and ~70% exact-match accuracy.
This project was built with the help of:
The model builds upon kha-white/manga-ocr, but diverges significantly in deployment focus and data generation.
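The CER and exact-match figures quoted above are not defined in this card. As a minimal sketch, assuming the usual definitions (CER as total character-level edit distance divided by total reference length, exact match as the fraction of predictions identical to their reference), they could be computed as below. The function names and the sample strings are hypothetical and are not taken from this project's code.

```python
from typing import Sequence


def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between two strings (insert, delete, substitute)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(refs: Sequence[str], hyps: Sequence[str]) -> float:
    """Corpus-level character error rate: total edits / total reference chars."""
    total_edits = sum(edit_distance(r, h) for r, h in zip(refs, hyps))
    total_chars = sum(len(r) for r in refs)
    return total_edits / total_chars


def exact_match(refs: Sequence[str], hyps: Sequence[str]) -> float:
    """Fraction of predictions that equal their reference exactly."""
    return sum(r == h for r, h in zip(refs, hyps)) / len(refs)


# Hypothetical sample data, for illustration only.
references = ["abcd", "ef"]
predictions = ["abcd", "eg"]
print(f"CER: {cer(references, predictions):.3f}")
print(f"Exact match: {exact_match(references, predictions):.2f}")
```

Under these definitions a single wrong character makes a whole line count as a miss for exact match while adding only one edit to the CER, which is why a model can show a fairly low CER alongside a much lower exact-match rate.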
@inproceedings{wang2024repvit,
title={{RepViT}: Revisiting Mobile {CNN} From {ViT} Perspective},
author={Wang, Ao and Chen, Hui and Lin, Zijia and Han, Jungong and Ding, Guiguang},
booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
pages={15909--15920},
year={2024}
}
@misc{wang2023repvitsam,
title={RepViT-SAM: Towards Real-Time Segmenting Anything},
author={Ao Wang and Hui Chen and Zijia Lin and Jungong Han and Guiguang Ding},
year={2023},
eprint={2312.05760},
archivePrefix={arXiv},
primaryClass={cs.CV}
}