---
license: mit
---

# XDoc
## Introduction

XDoc is a unified pre-trained model that handles different document formats in a single model. With only 36.7% of the parameters, XDoc achieves comparable or better performance on downstream tasks than individual format-specific pre-trained models, which makes it cost-effective for real-world deployment.

[XDoc: Unified Pre-training for Cross-Format Document Understanding](https://arxiv.org/abs/2210.02849)
Jingye Chen, Tengchao Lv, Lei Cui, Cha Zhang, Furu Wei, EMNLP 2022

## Citation

If you find XDoc helpful, please cite us:
```
@article{chen2022xdoc,
  title={XDoc: Unified Pre-training for Cross-Format Document Understanding},
  author={Chen, Jingye and Lv, Tengchao and Cui, Lei and Zhang, Cha and Wei, Furu},
  journal={arXiv preprint arXiv:2210.02849},
  year={2022}
}
```