nielsr HF Staff committed on
Commit ff17ce4 · verified · 1 Parent(s): 66776ee

Add relevant info to model card

This PR:
- links the model to the relevant paper, [PP-DocBee: Improving Multimodal Document Understanding Through a Bag of Tricks](https://huggingface.co/papers/2503.04065)
- ensures the model can be found at https://huggingface.co/models?pipeline_tag=image-text-to-text&sort=trending
- adds a relevant `library_name`, ensuring the "how to use" button appears on the top right of the model page
- adds a link to the GitHub code
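
As a rough way to check the metadata this PR adds, the sketch below reads the YAML front matter of a model card and reports whether `library_name` and `pipeline_tag` are present. It is a minimal illustration, not how the Hub parses cards: `read_front_matter` and `README_AFTER_PR` are hypothetical names, and the parser only handles flat `key: value` lines like the ones in this README.

```python
def read_front_matter(readme_text: str) -> dict[str, str]:
    """Parse flat `key: value` pairs from a leading `---` ... `---` block.

    Assumption: the front matter has no nesting or lists, which holds for
    the three keys in this README but not for model cards in general.
    """
    lines = readme_text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    metadata: dict[str, str] = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return metadata  # closing delimiter reached
        key, sep, value = line.partition(":")
        if sep and key.strip():
            metadata[key.strip()] = value.strip()
    return {}  # no closing delimiter: treat as no front matter


# Hypothetical sample mirroring the README header after this PR.
README_AFTER_PR = """---
license: apache-2.0
library_name: transformers
pipeline_tag: image-text-to-text
---

# PP-DocBee2
"""

metadata = read_front_matter(README_AFTER_PR)
missing = [key for key in ("library_name", "pipeline_tag") if key not in metadata]
print(metadata)
print("missing keys:", missing)
```

For real model cards with nested metadata, a full YAML parser would be the safer choice; the hand-rolled parsing here only keeps the example free of dependencies.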

Files changed (1)
  1. README.md +7 -0
README.md CHANGED
@@ -1,8 +1,15 @@
 ---
 license: apache-2.0
+library_name: transformers
+pipeline_tag: image-text-to-text
 ---
+
 # PP-DocBee2
 
+This repository contains the model presented in the paper [PP-DocBee: Improving Multimodal Document Understanding Through a Bag of Tricks](https://huggingface.co/papers/2503.04065).
+
+For the codebase, see: this https URL
+
 ## 1. Introduction
 
 PP-DocBee2 is a multimodal large model for document understanding, developed in-house by the PaddleMIX team. Building on PP-DocBee, we further optimized the base model and introduced a new data optimization scheme that improves data quality. With only about 470,000 samples generated by our in-house [data synthesis strategy](https://arxiv.org/abs/2503.04065), PP-DocBee2 performs better on Chinese document understanding tasks. On internal business metrics for Chinese-language scenarios, PP-DocBee2 improves on PP-DocBee by about 11.4% and also outperforms current popular open-source and closed-source models of the same scale.