zuminghuang committed · Commit 0ae7110 · verified · 1 parent: 7a08a4e

Update README.md

Files changed (1): README.md (+11 −1)
README.md CHANGED
@@ -13,10 +13,20 @@
 
 # Introduction
 
-We are excited to release Infinity-Parser2, our latest document understanding model. It sets a new state-of-the-art (SOTA) on olmOCR-Bench with a score of 86.7%, outperforming frontier models like DeepSeek-OCR-2, PaddleOCR-VL-1.5, and dots-mocr.
+We are excited to release Infinity-Parser2-Pro, our latest flagship document understanding model, which achieves a new state of the art on olmOCR-Bench with a score of 86.7%, surpassing frontier models such as DeepSeek-OCR-2, PaddleOCR-VL-1.5, and dots.mocr. Building on our previous model, Infinity-Parser-7B, we have significantly enhanced our data engine and multi-task reinforcement learning approach. This lets the model consolidate robust multi-modal parsing capabilities into a unified architecture, delivering new zero-shot capabilities for diverse real-world business scenarios.
 
 ## Key Features
 
+- Upgraded Data Engine: We comprehensively enhanced our synthetic data engine to support both fixed-layout and flexible-layout document formats. By generating over 1 million diverse full-text samples covering a wide range of document layouts, combined with a dynamic adaptive sampling strategy, we ensure balanced and robust multi-task learning across document types.
+
+- Multi-Task Reinforcement Learning: We designed a verifiable reward system to support joint reinforcement learning (RL), enabling simultaneous co-optimization of multiple complex tasks, including doc2json and doc2markdown.
+
+- Breakthrough Parsing Performance: The model substantially outperforms our previous 7B model, achieving 86.7% on olmOCR-Bench and surpassing frontier models such as DeepSeek-OCR-2, PaddleOCR-VL-1.5, and dots.mocr.
+
+- Inference Acceleration: By adopting an efficient Mixture-of-Experts (MoE) architecture, we increased inference throughput by 21% (from 441 to 534 tokens/sec), reducing deployment latency and cost.
+
+# Performance
+
 Coming soon...
 
 # Citation
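
The "Multi-Task Reinforcement Learning" bullet in the diff above refers to a verifiable reward system for tasks such as doc2markdown. The actual reward function is not published in this commit; as a minimal illustrative sketch, one common "verifiable reward" design scores a model's output by its string similarity to a reference transcription (all names below are hypothetical):

```python
# Hypothetical sketch only: Infinity-Parser2-Pro's real reward is not
# described in this README. This shows one common verifiable-reward
# pattern: normalized edit similarity between predicted and reference
# markdown, yielding a score in [0, 1] that an RL loop can maximize.
from difflib import SequenceMatcher


def _normalize(md: str) -> str:
    """Collapse all whitespace so purely cosmetic differences
    (indentation, wrapping) are not penalized."""
    return " ".join(md.split())


def doc2markdown_reward(pred: str, ref: str) -> float:
    """Return a similarity reward in [0, 1]; 1.0 means the
    normalized prediction exactly matches the reference."""
    return SequenceMatcher(None, _normalize(pred), _normalize(ref)).ratio()
```

A rule-based score like this is "verifiable" in the sense that it is computed deterministically from the reference document rather than by a learned judge, which keeps the RL signal cheap and reproducible.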