Update README.md
README.md
CHANGED
|
@@ -13,6 +13,50 @@ tags:
|
|
| 13 |
|
| 14 |
Dense early-fusion vision-language model for **document OCR**. Given a document image, it extracts text, tables, formulas, and other elements as plain text.
|
| 15 |
|
| 16 |
## Installation
|
| 17 |
|
| 18 |
```bash
|
|
|
|
| 13 |
|
| 14 |
Dense early-fusion vision-language model for **document OCR**. Given a document image, it extracts text, tables, formulas, and other elements as plain text.
|
| 15 |
|
| 16 |
+
## Highlights
|
| 17 |
+
|
| 18 |
+
Despite its compact 300M-parameter architecture, FalconOCR achieves near state-of-the-art (SOTA) performance on the olmOCR and OmniDocBench benchmarks.
|
| 19 |
+
|
| 20 |
+
1. **Strong Performance:** FalconOCR achieves near-SOTA results on both **olmOCR** and **OmniDocBench**, delivering competitive accuracy for text, table, and formula recognition against models many times its size.
|
| 21 |
+
2. **Two-Stage Layout Pipeline:** FalconOCR pairs with [PP-DocLayoutV3](https://huggingface.co/PaddlePaddle/PP-DocLayoutV3_safetensors) for layout detection, enabling accurate region-level parsing of complex documents with mixed content types while preserving reading order.
|
| 22 |
+
3. **Simple and Lightweight Architecture:** Built on a compact 300M-parameter vision-language model, FalconOCR offers a streamlined alternative to bulky multi-model pipelines. Task switching is handled simply by changing the input prompt.
|
| 23 |
+
4. **Efficient and Fast Inference:** FalconOCR's small footprint enables fast inference out of the box, with an optional [vLLM](https://github.com/vllm-project/vllm) backend for high-throughput production deployments.
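
The two-stage flow and prompt-based task switching described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the project's actual API: `Region`, `PROMPTS`, `parse_page`, and the two callables are hypothetical names, `detect_layout` stands in for a PP-DocLayoutV3 wrapper, `recognize` stands in for a FalconOCR call, and the prompt strings are placeholders.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

BBox = Tuple[int, int, int, int]  # (left, top, right, bottom) in pixels

# Hypothetical prompts, one per region category.
# The real prompt strings are not given in this README.
PROMPTS = {
    "text": "Extract the text.",
    "table": "Extract the table.",
    "formula": "Extract the formula.",
}


@dataclass
class Region:
    bbox: BBox      # where the region sits on the page
    category: str   # e.g. "text", "table", "formula"
    order: int      # reading-order index assigned by the layout detector


def parse_page(
    image: object,
    detect_layout: Callable[[object], List[Region]],
    recognize: Callable[[object, BBox, str], str],
) -> str:
    """Stage 1: detect regions. Stage 2: OCR each region with a per-category prompt."""
    # Process regions in the reading order reported by the detector.
    regions = sorted(detect_layout(image), key=lambda r: r.order)
    outputs = []
    for region in regions:
        # Switching tasks only changes the prompt, not the model.
        prompt = PROMPTS.get(region.category, PROMPTS["text"])
        outputs.append(recognize(image, region.bbox, prompt))
    return "\n\n".join(outputs)
```

Passing the detector and recognizer in as callables keeps the sketch independent of any particular backend, so the same loop would apply whether recognition runs locally or through a vLLM server.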
|
| 24 |
+
|
| 25 |
+
## Benchmark Results
|
| 26 |
+
|
| 27 |
+
### olmOCR Benchmark
|
| 28 |
+
|
| 29 |
+
Category-wise performance comparison of FalconOCR against state-of-the-art OCR models. We report accuracy (%) for each category split, along with the average. Bold marks the FalconOCR row, not the best score in each column.
|
| 30 |
+
|
| 31 |
+
| Model | Average | ArXiv Math | Base | Headers/Footers | Long Tiny Text | Multi-Column | Old Scans | Old Scans Math | Tables |
|
| 32 |
+
|---|---|---|---|---|---|---|---|---|---|
|
| 33 |
+
| Mistral OCR 3 | 81.7 | 85.4 | 99.9 | 93.8 | 88.9 | 82.1 | 48.8 | 68.3 | 86.1 |
|
| 34 |
+
| Chandra | 82.0 | 81.4 | 99.8 | 88.8 | 91.9 | 82.9 | 49.2 | 73.6 | 88.2 |
|
| 35 |
+
| Gemini 3 Pro | 80.2 | 70.6 | 99.8 | 84.0 | 90.3 | 79.2 | 47.5 | 84.9 | 84.9 |
|
| 36 |
+
| PaddleOCR VL 1.5 | 79.3 | 85.4 | 98.8 | 96.9 | 80.8 | 82.6 | 39.2 | 66.4 | 84.1 |
|
| 37 |
+
| PaddleOCR VL | 79.2 | 85.4 | 98.6 | 96.9 | 80.8 | 82.5 | 38.8 | 66.4 | 83.9 |
|
| 38 |
+
| DeepSeek OCR v2 | 78.8 | 81.9 | 99.8 | 95.6 | 88.7 | 83.6 | 33.7 | 68.8 | 78.1 |
|
| 39 |
+
| Gemini 3 Flash | 77.5 | 66.5 | 99.8 | 83.8 | 88.2 | 73.7 | 46.0 | 85.8 | 75.9 |
|
| 40 |
+
| GPT 5.2 | 69.8 | 61.0 | 99.8 | 75.6 | 62.2 | 70.2 | 34.6 | 75.8 | 79.0 |
|
| 41 |
+
| **FalconOCR** | **80.3** | **80.9** | **99.5** | **94.2** | **78.3** | **87.3** | **43.5** | **70.1** | **90.1** |
|
| 42 |
+
|
| 43 |
+
### OmniDocBench
|
| 44 |
+
|
| 45 |
+
Performance comparison on full-page document parsing. Overall↑ aggregates the three sub-metrics: Edit↓ is the text edit distance (lower is better), CDM↑ scores formula recognition accuracy, and TEDS↑ scores table structure similarity. Bold marks the FalconOCR row, not the best score in each column.
|
| 46 |
+
|
| 47 |
+
| Model | Overall↑ | Edit↓ | CDM↑ | TEDS↑ |
|
| 48 |
+
|---|---|---|---|---|
|
| 49 |
+
| PaddleOCR VL 1.5 | 94.37 | 0.075 | 94.4 | 91.1 |
|
| 50 |
+
| PaddleOCR VL | 91.76 | 0.024 | 91.7 | 85.9 |
|
| 51 |
+
| Chandra | 88.97 | 0.046 | 88.1 | 89.5 |
|
| 52 |
+
| DeepSeek OCR v2 | 87.66 | 0.037 | 89.2 | 77.5 |
|
| 53 |
+
| GPT 5.2 | 86.56 | 0.061 | 88.0 | 77.7 |
|
| 54 |
+
| Mistral OCR 3 | 85.20 | 0.053 | 84.3 | 76.1 |
|
| 55 |
+
| **FalconOCR** | **88.64** | **0.055** | **86.8** | **84.6** |
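
As a rough illustration of how the Overall column above could combine the three sub-metrics, the sketch below assumes the common OmniDocBench convention: invert the edit distance onto a 0-100 scale, then take the plain mean with CDM and TEDS. The README does not state the exact formula, so treat `omnidocbench_overall` as a hypothetical helper rather than the official scorer.

```python
def omnidocbench_overall(edit: float, cdm: float, teds: float) -> float:
    """Mean of three 0-100 sub-scores; edit distance (0-1, lower is better) is inverted first.

    Assumed convention: Overall = ((1 - Edit) * 100 + CDM + TEDS) / 3.
    """
    text_score = (1.0 - edit) * 100.0
    return (text_score + cdm + teds) / 3.0


# FalconOCR row from the table above.
falcon = omnidocbench_overall(edit=0.055, cdm=86.8, teds=84.6)
print(f"{falcon:.2f}")
```

For the FalconOCR row this gives about 88.63, within rounding of the reported 88.64. Not every row reproduces this closely from the rounded sub-metrics shown, so the published Overall values remain the reference.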
|
| 56 |
+
|
| 57 |
+
## Citation
|
| 58 |
+
|
| 59 |
+
|
| 60 |
## Installation
|
| 61 |
|
| 62 |
```bash
|