Update README.md
README.md
CHANGED
@@ -1,3 +1,49 @@
---
tags:
- document-parsing
- ocr
- pdf
- parsebench
- enterprise-documents
license: apache-2.0
language:
- en
- ar
---

# oi-OCR

**oi-OCR** is Open Innovation AI's document-parsing tool. It extracts structured Markdown, layout, tables, and chart data from PDFs for downstream RAG ingestion, agentic workflows, and document understanding tasks.

## ParseBench Results (April 2026)

| Dimension | Score | Rank on the public leaderboard |
|---|---:|---|
| **Charts** | **78.48** | **#1 of 47** |
| Tables | 87.06 | #9 |
| Content Faithfulness | 87.24 | #18 |
| Semantic Formatting | 65.65 | #6 |
| Visual Grounding | 68.71 | #6 (tied with Reducto) |
| **Overall (mean of 5)** | **77.43** | **#2 of 47** |
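
The Overall row is the unweighted mean of the five dimension scores in the table above. A minimal Python check of that arithmetic, using only the scores listed here (the variable names are illustrative and not part of any ParseBench tooling):

```python
# Dimension scores copied from the results table above.
scores = {
    "Charts": 78.48,
    "Tables": 87.06,
    "Content Faithfulness": 87.24,
    "Semantic Formatting": 65.65,
    "Visual Grounding": 68.71,
}

# Unweighted mean across the five dimensions, rounded to two decimals.
overall = round(sum(scores.values()) / len(scores), 2)
print(overall)  # 77.43
```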

Evaluated on the full [ParseBench-Full](https://huggingface.co/datasets/llamaindex/ParseBench) suite: 2,037 single-page PDFs across chart, layout, table, and text groups.

**oi-OCR is #1 on the Charts dimension**, ahead of LlamaParse Agentic (78.11), Reducto Agentic (73.40), Google Gemini 3 Flash Thinking High (64.79), Anthropic Opus 4.7 (55.84), and OpenAI GPT-5.5 Reasoning Medium (65.53).

On Overall, only LlamaParse Agentic (the benchmark creator) ranks higher.

Structured eval data: [`.eval_results/parsebench.yaml`](./.eval_results/parsebench.yaml).

## Evaluation methodology

- **Benchmark**: [ParseBench-Full](https://huggingface.co/datasets/llamaindex/ParseBench), 2,037 single-page PDFs from real enterprise documents (insurance, finance, government, scientific, etc.)
- **Evaluator**: official [`parse-bench`](https://github.com/run-llama/ParseBench) CLI
- **Scoring mode**: rule-only (`LLAMACLOUD_BENCH_LLM_NORMALIZATION=off`), which is stricter than the leaderboard's default judge mode; scores would likely be a few points higher under judge mode
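
The rule-only setting in the scoring-mode bullet is applied through an environment variable. A minimal Python sketch of preparing that environment for a child process follows. The helper name `rule_only_env` is hypothetical, and the actual `parse-bench` command line is not given in this README, so it is left out rather than guessed:

```python
import os


def rule_only_env(base=None):
    """Return a copy of the environment with LLM normalization switched off."""
    env = dict(os.environ if base is None else base)
    # Variable name and value are taken from the scoring-mode bullet above.
    env["LLAMACLOUD_BENCH_LLM_NORMALIZATION"] = "off"
    return env


env = rule_only_env()
print(env["LLAMACLOUD_BENCH_LLM_NORMALIZATION"])  # off
```

Passing the result as `env=` to `subprocess.run` when launching the evaluator keeps the setting scoped to that one run rather than the whole shell session.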

## Public leaderboard

Full benchmark comparison across all 47 entries: [parsebench.ai](https://www.parsebench.ai/)

## About

[Open Innovation AI](https://openinnovation.ai/) builds enterprise AI tools for the GCC and beyond, with first-class English and Arabic document support.