Object Detection
Transformers
ONNX
Chinese
English
document-ai
document-layout-analysis
patent
pdf
hiro
patsnap
Initial open-source release
- .gitattributes +0 -2
- README.md +64 -9
- README_zh.md +64 -9
- layout模型benchmark.xlsx +0 -0
.gitattributes
CHANGED
|
@@ -5,5 +5,3 @@
|
|
| 5 |
*.safetensors filter=lfs diff=lfs merge=lfs -text
|
| 6 |
*.onnx filter=lfs diff=lfs merge=lfs -text
|
| 7 |
*.xlsx filter=lfs diff=lfs merge=lfs -text
|
| 8 |
-
python=3.12/lib/python3.12/site-packages/hf_xet/hf_xet.abi3.so filter=lfs diff=lfs merge=lfs -text
|
| 9 |
-
python=3.12/lib/python3.12/site-packages/yaml/_yaml.cpython-312-darwin.so filter=lfs diff=lfs merge=lfs -text
|
|
|
|
| 5 |
*.safetensors filter=lfs diff=lfs merge=lfs -text
|
| 6 |
*.onnx filter=lfs diff=lfs merge=lfs -text
|
| 7 |
*.xlsx filter=lfs diff=lfs merge=lfs -text
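After this change, three patterns remain that route files through Git LFS. As a rough illustration of which repository files they cover, the sketch below approximates gitattributes matching with Python's `fnmatch` applied to the file's basename. Real Git pattern rules are richer than this, and the helper name `is_lfs_tracked` is hypothetical.

```python
from fnmatch import fnmatch
from posixpath import basename

# Patterns kept in .gitattributes after this commit.
LFS_PATTERNS = ["*.safetensors", "*.onnx", "*.xlsx"]


def is_lfs_tracked(path: str) -> bool:
    """Approximate check: slash-free patterns are matched against the basename."""
    name = basename(path)
    return any(fnmatch(name, pattern) for pattern in LFS_PATTERNS)


print(is_lfs_tracked("layout_model/RT-DETR_25.onnx"))
print(is_lfs_tracked("README.md"))
```

Under this approximation the ONNX export and the benchmark spreadsheet are LFS-tracked, while the README files are stored as plain text.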
|
|
|
|
|
|
README.md
CHANGED
|
@@ -26,16 +26,11 @@ English | [简体中文](README_zh.md)
|
|
| 26 |
|
| 27 |
Hiro-Layout is a document layout analysis model for patent and technical PDF pages. It detects and classifies page regions such as text, titles, headers, footers, tables, formulas, chemical structures, figures, captions, search reports, bibliographies, and other patent-specific layout elements.
|
| 28 |
|
| 29 |
-
This repository is prepared for open release in the same style as PatSnap open model cards such as [Hiro-MOSS-OCR-0.3B](https://huggingface.co/PatSnap/Hiro-MOSS-OCR-0.3B) and [TranslationGPT-1.2](https://huggingface.co/PatSnap/TranslationGPT-1.2).
|
| 30 |
-
|
| 31 |
-
> Release note: model weights, inference code, exact architecture details, and dataset release permissions should be confirmed before publishing.
|
| 32 |
-
|
| 33 |
## Highlights
|
| 34 |
|
| 35 |
- Patent-focused layout understanding: covers common patent PDF regions and patent-specific structures.
|
| 36 |
- Technical document coverage: evaluated on both patent PDFs and NPD PDFs.
|
| 37 |
- Fine-grained taxonomy: 25 layout categories across figure, text, and complex document elements.
|
| 38 |
-
- Open evaluation summary: benchmark results are included in `layout模型benchmark.xlsx` and summarized in [EVALUATION.md](EVALUATION.md).
|
| 39 |
|
| 40 |
## Model Overview
|
| 41 |
|
|
@@ -82,14 +77,74 @@ This repository is prepared for open release in the same style as PatSnap open m
|
|
| 82 |
|
| 83 |
## Benchmarks
|
| 84 |
|
| 85 |
-
|
| 86 |
|
| 87 |
| Benchmark | Labels | Precision | Recall | F1 |
|
| 88 |
| --- | ---: | ---: | ---: | ---: |
|
| 89 |
| Patent PDF | 33,054 | 0.8144 | 0.7711 | 0.7922 |
|
| 90 |
| NPD PDF | 17,769 | 0.7090 | 0.6983 | 0.7036 |
|
| 91 |
|
| 92 |
-
|
| 93 |
|
| 94 |
## Usage
|
| 95 |
|
|
@@ -99,7 +154,7 @@ The current model artifact is an ONNX export:
|
|
| 99 |
layout_model/RT-DETR_25.onnx
|
| 100 |
```
|
| 101 |
|
| 102 |
-
The
|
| 103 |
|
| 104 |
```python
|
| 105 |
import onnxruntime as ort
|
|
@@ -109,7 +164,7 @@ print("inputs:", [i.name for i in session.get_inputs()])
|
|
| 109 |
print("outputs:", [o.name for o in session.get_outputs()])
|
| 110 |
```
|
| 111 |
|
| 112 |
-
|
| 113 |
|
| 114 |
## Repository Files
|
| 115 |
|
|
|
|
| 26 |
|
| 27 |
Hiro-Layout is a document layout analysis model for patent and technical PDF pages. It detects and classifies page regions such as text, titles, headers, footers, tables, formulas, chemical structures, figures, captions, search reports, bibliographies, and other patent-specific layout elements.
|
| 28 |
|
|
|
|
|
|
|
|
|
|
|
|
|
| 29 |
## Highlights
|
| 30 |
|
| 31 |
- Patent-focused layout understanding: covers common patent PDF regions and patent-specific structures.
|
| 32 |
- Technical document coverage: evaluated on both patent PDFs and NPD PDFs.
|
| 33 |
- Fine-grained taxonomy: 25 layout categories across figure, text, and complex document elements.
|
|
|
|
| 34 |
|
| 35 |
## Model Overview
|
| 36 |
|
|
|
|
| 77 |
|
| 78 |
## Benchmarks
|
| 79 |
|
| 80 |
+
Metrics are reported as Precision, Recall, and F1.
|
| 81 |
|
| 82 |
| Benchmark | Labels | Precision | Recall | F1 |
|
| 83 |
| --- | ---: | ---: | ---: | ---: |
|
| 84 |
| Patent PDF | 33,054 | 0.8144 | 0.7711 | 0.7922 |
|
| 85 |
| NPD PDF | 17,769 | 0.7090 | 0.6983 | 0.7036 |
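F1 is conventionally the harmonic mean of precision and recall, and the summary rows are consistent with that definition. The sketch below recomputes F1 from the reported precision and recall. The function name `f1_score` is illustrative, and the README does not state the matching criterion (for example an IoU threshold) used to count a detection as correct.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) copied from the summary table above.
summary = {
    "Patent PDF": (0.8144, 0.7711, 0.7922),
    "NPD PDF": (0.7090, 0.6983, 0.7036),
}

for name, (p, r, reported) in summary.items():
    print(f"{name}: computed F1 = {f1_score(p, r):.4f}, reported F1 = {reported:.4f}")
```

Rounded to four decimals, the computed values match the reported F1 for both benchmarks.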
|
| 86 |
|
| 87 |
+
### Patent PDF
|
| 88 |
+
|
| 89 |
+
| # | Group | Abbr. | Class | Chinese | Labels | Precision | Recall | F1 |
|
| 90 |
+
|---:|---|---|---|---|---:|---:|---:|---:|
|
| 91 |
+
| 1 | figure | graph | graph | 图表 | 215 | 0.7611 | 0.8000 | 0.7800 |
|
| 92 |
+
| 2 | figure | draw | drawing | 绘制图 | 420 | 0.8649 | 0.3048 | 0.4507 |
|
| 93 |
+
| 3 | figure | struc | structure diagram | 结构图 | 626 | 0.6579 | 0.8355 | 0.7361 |
|
| 94 |
+
| 4 | figure | photo | photograph | 照片 | 147 | 0.8378 | 0.8435 | 0.8407 |
|
| 95 |
+
| 5 | figure | tab | table | 表格 | 198 | 0.7759 | 0.9091 | 0.8372 |
|
| 96 |
+
| 6 | figure | eqn | math equation | 数学公式 | 399 | 0.7762 | 0.6692 | 0.7187 |
|
| 97 |
+
| 7 | figure | chem | chemical formula | 化学式 | 1,099 | 0.8792 | 0.8944 | 0.8868 |
|
| 98 |
+
| 8 | figure | noise | noise | 噪声 | 1,241 | 0.7025 | 0.7687 | 0.7341 |
|
| 99 |
+
| 9 | text | text | text | 文本 | 17,668 | 0.8182 | 0.8062 | 0.8122 |
|
| 100 |
+
| 10 | text | title | title | 标题 | 601 | 0.9117 | 0.8070 | 0.8561 |
|
| 101 |
+
| 11 | text | sec | section title | 章节标题 | 1,394 | 0.7968 | 0.7088 | 0.7502 |
|
| 102 |
+
| 12 | text | head | page header | 页眉 | 3,074 | 0.8187 | 0.7788 | 0.7983 |
|
| 103 |
+
| 13 | text | foot | page footer | 页脚 | 1,012 | 0.7432 | 0.6433 | 0.6896 |
|
| 104 |
+
| 14 | text | mnote | marginal note | 边注 | 421 | 0.7794 | 0.5202 | 0.6239 |
|
| 105 |
+
| 15 | text | cap | caption | 说明 | 80 | 0.6842 | 0.4875 | 0.5693 |
|
| 106 |
+
| 16 | text | figno | figure number | 编号 | 1,389 | 0.8955 | 0.7466 | 0.8143 |
|
| 107 |
+
| 17 | text | lineno | line number | 行号 | 341 | 0.7759 | 0.6598 | 0.7132 |
|
| 108 |
+
| 18 | text | colno | column number | 栏号 | 449 | 0.6964 | 0.4699 | 0.5612 |
|
| 109 |
+
| 19 | text | seq | sequence | 序列表 | 136 | 0.4430 | 0.2574 | 0.3256 |
|
| 110 |
+
| 20 | complex | figcx | figure complex | 图片组 | 1,416 | 0.8657 | 0.7373 | 0.7963 |
|
| 111 |
+
| 21 | complex | rxn | chemical reaction | 反应式 | 150 | 0.8898 | 0.7000 | 0.7836 |
|
| 112 |
+
| 22 | complex | bib | bibliography | 著录页 | 470 | 0.9615 | 0.7979 | 0.8721 |
|
| 113 |
+
| 23 | complex | srep | search report | 搜索报告 | 106 | 0.9052 | 0.9906 | 0.9459 |
|
| 114 |
+
| 24 | complex | toc | Table of Contents | 目录 | 0 | 0.0000 | 0.0000 | 0.0000 |
|
| 115 |
+
| 25 | complex | ref | reference | 参考文献 | 2 | 0.0000 | 0.0000 | 0.0000 |
|
| 116 |
+
| ALL | | | | | 33,054 | 0.8144 | 0.7711 | 0.7922 |
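Per-class scores are only as informative as the number of ground-truth labels behind them. On the Patent PDF set, `toc` has 0 labels and `ref` has 2, so their 0.0000 scores say little about model quality. The sketch below flags such low-support classes using a handful of rows copied from the table; the cutoff of 100 labels and the helper name `low_support` are illustrative assumptions, not part of the README.

```python
# (abbr, labels, f1) for a few rows copied from the Patent PDF table.
rows = [
    ("text", 17668, 0.8122),
    ("srep", 106, 0.9459),
    ("cap", 80, 0.5693),
    ("toc", 0, 0.0000),
    ("ref", 2, 0.0000),
]

MIN_LABELS = 100  # illustrative cutoff, not taken from the README


def low_support(rows, min_labels=MIN_LABELS):
    """Return abbreviations of classes with fewer than min_labels ground-truth labels."""
    return [abbr for abbr, labels, _ in rows if labels < min_labels]


print(low_support(rows))  # ['cap', 'toc', 'ref']
```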
|
| 117 |
+
|
| 118 |
+
### NPD PDF
|
| 119 |
+
|
| 120 |
+
| # | Group | Abbr. | Class | Chinese | Labels | Precision | Recall | F1 |
|
| 121 |
+
|---:|---|---|---|---|---:|---:|---:|---:|
|
| 122 |
+
| 1 | figure | graph | graph | 图表 | 248 | 0.6838 | 0.6976 | 0.6906 |
|
| 123 |
+
| 2 | figure | draw | drawing | 绘制图 | 9 | 0.0000 | 0.0000 | 0.0000 |
|
| 124 |
+
| 3 | figure | struc | structure diagram | 结构图 | 341 | 0.7454 | 0.7126 | 0.7286 |
|
| 125 |
+
| 4 | figure | photo | photograph | 照片 | 82 | 0.6071 | 0.6220 | 0.6145 |
|
| 126 |
+
| 5 | figure | tab | table | 表格 | 209 | 0.7533 | 0.8182 | 0.7844 |
|
| 127 |
+
| 6 | figure | eqn | math equation | 数学公式 | 298 | 0.6789 | 0.5604 | 0.6140 |
|
| 128 |
+
| 7 | figure | chem | chemical formula | 化学式 | 388 | 0.7324 | 0.8325 | 0.7793 |
|
| 129 |
+
| 8 | figure | noise | noise | 噪声 | 695 | 0.4823 | 0.4302 | 0.4548 |
|
| 130 |
+
| 9 | text | text | text | 文本 | 9,119 | 0.6943 | 0.7625 | 0.7268 |
|
| 131 |
+
| 10 | text | title | title | 标题 | 304 | 0.7130 | 0.5395 | 0.6142 |
|
| 132 |
+
| 11 | text | sec | section title | 章节标题 | 1,539 | 0.7337 | 0.6160 | 0.6697 |
|
| 133 |
+
| 12 | text | head | page header | 页眉 | 1,246 | 0.7464 | 0.7111 | 0.7283 |
|
| 134 |
+
| 13 | text | foot | page footer | 页脚 | 1,339 | 0.7711 | 0.6468 | 0.7035 |
|
| 135 |
+
| 14 | text | mnote | marginal note | 边注 | 190 | 0.5714 | 0.2947 | 0.3889 |
|
| 136 |
+
| 15 | text | cap | caption | 说明 | 573 | 0.8711 | 0.5899 | 0.7034 |
|
| 137 |
+
| 16 | text | figno | figure number | 编号 | 149 | 0.6078 | 0.4161 | 0.4940 |
|
| 138 |
+
| 17 | text | lineno | line number | 行号 | 41 | 0.6667 | 0.9268 | 0.7755 |
|
| 139 |
+
| 18 | text | colno | column number | 栏号 | 0 | 0.0000 | 0.0000 | 0.0000 |
|
| 140 |
+
| 19 | text | seq | sequence | 序列表 | 18 | 0.7000 | 0.3889 | 0.5000 |
|
| 141 |
+
| 20 | complex | figcx | figure complex | 图片组 | 734 | 0.7657 | 0.7480 | 0.7567 |
|
| 142 |
+
| 21 | complex | rxn | chemical reaction | 反应式 | 36 | 0.8947 | 0.4722 | 0.6182 |
|
| 143 |
+
| 22 | complex | bib | bibliography | 著录页 | 0 | 0.0000 | 0.0000 | 0.0000 |
|
| 144 |
+
| 23 | complex | srep | search report | 搜索报告 | 3 | 0.4286 | 1.0000 | 0.6000 |
|
| 145 |
+
| 24 | complex | toc | Table of Contents | 目录 | 76 | 0.8475 | 0.6579 | 0.7407 |
|
| 146 |
+
| 25 | complex | ref | reference | 参考文献 | 132 | 0.8148 | 0.3333 | 0.4731 |
|
| 147 |
+
| ALL | | | | | 17,769 | 0.7090 | 0.6983 | 0.7036 |
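The per-class label counts can be cross-checked against the `ALL` rows. The sketch below sums the `Labels` column of both tables, with counts copied in class order 1 to 25; the variable names are illustrative.

```python
# Labels column of the Patent PDF table, classes 1..25.
patent_labels = [
    215, 420, 626, 147, 198, 399, 1099, 1241,                      # figure
    17668, 601, 1394, 3074, 1012, 421, 80, 1389, 341, 449, 136,    # text
    1416, 150, 470, 106, 0, 2,                                     # complex
]

# Labels column of the NPD PDF table, classes 1..25.
npd_labels = [
    248, 9, 341, 82, 209, 298, 388, 695,                           # figure
    9119, 304, 1539, 1246, 1339, 190, 573, 149, 41, 0, 18,         # text
    734, 36, 0, 3, 76, 132,                                        # complex
]

print(len(patent_labels), sum(patent_labels))  # 25 33054
print(len(npd_labels), sum(npd_labels))        # 25 17769
```

Both sums agree with the totals reported in the `ALL` rows and in the summary table.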
|
| 148 |
|
| 149 |
## Usage
|
| 150 |
|
|
|
|
| 154 |
layout_model/RT-DETR_25.onnx
|
| 155 |
```
|
| 156 |
|
| 157 |
+
The model can be loaded with ONNXRuntime:
|
| 158 |
|
| 159 |
```python
|
| 160 |
import onnxruntime as ort
|
|
|
|
| 164 |
print("outputs:", [o.name for o in session.get_outputs()])
|
| 165 |
```
|
| 166 |
|
| 167 |
+
Use `labels.json` for the 25-class label mapping.
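The layout of `labels.json` is not shown in this diff, so the following is only a sketch under stated assumptions: the file is either a JSON list of class names (index = class id, zero-based) or a JSON object mapping id strings to names. The helper name `load_label_map` is hypothetical, and the demo writes a stand-in file rather than reading the real one.

```python
import json
import tempfile
from pathlib import Path


def load_label_map(path):
    """Return {class_id: name}. Assumes a JSON list of names or an {id: name} object."""
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    if isinstance(data, list):
        return dict(enumerate(data))
    return {int(k): v for k, v in data.items()}


# Demo with a stand-in file; the real labels.json layout may differ.
with tempfile.TemporaryDirectory() as tmp:
    demo = Path(tmp) / "labels.json"
    demo.write_text(json.dumps(["graph", "drawing", "structure diagram"]), encoding="utf-8")
    label_map = load_label_map(demo)

print(label_map[0])  # graph
```

If the real file uses a different structure, only the body of `load_label_map` needs to change; downstream code can keep looking up names by integer class id.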
|
| 168 |
|
| 169 |
## Repository Files
|
| 170 |
|
README_zh.md
CHANGED
|
@@ -26,16 +26,11 @@ library_name: transformers
|
|
| 26 |
|
| 27 |
Hiro-Layout 是一个面向专利和技术 PDF 页面图像的文档版面分析模型,用于检测并分类页面区域,包括正文、标题、页眉、页脚、表格、公式、化学式、图片、图注、搜索报告、著录页、参考文献等专利场景常见版面元素。
|
| 28 |
|
| 29 |
-
本仓库按 PatSnap 已发布开源模型卡的结构准备,例如 [Hiro-MOSS-OCR-0.3B](https://huggingface.co/PatSnap/Hiro-MOSS-OCR-0.3B) 和 [TranslationGPT-1.2](https://huggingface.co/PatSnap/TranslationGPT-1.2)。
|
| 30 |
-
|
| 31 |
-
> 发布前请确认:模型权重、推理代码、准确架构信息、数据集和评测结果是否满足公开发布要求。
|
| 32 |
-
|
| 33 |
## 亮点
|
| 34 |
|
| 35 |
- 面向专利文档:覆盖专利 PDF 中常见的正文、图片、表格、公式、著录页、搜索报告等元素。
|
| 36 |
- 覆盖技术文档:在 Patent PDF 和 NPD PDF 两类数据上评测。
|
| 37 |
- 细粒度类别体系:共 25 个版面类别,覆盖 figure、text、complex 三组元素。
|
| 38 |
-
- 评测结果可追溯:原始评测数据保存在 `layout模型benchmark.xlsx`,详细结果见 [EVALUATION.md](EVALUATION.md)。
|
| 39 |
|
| 40 |
## 模型概览
|
| 41 |
|
|
@@ -82,14 +77,74 @@ Hiro-Layout 是一个面向专利和技术 PDF 页面图像的文档版面分析
|
|
| 82 |
|
| 83 |
## 评测结果
|
| 84 |
|
| 85 |
-
评测
|
| 86 |
|
| 87 |
| 数据集 | 人工标签数 | Precision | Recall | F1 |
|
| 88 |
| --- | ---: | ---: | ---: | ---: |
|
| 89 |
| Patent PDF | 33,054 | 0.8144 | 0.7711 | 0.7922 |
|
| 90 |
| NPD PDF | 17,769 | 0.7090 | 0.6983 | 0.7036 |
|
| 91 |
|
| 92 |
-
|
| 93 |
|
| 94 |
## 使用方式
|
| 95 |
|
|
@@ -99,7 +154,7 @@ Hiro-Layout 是一个面向专利和技术 PDF 页面图像的文档版面分析
|
|
| 99 |
layout_model/RT-DETR_25.onnx
|
| 100 |
```
|
| 101 |
|
| 102 |
-
|
| 103 |
|
| 104 |
```python
|
| 105 |
import onnxruntime as ort
|
|
@@ -109,7 +164,7 @@ print("inputs:", [i.name for i in session.get_inputs()])
|
|
| 109 |
print("outputs:", [o.name for o in session.get_outputs()])
|
| 110 |
```
|
| 111 |
|
| 112 |
-
|
| 113 |
|
| 114 |
## 文件说明
|
| 115 |
|
|
|
|
| 26 |
|
| 27 |
Hiro-Layout 是一个面向专利和技术 PDF 页面图像的文档版面分析模型,用于检测并分类页面区域,包括正文、标题、页眉、页脚、表格、公式、化学式、图片、图注、搜索报告、著录页、参考文献等专利场景常见版面元素。
|
| 28 |
|
|
|
|
|
|
|
|
|
|
|
|
|
| 29 |
## 亮点
|
| 30 |
|
| 31 |
- 面向专利文档:覆盖专利 PDF 中常见的正文、图片、表格、公式、著录页、搜索报告等元素。
|
| 32 |
- 覆盖技术文档:在 Patent PDF 和 NPD PDF 两类数据上评测。
|
| 33 |
- 细粒度类别体系:共 25 个版面类别,覆盖 figure、text、complex 三组元素。
|
|
|
|
| 34 |
|
| 35 |
## 模型概览
|
| 36 |
|
|
|
|
| 77 |
|
| 78 |
## 评测结果
|
| 79 |
|
| 80 |
+
评测指标为 Precision、Recall 和 F1。
|
| 81 |
|
| 82 |
| 数据集 | 人工标签数 | Precision | Recall | F1 |
|
| 83 |
| --- | ---: | ---: | ---: | ---: |
|
| 84 |
| Patent PDF | 33,054 | 0.8144 | 0.7711 | 0.7922 |
|
| 85 |
| NPD PDF | 17,769 | 0.7090 | 0.6983 | 0.7036 |
|
| 86 |
|
| 87 |
+
### Patent PDF
|
| 88 |
+
|
| 89 |
+
| # | 大类 | 缩写 | 类别全称 | 中文名 | 人工标签数 | Precision | Recall | F1 |
|
| 90 |
+
|---:|---|---|---|---|---:|---:|---:|---:|
|
| 91 |
+
| 1 | figure | graph | graph | 图表 | 215 | 0.7611 | 0.8000 | 0.7800 |
|
| 92 |
+
| 2 | figure | draw | drawing | 绘制图 | 420 | 0.8649 | 0.3048 | 0.4507 |
|
| 93 |
+
| 3 | figure | struc | structure diagram | 结构图 | 626 | 0.6579 | 0.8355 | 0.7361 |
|
| 94 |
+
| 4 | figure | photo | photograph | 照片 | 147 | 0.8378 | 0.8435 | 0.8407 |
|
| 95 |
+
| 5 | figure | tab | table | 表格 | 198 | 0.7759 | 0.9091 | 0.8372 |
|
| 96 |
+
| 6 | figure | eqn | math equation | 数学公式 | 399 | 0.7762 | 0.6692 | 0.7187 |
|
| 97 |
+
| 7 | figure | chem | chemical formula | 化学式 | 1,099 | 0.8792 | 0.8944 | 0.8868 |
|
| 98 |
+
| 8 | figure | noise | noise | 噪声 | 1,241 | 0.7025 | 0.7687 | 0.7341 |
|
| 99 |
+
| 9 | text | text | text | 文本 | 17,668 | 0.8182 | 0.8062 | 0.8122 |
|
| 100 |
+
| 10 | text | title | title | 标题 | 601 | 0.9117 | 0.8070 | 0.8561 |
|
| 101 |
+
| 11 | text | sec | section title | 章节标题 | 1,394 | 0.7968 | 0.7088 | 0.7502 |
|
| 102 |
+
| 12 | text | head | page header | 页眉 | 3,074 | 0.8187 | 0.7788 | 0.7983 |
|
| 103 |
+
| 13 | text | foot | page footer | 页脚 | 1,012 | 0.7432 | 0.6433 | 0.6896 |
|
| 104 |
+
| 14 | text | mnote | marginal note | 边注 | 421 | 0.7794 | 0.5202 | 0.6239 |
|
| 105 |
+
| 15 | text | cap | caption | 说明 | 80 | 0.6842 | 0.4875 | 0.5693 |
|
| 106 |
+
| 16 | text | figno | figure number | 编号 | 1,389 | 0.8955 | 0.7466 | 0.8143 |
|
| 107 |
+
| 17 | text | lineno | line number | 行号 | 341 | 0.7759 | 0.6598 | 0.7132 |
|
| 108 |
+
| 18 | text | colno | column number | 栏号 | 449 | 0.6964 | 0.4699 | 0.5612 |
|
| 109 |
+
| 19 | text | seq | sequence | 序列表 | 136 | 0.4430 | 0.2574 | 0.3256 |
|
| 110 |
+
| 20 | complex | figcx | figure complex | 图片组 | 1,416 | 0.8657 | 0.7373 | 0.7963 |
|
| 111 |
+
| 21 | complex | rxn | chemical reaction | 反应式 | 150 | 0.8898 | 0.7000 | 0.7836 |
|
| 112 |
+
| 22 | complex | bib | bibliography | 著录页 | 470 | 0.9615 | 0.7979 | 0.8721 |
|
| 113 |
+
| 23 | complex | srep | search report | 搜索报告 | 106 | 0.9052 | 0.9906 | 0.9459 |
|
| 114 |
+
| 24 | complex | toc | Table of Contents | 目录 | 0 | 0.0000 | 0.0000 | 0.0000 |
|
| 115 |
+
| 25 | complex | ref | reference | 参考文献 | 2 | 0.0000 | 0.0000 | 0.0000 |
|
| 116 |
+
| ALL | | | | | 33,054 | 0.8144 | 0.7711 | 0.7922 |
|
| 117 |
+
|
| 118 |
+
### NPD PDF
|
| 119 |
+
|
| 120 |
+
| # | 大类 | 缩写 | 类别全称 | 中文名 | 人工标签数 | Precision | Recall | F1 |
|
| 121 |
+
|---:|---|---|---|---|---:|---:|---:|---:|
|
| 122 |
+
| 1 | figure | graph | graph | 图表 | 248 | 0.6838 | 0.6976 | 0.6906 |
|
| 123 |
+
| 2 | figure | draw | drawing | 绘制图 | 9 | 0.0000 | 0.0000 | 0.0000 |
|
| 124 |
+
| 3 | figure | struc | structure diagram | 结构图 | 341 | 0.7454 | 0.7126 | 0.7286 |
|
| 125 |
+
| 4 | figure | photo | photograph | 照片 | 82 | 0.6071 | 0.6220 | 0.6145 |
|
| 126 |
+
| 5 | figure | tab | table | 表格 | 209 | 0.7533 | 0.8182 | 0.7844 |
|
| 127 |
+
| 6 | figure | eqn | math equation | 数学公式 | 298 | 0.6789 | 0.5604 | 0.6140 |
|
| 128 |
+
| 7 | figure | chem | chemical formula | 化学式 | 388 | 0.7324 | 0.8325 | 0.7793 |
|
| 129 |
+
| 8 | figure | noise | noise | 噪声 | 695 | 0.4823 | 0.4302 | 0.4548 |
|
| 130 |
+
| 9 | text | text | text | 文本 | 9,119 | 0.6943 | 0.7625 | 0.7268 |
|
| 131 |
+
| 10 | text | title | title | 标题 | 304 | 0.7130 | 0.5395 | 0.6142 |
|
| 132 |
+
| 11 | text | sec | section title | 章节标题 | 1,539 | 0.7337 | 0.6160 | 0.6697 |
|
| 133 |
+
| 12 | text | head | page header | 页眉 | 1,246 | 0.7464 | 0.7111 | 0.7283 |
|
| 134 |
+
| 13 | text | foot | page footer | 页脚 | 1,339 | 0.7711 | 0.6468 | 0.7035 |
|
| 135 |
+
| 14 | text | mnote | marginal note | 边注 | 190 | 0.5714 | 0.2947 | 0.3889 |
|
| 136 |
+
| 15 | text | cap | caption | 说明 | 573 | 0.8711 | 0.5899 | 0.7034 |
|
| 137 |
+
| 16 | text | figno | figure number | 编号 | 149 | 0.6078 | 0.4161 | 0.4940 |
|
| 138 |
+
| 17 | text | lineno | line number | 行号 | 41 | 0.6667 | 0.9268 | 0.7755 |
|
| 139 |
+
| 18 | text | colno | column number | 栏号 | 0 | 0.0000 | 0.0000 | 0.0000 |
|
| 140 |
+
| 19 | text | seq | sequence | 序列表 | 18 | 0.7000 | 0.3889 | 0.5000 |
|
| 141 |
+
| 20 | complex | figcx | figure complex | 图片组 | 734 | 0.7657 | 0.7480 | 0.7567 |
|
| 142 |
+
| 21 | complex | rxn | chemical reaction | 反应式 | 36 | 0.8947 | 0.4722 | 0.6182 |
|
| 143 |
+
| 22 | complex | bib | bibliography | 著录页 | 0 | 0.0000 | 0.0000 | 0.0000 |
|
| 144 |
+
| 23 | complex | srep | search report | 搜索报告 | 3 | 0.4286 | 1.0000 | 0.6000 |
|
| 145 |
+
| 24 | complex | toc | Table of Contents | 目录 | 76 | 0.8475 | 0.6579 | 0.7407 |
|
| 146 |
+
| 25 | complex | ref | reference | 参考文献 | 132 | 0.8148 | 0.3333 | 0.4731 |
|
| 147 |
+
| ALL | | | | | 17,769 | 0.7090 | 0.6983 | 0.7036 |
|
| 148 |
|
| 149 |
## 使用方式
|
| 150 |
|
|
|
|
| 154 |
layout_model/RT-DETR_25.onnx
|
| 155 |
```
|
| 156 |
|
| 157 |
+
模型可使用 ONNXRuntime 加载:
|
| 158 |
|
| 159 |
```python
|
| 160 |
import onnxruntime as ort
|
|
|
|
| 164 |
print("outputs:", [o.name for o in session.get_outputs()])
|
| 165 |
```
|
| 166 |
|
| 167 |
+
25 类标签映射见 `labels.json`。
|
| 168 |
|
| 169 |
## 文件说明
|
| 170 |
|
layout模型benchmark.xlsx
CHANGED
|
Binary files "a/layout\346\250\241\345\236\213benchmark.xlsx" and "b/layout\346\250\241\345\236\213benchmark.xlsx" differ
|
|
|