---
license: apache-2.0
library_name: PaddleOCR
language:
- en
- zh
pipeline_tag: image-to-text
tags:
- OCR
- PaddlePaddle
- PaddleOCR
- layout_detection
---
# PP-DocLayout_plus-L
## Introduction
PP-DocLayout_plus-L is a higher-precision layout area localization model based on RT-DETR-L, trained on a self-built dataset containing Chinese and English papers, PPT slides, multi-layout magazines, contracts, books, exams, ancient books, and research reports. The layout detection model includes 20 common categories: document title, paragraph title, text, page number, abstract, table, references, footnotes, header, footer, algorithm, formula, formula number, image, table, seal, figure_table title, chart, sidebar text, and lists of references. The key metrics are as follows:
| Model | mAP(0.5) (%) |
| --- | --- |
| PP-DocLayout_plus-L | 83.2 |
**Note**: The metric above was evaluated on a self-built layout area detection dataset of 1,000 document images, including Chinese and English papers, magazines, newspapers, research reports, PPT slides, test papers, and textbooks.
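For readers unfamiliar with the metric, mAP(0.5) counts a predicted region as correct when its intersection over union (IoU) with a ground-truth region of the same class is at least 0.5. The sketch below illustrates that matching criterion only; the function names and the sample boxes are illustrative assumptions, and the exact evaluation protocol used for the number above is not described here.

```python
def box_iou(box_a, box_b):
    """Intersection over union of two (x_min, y_min, x_max, y_max) boxes."""
    ax0, ay0, ax1, ay1 = box_a
    bx0, by0, bx1, by1 = box_b
    # Width and height of the overlap, clamped at zero when the boxes are disjoint.
    inter_w = max(0.0, min(ax1, bx1) - max(ax0, bx0))
    inter_h = max(0.0, min(ay1, by1) - max(ay0, by0))
    inter = inter_w * inter_h
    union = (ax1 - ax0) * (ay1 - ay0) + (bx1 - bx0) * (by1 - by0) - inter
    return inter / union if union > 0 else 0.0


def is_true_positive(pred_box, gt_box, iou_threshold=0.5):
    """True when the prediction overlaps the ground truth at or above the threshold."""
    return box_iou(pred_box, gt_box) >= iou_threshold


# Hypothetical boxes, chosen only to illustrate the computation.
pred = (10, 10, 110, 60)
gt = (20, 10, 120, 60)
print(round(box_iou(pred, gt), 3))  # 0.818
print(is_true_positive(pred, gt))   # True
```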
## Model Usage
```python
import requests
from PIL import Image
from transformers import AutoImageProcessor, AutoModelForObjectDetection

# Load the model and its matching image processor from the Hugging Face Hub.
model_path = "PaddlePaddle/PP-DocLayout_plus-L_safetensors"
model = AutoModelForObjectDetection.from_pretrained(model_path)
image_processor = AutoImageProcessor.from_pretrained(model_path)

# Download the demo image and run layout detection on it.
image_url = "https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/layout.jpg"
image = Image.open(requests.get(image_url, stream=True).raw)
inputs = image_processor(images=image, return_tensors="pt")
outputs = model(**inputs)

# Rescale boxes to the original image size; target_sizes expects (height, width).
results = image_processor.post_process_object_detection(outputs, target_sizes=[image.size[::-1]])

# Print the label, confidence score, and box for each detection.
for result in results:
    for score, label_id, box in zip(result["scores"], result["labels"], result["boxes"]):
        score, label = score.item(), label_id.item()
        box = [round(i, 2) for i in box.tolist()]
        print(f"{model.config.id2label[label]}: {score:.2f} {box}")
```
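Downstream code often needs the detections in a more structured form than printed lines. The following is a minimal sketch, assuming the detections have already been converted to plain dictionaries with `label`, `score`, and `box` keys, where `box` is `[x_min, y_min, x_max, y_max]`. The helper `group_detections`, the dictionary layout, and the sample values are hypothetical, and the label strings may differ from the ones in `model.config.id2label`.

```python
from collections import defaultdict


def group_detections(detections, min_score=0.5):
    """Drop low-confidence detections and group the rest by label."""
    grouped = defaultdict(list)
    for det in detections:
        if det["score"] >= min_score:
            grouped[det["label"]].append(det)
    for dets in grouped.values():
        # Order each group by top edge (y_min), then left edge (x_min).
        dets.sort(key=lambda d: (d["box"][1], d["box"][0]))
    return dict(grouped)


# Hypothetical detections, standing in for values collected from the model output.
sample = [
    {"label": "text", "score": 0.91, "box": [50.0, 300.0, 500.0, 420.0]},
    {"label": "text", "score": 0.88, "box": [50.0, 120.0, 500.0, 280.0]},
    {"label": "table", "score": 0.32, "box": [60.0, 450.0, 480.0, 700.0]},
]

for label, dets in group_detections(sample).items():
    print(label, [d["box"] for d in dets])
```

Sorting by the top edge is only a rough approximation of reading order and will not handle multi-column pages correctly; the `min_score` default of 0.5 is likewise an arbitrary choice for illustration.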