How to use PaddlePaddle/PaddleOCR-VL with PaddleOCR:
# See https://www.paddleocr.ai/latest/version3.x/pipeline_usage/PaddleOCR-VL.html for installation instructions.
from paddleocr import PaddleOCRVL

pipeline = PaddleOCRVL(pipeline_version="v1")
output = pipeline.predict("path/to/document_image.png")
for res in output:
    res.print()
    res.save_to_json(save_path="output")
    res.save_to_markdown(save_path="output")
File size: 1,482 Bytes (commit 6e98c1c)
mode: paddle
draw_threshold: 0.5
metric: COCO
use_dynamic_shape: false
Global:
  model_name: PP-DocLayoutV2
arch: DETR
min_subgraph_size: 3
Preprocess:
- interp: 2
  keep_ratio: false
  target_size:
  - 800
  - 800
  type: Resize
- mean:
  - 0.0
  - 0.0
  - 0.0
  norm_type: none
  std:
  - 1.0
  - 1.0
  - 1.0
  type: NormalizeImage
- type: Permute
label_list:
- abstract
- algorithm
- aside_text
- chart
- content
- display_formula
- doc_title
- figure_title
- footer
- footer_image
- footnote
- formula_number
- header
- header_image
- image
- inline_formula
- number
- paragraph_title
- reference
- reference_content
- seal
- table
- text
- vertical_text
- vision_footnote
Hpi:
  backend_configs:
    paddle_infer:
      trt_dynamic_shapes: &id001
        image:
        - - 1
          - 3
          - 800
          - 800
        - - 1
          - 3
          - 800
          - 800
        - - 8
          - 3
          - 800
          - 800
        scale_factor:
        - - 1
          - 2
        - - 1
          - 2
        - - 8
          - 2
      trt_dynamic_shape_input_data:
        scale_factor:
        - - 2
          - 2
        - - 1
          - 1
        - - 0.67
          - 0.67
          - 0.67
          - 0.67
          - 0.67
          - 0.67
          - 0.67
          - 0.67
          - 0.67
          - 0.67
          - 0.67
          - 0.67
          - 0.67
          - 0.67
          - 0.67
          - 0.67
    tensorrt:
      dynamic_shapes: *id001
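To make the roles of `draw_threshold`, `label_list`, and the dynamic-shape entries concrete, here is a minimal Python sketch that uses only values copied from the config. It is not PaddleOCR's own post-processing. It rests on three assumptions: each detection row has the form `(class_id, score, x1, y1, x2, y2)`, a detection is kept when its score is at or above the threshold, and the three `image` shapes are ordered as minimum, optimal, and maximum. The helper names `filter_detections` and `shape_in_range` are hypothetical.

```python
# Values copied from the config above.
DRAW_THRESHOLD = 0.5
LABEL_LIST = [
    "abstract", "algorithm", "aside_text", "chart", "content",
    "display_formula", "doc_title", "figure_title", "footer",
    "footer_image", "footnote", "formula_number", "header",
    "header_image", "image", "inline_formula", "number",
    "paragraph_title", "reference", "reference_content", "seal",
    "table", "text", "vertical_text", "vision_footnote",
]
# Assumption: the three entries are the [min, opt, max] shapes of "image".
IMAGE_SHAPES = [[1, 3, 800, 800], [1, 3, 800, 800], [8, 3, 800, 800]]


def filter_detections(rows, threshold=DRAW_THRESHOLD, labels=LABEL_LIST):
    """Keep rows scoring at or above the threshold and attach a label name.

    Assumption: each row is (class_id, score, x1, y1, x2, y2).
    """
    kept = []
    for class_id, score, *box in rows:
        if score >= threshold:
            kept.append(
                {"label": labels[int(class_id)], "score": score, "box": box}
            )
    return kept


def shape_in_range(shape, shapes=IMAGE_SHAPES):
    """Return True if every dimension lies between the min and max shapes."""
    min_shape, _opt_shape, max_shape = shapes
    return len(shape) == len(min_shape) and all(
        lo <= dim <= hi for lo, dim, hi in zip(min_shape, shape, max_shape)
    )
```

For example, `filter_detections([(21, 0.9, 0, 0, 10, 10), (22, 0.3, 1, 1, 2, 2)])` keeps only the first row and labels it `table`. Under the min/opt/max reading, a batch of 4 images at 800×800 falls inside the declared range, while a batch of 16 does not.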