Instructions to use PaddlePaddle/PaddleOCR-VL-1.5 with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- PaddleOCR
How to use PaddlePaddle/PaddleOCR-VL-1.5 with PaddleOCR:
```python
# See https://www.paddleocr.ai/latest/version3.x/pipeline_usage/PaddleOCR-VL.html for installation instructions
from paddleocr import PaddleOCRVL

pipeline = PaddleOCRVL(pipeline_version="v1.5")
output = pipeline.predict("path/to/document_image.png")
for res in output:
    res.print()
    res.save_to_json(save_path="output")
    res.save_to_markdown(save_path="output")
```
- Notebooks
- Google Colab
- Kaggle
Underwhelming performance
#2
by ritheshSree - opened
I tested this OCR model on a number of images using the official demo page. There are a lot of accuracy issues in Markdown generation as well as in layout analysis.
Find more here: https://youtu.be/kaeKL11E80c
Will it get some improvements?
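One way to make accuracy reports like this easier to compare and reproduce is to score the generated Markdown against a hand-checked reference with a character error rate (CER). The sketch below is only an illustration and is not part of PaddleOCR: the function names `levenshtein` and `character_error_rate` are hypothetical, and it assumes you already have both the reference text and the model output as plain strings.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance between a and b, counting single-character
    insertions, deletions, and substitutions."""
    # previous[j] holds the distance between the processed prefix of a and b[:j].
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to the empty string
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # drop ch_a
                current[j - 1] + 1,      # add ch_b
                previous[j - 1] + cost,  # match or substitute
            ))
        previous = current
    return previous[-1]


def character_error_rate(reference: str, hypothesis: str) -> float:
    """Edit distance divided by the reference length (lower is better)."""
    if not reference:
        # Assumption: an empty reference scores 0.0 only for an empty output.
        return 0.0 if not hypothesis else 1.0
    return levenshtein(reference, hypothesis) / len(reference)


if __name__ == "__main__":
    # Hypothetical strings, for illustration only.
    reference = "| Item | Price |"
    hypothesis = "| ltem | Price |"
    print(f"CER: {character_error_rate(reference, hypothesis):.4f}")
```

CER only measures text fidelity, so it would not capture layout problems such as wrong reading order or misdetected regions; those would need a separate check.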