How to use PaddlePaddle/PaddleOCR-VL-1.5 with PaddleOCR:
# See https://www.paddleocr.ai/latest/version3.x/pipeline_usage/PaddleOCR-VL.html for installation
from paddleocr import PaddleOCRVL

pipeline = PaddleOCRVL(pipeline_version="v1.5")
output = pipeline.predict("path/to/document_image.png")
for res in output:
    res.print()
    res.save_to_json(save_path="output")
    res.save_to_markdown(save_path="output")
This model isn't compatible with transformers 5.x
#11
by 5Va2mm - opened
The documentation says:
ensure the transformers v5 is installed
However, modeling_paddleocr_vl.py uses SlidingWindowCache, which was removed in transformers 5.x:
ImportError: cannot import name 'SlidingWindowCache' from 'transformers.cache_utils'
Downgrading to the latest transformers 4.x release works well.
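For anyone who wants to confirm they are hitting the same problem before downgrading, here is a rough sketch of a check. It assumes the missing `SlidingWindowCache` in `transformers.cache_utils` is the only incompatibility, which is all the traceback above shows. The helper names (`installed_transformers_version`, `major_version`, `has_sliding_window_cache`) are made up for illustration and are not part of transformers or PaddleOCR.

```python
import importlib
from importlib import metadata
from typing import Optional


def installed_transformers_version() -> Optional[str]:
    """Return the installed transformers version string, or None if it is not installed."""
    try:
        return metadata.version("transformers")
    except metadata.PackageNotFoundError:
        return None


def major_version(version: str) -> int:
    """Extract the leading major number from a version string such as '4.57.1' or '5.0.0rc1'."""
    return int(version.split(".")[0])


def has_sliding_window_cache() -> bool:
    """Return True if transformers.cache_utils still exposes SlidingWindowCache."""
    try:
        cache_utils = importlib.import_module("transformers.cache_utils")
    except ImportError:
        # transformers (or one of its dependencies) is not importable at all.
        return False
    return hasattr(cache_utils, "SlidingWindowCache")


if __name__ == "__main__":
    version = installed_transformers_version()
    if version is None:
        print("transformers is not installed")
    else:
        print(f"transformers {version} (major {major_version(version)})")
        if has_sliding_window_cache():
            print("SlidingWindowCache is available")
        else:
            print("SlidingWindowCache is missing; the import error above is expected")
```

If the check reports the class as missing, pinning with something like `pip install "transformers<5"` matches the downgrade described above.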
Will the error be fixed? It's a problem that PP-DocLayoutV3 requires transformers>=5.X, but the VLM part requires transformers<5.X ...