---
license: apache-2.0
language:
- zh
- en
- ja
pipeline_tag: image-to-text
base_model:
- PaddlePaddle/PP-OCRv5_server_det
tags:
- PyTorch
- OCR
---
# Weights
- ptocr_v5_server_det.safetensors — detection (server) weights [recommended]
- ptocr_v5_server_rec.safetensors — recognition (server) weights [recommended]
- ptocr_v5_server_det.pth — detection weights (legacy/compat)
- ptocr_v5_server_rec.pth — recognition weights (legacy/compat)
## Configs
- [PP-OCRv5_server_det.yml](https://huggingface.co/JoyCN/PaddleOCR-Pytorch/blob/main/PP-OCRv5_server_det.yml) — detection model config
- [PP-OCRv5_server_rec.yml](https://huggingface.co/JoyCN/PaddleOCR-Pytorch/blob/main/PP-OCRv5_server_rec.yml) — recognition model config
Download and load examples:
Python:
```python
from huggingface_hub import hf_hub_download
import yaml
# Detection config
cfg_det_path = hf_hub_download("JoyCN/PaddleOCR-Pytorch", filename="PP-OCRv5_server_det.yml")
with open(cfg_det_path, "r", encoding="utf-8") as f:
    cfg_det = yaml.safe_load(f)
# Recognition config
cfg_rec_path = hf_hub_download("JoyCN/PaddleOCR-Pytorch", filename="PP-OCRv5_server_rec.yml")
with open(cfg_rec_path, "r", encoding="utf-8") as f:
    cfg_rec = yaml.safe_load(f)
```
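To check what a loaded config contains before passing it on, a small helper can list its top-level sections. This is a minimal sketch: `summarize_config` is a hypothetical helper, and the section names in the demo dict are assumptions for illustration, not values read from the YAML files in this repository.

```python
from typing import Any, Dict, List


def summarize_config(cfg: Dict[str, Any]) -> List[str]:
    """Return one line per top-level section: its name and the keys inside it."""
    lines = []
    for section, value in cfg.items():
        if isinstance(value, dict):
            # Nested section: show only the names of its direct children.
            lines.append(f"{section}: {', '.join(str(k) for k in value)}")
        else:
            # Scalar or list at the top level: show the value itself.
            lines.append(f"{section}: {value!r}")
    return lines


# Hypothetical stand-in for cfg_det / cfg_rec loaded with yaml.safe_load.
demo_cfg = {"Architecture": {"model_type": "det", "algorithm": "DB"}, "Global": None}
for line in summarize_config(demo_cfg):
    print(line)
```

With the real files, call `summarize_config(cfg_det)` or `summarize_config(cfg_rec)` instead of using the demo dict.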
Direct links (downloads of these files are not counted in the download statistics by default):
- https://huggingface.co/JoyCN/PaddleOCR-Pytorch/resolve/main/PP-OCRv5_server_det.yml?download=true
- https://huggingface.co/JoyCN/PaddleOCR-Pytorch/resolve/main/PP-OCRv5_server_rec.yml?download=true
## Download (recommended)
Python:
```python
from huggingface_hub import hf_hub_download
from safetensors.torch import load_file
# Download safetensors
det_path = hf_hub_download("JoyCN/PaddleOCR-Pytorch", filename="ptocr_v5_server_det.safetensors")
rec_path = hf_hub_download("JoyCN/PaddleOCR-Pytorch", filename="ptocr_v5_server_rec.safetensors")
# Load state dicts
sd_det = load_file(det_path, device="cpu")
sd_rec = load_file(rec_path, device="cpu")
# Then load into model instances you have built separately, for example:
# model_det.load_state_dict(sd_det, strict=False)
# model_rec.load_state_dict(sd_rec, strict=False)
```
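Because `strict=False` lets `load_state_dict` skip mismatched keys without raising, it can help to see which keys did not line up. The sketch below compares key names only; `diff_state_dict_keys` is a hypothetical helper and the key names in the demo are made up for illustration.

```python
from typing import Dict, Iterable, List


def diff_state_dict_keys(model_keys: Iterable[str], ckpt_keys: Iterable[str]) -> Dict[str, List[str]]:
    """Compare parameter names expected by a model with those in a checkpoint."""
    model_set, ckpt_set = set(model_keys), set(ckpt_keys)
    return {
        # Expected by the model but absent from the checkpoint.
        "missing": sorted(model_set - ckpt_set),
        # Present in the checkpoint but unknown to the model.
        "unexpected": sorted(ckpt_set - model_set),
    }


# Illustrative key names only; real names depend on the model definition.
report = diff_state_dict_keys(
    model_keys=["backbone.conv1.weight", "head.fc.weight"],
    ckpt_keys=["backbone.conv1.weight", "head.fc.bias"],
)
print(report)
```

With real objects this would be called as `diff_state_dict_keys(model_det.state_dict().keys(), sd_det.keys())`. The value returned by `load_state_dict` also exposes `missing_keys` and `unexpected_keys`, which carry the same information after loading.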
Direct links:
- https://huggingface.co/JoyCN/PaddleOCR-Pytorch/resolve/main/ptocr_v5_server_det.safetensors?download=true
- https://huggingface.co/JoyCN/PaddleOCR-Pytorch/resolve/main/ptocr_v5_server_rec.safetensors?download=true
Legacy (.pth) loading:
```python
import torch
sd_det = torch.load("ptocr_v5_server_det.pth", map_location="cpu", weights_only=True)
# Some checkpoints wrap the weights under a "state_dict" key; unwrap if so.
if isinstance(sd_det, dict) and "state_dict" in sd_det:
    sd_det = sd_det["state_dict"]
sd_rec = torch.load("ptocr_v5_server_rec.pth", map_location="cpu", weights_only=True)
if isinstance(sd_rec, dict) and "state_dict" in sd_rec:
    sd_rec = sd_rec["state_dict"]
# model_det.load_state_dict(sd_det, strict=False)
# model_rec.load_state_dict(sd_rec, strict=False)
```
## System Inference (predict_system.py)
Quick end-to-end OCR with PaddleOCR2Pytorch's system script.
1) Clone code and install minimal deps (CPU example)
```bash
git clone https://github.com/frotms/PaddleOCR2Pytorch.git
cd PaddleOCR2Pytorch
pip install -U torch opencv-python pillow pyyaml safetensors
```
2) Download weights and YAML (put in current folder)
```bash
# Weights (recommended)
curl -L -o ptocr_v5_server_det.safetensors "https://huggingface.co/JoyCN/PaddleOCR-Pytorch/resolve/main/ptocr_v5_server_det.safetensors?download=true"
curl -L -o ptocr_v5_server_rec.safetensors "https://huggingface.co/JoyCN/PaddleOCR-Pytorch/resolve/main/ptocr_v5_server_rec.safetensors?download=true"
# Configs (YAML)
curl -L -o PP-OCRv5_server_det.yml "https://huggingface.co/JoyCN/PaddleOCR-Pytorch/resolve/main/PP-OCRv5_server_det.yml?download=true"
curl -L -o PP-OCRv5_server_rec.yml "https://huggingface.co/JoyCN/PaddleOCR-Pytorch/resolve/main/PP-OCRv5_server_rec.yml?download=true"
```
3) Run end-to-end detection + recognition
```bash
python tools/infer/predict_system.py \
  --use_gpu False \
  --image_dir path/to/your_image.png \
  --det_algorithm DB \
  --det_yaml_path ./PP-OCRv5_server_det.yml \
  --rec_yaml_path ./PP-OCRv5_server_rec.yml \
  --det_model_path ./ptocr_v5_server_det.safetensors \
  --rec_model_path ./ptocr_v5_server_rec.safetensors \
  --rec_char_dict_path ./pytorchocr/utils/dict/ppocrv5_dict.txt \
  --rec_algorithm SVTR \
  --rec_image_shape "3,48,320" \
  --draw_img_save_dir ./inference_results
```
- Set `--use_gpu True` if you have a CUDA-ready environment.
- `--rec_image_shape "3,48,320"` sets the recognition input shape (channels, height, width); keep this value for PP-OCRv5 recognition.
- Outputs: detection boxes, recognized text with scores, and a visualization image saved under `--draw_img_save_dir`.
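To run the same command over several images from Python, the argument list can be assembled programmatically. This is a sketch under assumptions: `build_predict_command` is a hypothetical helper, it reuses only the flags shown in the command above, and it expects to be run from the root of the PaddleOCR2Pytorch checkout with the weights and YAML files in the current folder.

```python
import shlex
from typing import List


def build_predict_command(image_path: str, use_gpu: bool = False,
                          out_dir: str = "./inference_results") -> List[str]:
    """Assemble the predict_system.py invocation as an argument list."""
    return [
        "python", "tools/infer/predict_system.py",
        "--use_gpu", str(use_gpu),  # rendered as "True" or "False"
        "--image_dir", image_path,
        "--det_algorithm", "DB",
        "--det_yaml_path", "./PP-OCRv5_server_det.yml",
        "--rec_yaml_path", "./PP-OCRv5_server_rec.yml",
        "--det_model_path", "./ptocr_v5_server_det.safetensors",
        "--rec_model_path", "./ptocr_v5_server_rec.safetensors",
        "--rec_char_dict_path", "./pytorchocr/utils/dict/ppocrv5_dict.txt",
        "--rec_algorithm", "SVTR",
        "--rec_image_shape", "3,48,320",
        "--draw_img_save_dir", out_dir,
    ]


cmd = build_predict_command("path/to/your_image.png")
print(shlex.join(cmd))  # shell-quoted form, handy for logging
```

To execute it, pass the list to `subprocess.run(cmd, check=True)`. Using a list rather than a single string avoids quoting problems with paths that contain spaces.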
## Notes
- Prefer the `huggingface_hub` API (`hf_hub_download`/`snapshot_download`) for reliable downloads and caching.
- If needed, install safetensors: `pip install safetensors`.
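For fetching the weights and configs in one call, `snapshot_download` with `allow_patterns` is one option. The sketch below is an assumption-laden example rather than a required workflow: `wanted` and `download_repo_subset` are hypothetical helpers, and `wanted` only previews which file names the glob patterns would select.

```python
from fnmatch import fnmatch
from typing import List

REPO_ID = "JoyCN/PaddleOCR-Pytorch"
# Restrict the snapshot to the recommended weights and the YAML configs.
ALLOW_PATTERNS = ["*.safetensors", "*.yml"]


def wanted(filename: str, patterns: List[str] = ALLOW_PATTERNS) -> bool:
    """Preview whether a file name matches any of the glob patterns."""
    return any(fnmatch(filename, p) for p in patterns)


def download_repo_subset() -> str:
    """Download matching files and return the local snapshot folder."""
    # Imported lazily so the preview helper works without huggingface_hub.
    from huggingface_hub import snapshot_download
    return snapshot_download(REPO_ID, allow_patterns=ALLOW_PATTERNS)


for name in ["ptocr_v5_server_det.safetensors", "PP-OCRv5_server_rec.yml",
             "ptocr_v5_server_det.pth"]:
    print(name, wanted(name))
```

Calling `download_repo_subset()` needs network access. The legacy `.pth` files are skipped by these patterns.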
## Compatibility & Attribution
- Example inference uses the open-source PaddleOCR2Pytorch project (Apache-2.0): https://github.com/frotms/PaddleOCR2Pytorch.
- This repository is not affiliated with the PaddleOCR2Pytorch maintainers; please follow their license for code usage.
## License
- This repository (weights and model card) is released under Apache-2.0.
- The referenced PaddleOCR2Pytorch codebase is also Apache-2.0.
## Disclaimer
- Provided as-is, without warranties. Evaluate and validate for your use case.