	---
	license: apache-2.0
	language:
	- zh
	- en
	- ja
	pipeline_tag: image-to-text
	base_model:
	- PaddlePaddle/PP-OCRv5_server_det
	tags:
	- PyTorch
	- OCR
	---

	# Weights

	- ptocr_v5_server_det.safetensors — detection (server) weights [recommended]
	- ptocr_v5_server_rec.safetensors — recognition (server) weights [recommended]
	- ptocr_v5_server_det.pth — detection weights (legacy/compat)
	- ptocr_v5_server_rec.pth — recognition weights (legacy/compat)

	## Configs

	- [PP-OCRv5_server_det.yml](https://huggingface.co/JoyCN/PaddleOCR-Pytorch/blob/main/PP-OCRv5_server_det.yml) — detection model config
	- [PP-OCRv5_server_rec.yml](https://huggingface.co/JoyCN/PaddleOCR-Pytorch/blob/main/PP-OCRv5_server_rec.yml) — recognition model config

	Download and load examples:

	Python:

	```python
	from huggingface_hub import hf_hub_download
	import yaml

	# Detection config
	cfg_det_path = hf_hub_download("JoyCN/PaddleOCR-Pytorch", filename="PP-OCRv5_server_det.yml")
	with open(cfg_det_path, "r", encoding="utf-8") as f:
	    cfg_det = yaml.safe_load(f)

	# Recognition config
	cfg_rec_path = hf_hub_download("JoyCN/PaddleOCR-Pytorch", filename="PP-OCRv5_server_rec.yml")
	with open(cfg_rec_path, "r", encoding="utf-8") as f:
	    cfg_rec = yaml.safe_load(f)
	```
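
To check what a loaded config contains before wiring it into a model, a small helper can list its nested keys. This is an illustrative sketch only: `summarize_config` is a hypothetical helper, and `sample_cfg` is placeholder data standing in for `cfg_det`; its keys are invented and do not describe the actual YAML files.

```python
def summarize_config(cfg, max_depth=2, _depth=0):
    """Return indented 'key: type' lines for the nested keys of a config dict."""
    lines = []
    if not isinstance(cfg, dict) or _depth >= max_depth:
        return lines
    for key, value in cfg.items():
        lines.append(f"{'  ' * _depth}{key}: {type(value).__name__}")
        # Recurse into nested dicts until max_depth is reached.
        lines.extend(summarize_config(value, max_depth, _depth + 1))
    return lines


# Placeholder data (hypothetical keys); replace with cfg_det or cfg_rec.
sample_cfg = {
    "model": {"type": "det", "backbone": {"name": "example"}},
    "runtime": {"use_gpu": False},
}

summary = summarize_config(sample_cfg)
print("\n".join(summary))
```

Calling `summarize_config(cfg_det)` on the real config prints the same kind of outline for whatever sections the file defines.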

	Direct links (downloads through these URLs are not counted in the Hub's download statistics by default):

	- https://huggingface.co/JoyCN/PaddleOCR-Pytorch/resolve/main/PP-OCRv5_server_det.yml?download=true
	- https://huggingface.co/JoyCN/PaddleOCR-Pytorch/resolve/main/PP-OCRv5_server_rec.yml?download=true

	## Download (recommended)

	Python:

	```python
	from huggingface_hub import hf_hub_download
	from safetensors.torch import load_file

	# Download safetensors
	det_path = hf_hub_download("JoyCN/PaddleOCR-Pytorch", filename="ptocr_v5_server_det.safetensors")
	rec_path = hf_hub_download("JoyCN/PaddleOCR-Pytorch", filename="ptocr_v5_server_rec.safetensors")

	# Load state dicts
	sd_det = load_file(det_path, device="cpu")
	sd_rec = load_file(rec_path, device="cpu")

	# Then load them into your model instances
	# (strict=False tolerates missing or unexpected keys):
	# model_det.load_state_dict(sd_det, strict=False)
	# model_rec.load_state_dict(sd_rec, strict=False)
	```
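
Because `strict=False` does not raise on missing or unexpected keys, a mismatch between checkpoint and model can pass unnoticed. A minimal sketch of a pre-load check follows. `diff_state_dict_keys` is a hypothetical helper, and the key names in the usage lines are made up for illustration. In PyTorch, `load_state_dict` also returns an object with `missing_keys` and `unexpected_keys` fields that carries the same information after the fact.

```python
def diff_state_dict_keys(model_keys, checkpoint_keys):
    """Compare parameter names expected by a model with those in a checkpoint.

    Returns (missing, unexpected):
      missing    - names the model expects but the checkpoint lacks
      unexpected - names in the checkpoint that the model does not use
    """
    model_keys = set(model_keys)
    checkpoint_keys = set(checkpoint_keys)
    missing = sorted(model_keys - checkpoint_keys)
    unexpected = sorted(checkpoint_keys - model_keys)
    return missing, unexpected


# Illustrative names only; real ones come from model.state_dict() and sd_det.
missing, unexpected = diff_state_dict_keys(
    model_keys=["backbone.conv.weight", "head.fc.weight", "head.fc.bias"],
    checkpoint_keys=["backbone.conv.weight", "head.fc.weight", "neck.extra.weight"],
)
print("missing:", missing)
print("unexpected:", unexpected)
```

With real objects the call would be `diff_state_dict_keys(model_det.state_dict().keys(), sd_det.keys())`, where `model_det` is whatever detection model you have built.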

	Direct links:

	- https://huggingface.co/JoyCN/PaddleOCR-Pytorch/resolve/main/ptocr_v5_server_det.safetensors?download=true
	- https://huggingface.co/JoyCN/PaddleOCR-Pytorch/resolve/main/ptocr_v5_server_rec.safetensors?download=true
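
The direct links in this README share one pattern: repository, `resolve`, revision, then file name. The sketch below rebuilds them from that pattern. `resolve_url` is a hypothetical helper written for this example; the `huggingface_hub` package provides `hf_hub_url` for the same purpose.

```python
REPO_ID = "JoyCN/PaddleOCR-Pytorch"

FILES = [
    "ptocr_v5_server_det.safetensors",
    "ptocr_v5_server_rec.safetensors",
    "PP-OCRv5_server_det.yml",
    "PP-OCRv5_server_rec.yml",
]


def resolve_url(filename, repo_id=REPO_ID, revision="main"):
    """Build a direct-download URL following the pattern used in this README."""
    return f"https://huggingface.co/{repo_id}/resolve/{revision}/{filename}?download=true"


urls = {name: resolve_url(name) for name in FILES}
for name, url in urls.items():
    print(name, "->", url)
```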

	Legacy (.pth) loading:

	```python
	import torch

	# Assumes both .pth files are already in the current directory.
	# weights_only=True restricts unpickling to tensors and plain containers.
	sd_det = torch.load("ptocr_v5_server_det.pth", map_location="cpu", weights_only=True)
	if isinstance(sd_det, dict) and "state_dict" in sd_det:
	    sd_det = sd_det["state_dict"]  # unwrap checkpoints saved as {"state_dict": ...}

	sd_rec = torch.load("ptocr_v5_server_rec.pth", map_location="cpu", weights_only=True)
	if isinstance(sd_rec, dict) and "state_dict" in sd_rec:
	    sd_rec = sd_rec["state_dict"]

	# model_det.load_state_dict(sd_det, strict=False)
	# model_rec.load_state_dict(sd_rec, strict=False)
	```


	## System Inference (predict_system.py)

	Quick end-to-end OCR with PaddleOCR2Pytorch's system script.

	1) Clone code and install minimal deps (CPU example)

	```bash
	git clone https://github.com/frotms/PaddleOCR2Pytorch.git
	cd PaddleOCR2Pytorch
	pip install -U torch opencv-python pillow pyyaml safetensors
	```

	2) Download weights and YAML (put in current folder)

	```bash
	# Weights (recommended)
	curl -L -o ptocr_v5_server_det.safetensors "https://huggingface.co/JoyCN/PaddleOCR-Pytorch/resolve/main/ptocr_v5_server_det.safetensors?download=true"
	curl -L -o ptocr_v5_server_rec.safetensors "https://huggingface.co/JoyCN/PaddleOCR-Pytorch/resolve/main/ptocr_v5_server_rec.safetensors?download=true"

	# Configs (YAML)
	curl -L -o PP-OCRv5_server_det.yml "https://huggingface.co/JoyCN/PaddleOCR-Pytorch/resolve/main/PP-OCRv5_server_det.yml?download=true"
	curl -L -o PP-OCRv5_server_rec.yml "https://huggingface.co/JoyCN/PaddleOCR-Pytorch/resolve/main/PP-OCRv5_server_rec.yml?download=true"
	```

	3) Run end-to-end detection + recognition

	```bash
	python tools/infer/predict_system.py --use_gpu False --image_dir path/to/your_image.png --det_algorithm DB --det_yaml_path ./PP-OCRv5_server_det.yml --rec_yaml_path ./PP-OCRv5_server_rec.yml --det_model_path ./ptocr_v5_server_det.safetensors --rec_model_path ./ptocr_v5_server_rec.safetensors --rec_char_dict_path ./pytorchocr/utils/dict/ppocrv5_dict.txt --rec_algorithm SVTR --rec_image_shape "3,48,320" --draw_img_save_dir ./inference_results
	```

	- Set `--use_gpu True` if you have a CUDA-ready environment.
	- Keep `--rec_image_shape "3,48,320"` (channels, height, width); PP-OCRv5 recognition expects this input shape.
	- Outputs: detection boxes, recognized text with scores, and a visualization image saved under `--draw_img_save_dir`.
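
When OCR is run over many images or from another program, the command in step 3 can be assembled in Python instead of typed by hand. The sketch below uses only flags that appear in that command. `build_ocr_command` and `run_ocr` are hypothetical helpers, and the sketch assumes the working directory is the PaddleOCR2Pytorch checkout with the weights and YAML files downloaded into it.

```python
import subprocess
import sys


def build_ocr_command(image_path, use_gpu=False, out_dir="./inference_results"):
    """Build the argument list for tools/infer/predict_system.py."""
    return [
        sys.executable, "tools/infer/predict_system.py",
        "--use_gpu", str(use_gpu),  # rendered as "True" or "False"
        "--image_dir", image_path,
        "--det_algorithm", "DB",
        "--det_yaml_path", "./PP-OCRv5_server_det.yml",
        "--rec_yaml_path", "./PP-OCRv5_server_rec.yml",
        "--det_model_path", "./ptocr_v5_server_det.safetensors",
        "--rec_model_path", "./ptocr_v5_server_rec.safetensors",
        "--rec_char_dict_path", "./pytorchocr/utils/dict/ppocrv5_dict.txt",
        "--rec_algorithm", "SVTR",
        "--rec_image_shape", "3,48,320",
        "--draw_img_save_dir", out_dir,
    ]


def run_ocr(image_path, **kwargs):
    """Run the system script and raise if it exits with an error."""
    subprocess.run(build_ocr_command(image_path, **kwargs), check=True)


cmd = build_ocr_command("path/to/your_image.png")
print(" ".join(cmd))
```

Passing the arguments as a list avoids shell quoting issues, which is why the shape string needs no extra quotes here.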


	## Notes

	- Prefer the `huggingface_hub` API (`hf_hub_download`/`snapshot_download`) for reliable downloads and caching.
	- If needed, install safetensors: `pip install safetensors`.
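
To fetch weights and configs in one call, `snapshot_download` accepts `allow_patterns` to restrict which files are pulled. The following is a sketch rather than a prescribed method: `fetch_ocr_files`, `is_selected`, and the `local_dir` default are hypothetical, and the patterns assume you want only the safetensors weights and YAML configs.

```python
from fnmatch import fnmatch

REPO_ID = "JoyCN/PaddleOCR-Pytorch"
# Only safetensors weights and YAML configs; the legacy .pth files are skipped.
ALLOW_PATTERNS = ["*.safetensors", "*.yml"]


def fetch_ocr_files(local_dir="./ptocr_v5"):
    """Download the matching files and return the local folder path."""
    # Imported lazily so the rest of this module works without huggingface_hub.
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id=REPO_ID,
        allow_patterns=ALLOW_PATTERNS,
        local_dir=local_dir,
    )


def is_selected(filename):
    """Offline approximation of which file names the patterns would match."""
    return any(fnmatch(filename, pattern) for pattern in ALLOW_PATTERNS)


for name in ["ptocr_v5_server_det.safetensors", "ptocr_v5_server_det.pth"]:
    print(name, "selected" if is_selected(name) else "skipped")
```

Calling `fetch_ocr_files()` performs the actual download; nothing is fetched until then.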

	## Compatibility & Attribution

	- Example inference uses the open-source PaddleOCR2Pytorch project (Apache-2.0): https://github.com/frotms/PaddleOCR2Pytorch.
	- This repository is not affiliated with the PaddleOCR2Pytorch maintainers; please follow their license for code usage.

	## License

	- This repository (weights and model card) is released under Apache-2.0.
	- The referenced PaddleOCR2Pytorch codebase is also Apache-2.0.

	## Disclaimer

	- Provided as-is, without warranties. Evaluate and validate for your use case.