NexaAI/paddleocr-npu-mobile

nexaml committed on Nov 6, 2025

Commit eb304df · verified · 1 Parent(s): 0d248d6

Create README.md

Files changed (1)

README.md +46 -0

README.md ADDED

	@@ -0,0 +1,46 @@

+---
+pipeline_tag: image-to-text
+tags:
+- NPU
+---
+# PaddleOCR v4 (PP-OCRv4) for Android
+## Quickstart
+See [Documentation](https://docs.nexa.ai/nexa-sdk-android/quickstart)
+## Model Description
+**PP-OCRv4** is the fourth-generation end-to-end optical character recognition system from the PaddlePaddle team.
+It combines a lightweight **text detection → angle classification → text recognition** pipeline with improved training techniques and data augmentation, delivering higher accuracy and robustness while staying efficient for real-time use.
+PP-OCRv4 supports multilingual OCR (Latin and non-Latin scripts), irregular layouts (rotated/curved text), and challenging inputs such as noisy or low-resolution images often found in mobile and document-scan scenarios.
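
The three-stage flow described above (detection, then angle classification, then recognition) can be sketched as a simple chain of callables. This is a minimal illustration only, not the PaddleOCR or Nexa SDK API: `run_ocr`, `OcrResult`, and the `fake_*` stage functions are hypothetical names, and the stub stages return hard-coded values in place of real models.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom) in pixels


@dataclass
class OcrResult:
    box: Box
    text: str
    rotated: bool  # True if the angle classifier flagged the crop as rotated


def run_ocr(
    image: object,
    detect: Callable[[object], Sequence[Box]],
    is_rotated: Callable[[object, Box], bool],
    recognize: Callable[[object, Box, bool], str],
    use_angle_cls: bool = True,
) -> List[OcrResult]:
    """Chain the stages: detection, optional angle classification, recognition."""
    results = []
    for box in detect(image):
        # The angle classifier is optional; skip it when disabled.
        rotated = is_rotated(image, box) if use_angle_cls else False
        text = recognize(image, box, rotated)
        results.append(OcrResult(box=box, text=text, rotated=rotated))
    return results


# Stub stages standing in for real models (hypothetical, for illustration only).
def fake_detect(image):
    return [(0, 0, 50, 10), (0, 20, 80, 30)]


def fake_is_rotated(image, box):
    return box[1] >= 20  # pretend the second line is upside down


def fake_recognize(image, box, rotated):
    # A real recognizer would rotate the crop first when `rotated` is True.
    return "HELLO" if box[1] == 0 else "WORLD"


results = run_ocr(None, fake_detect, fake_is_rotated, fake_recognize)
for r in results:
    print(r.box, r.text, r.rotated)
```

Passing the stages in as callables mirrors the idea that the detector and recognizer are separate, replaceable components; setting `use_angle_cls=False` skips the middle stage entirely.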
+## Features
+- **End-to-end OCR**: text detection, optional angle classification, and text recognition in one pipeline.
+- **Multilingual support**: pretrained models for English, Chinese, and dozens of other languages; can be fine-tuned on domain-specific text.
+- **Robust in real-world conditions**: handles rotation, perspective distortion, blur, low light, and complex backgrounds.
+- **Lightweight & fast**: practical for both mobile apps and large-scale server deployments.
+- **Flexible I/O**: works with photos, scans, screenshots, receipts, invoices, ID cards, dashboards, and UI text.
+- **Extensible**: swap components (detector/recognizer), add language packs, or fine-tune on domain datasets.
+## Use Cases
+- Document digitization (invoices, receipts, forms, contracts)
+- RPA and back-office automation (reading text from screen captures)
+- Mobile scanning apps and camera-based translation/read-aloud
+- Industrial and retail analytics (labels, price tags, shelf tags)
+- Accessibility (screen readers and read-aloud applications)
+## Inputs and Outputs
+**Input**: Image (photo, scan, or screenshot).
+**Output**: A list of detected text regions, each with:
+- bounding box (rectangular or polygonal)
+- recognized text string
+- optional confidence score and orientation
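
One way to model the output described above is a small record type per text region, with the confidence score and orientation left optional. The sketch below is an assumption about shape only, not the SDK's actual return type: `TextRegion`, `top_left`, `clean`, the 0.5 threshold, and the sample values are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Point = Tuple[float, float]  # (x, y) in pixels


@dataclass
class TextRegion:
    box: List[Point]                    # polygon vertices; a rectangle is the 4-point case
    text: str
    confidence: Optional[float] = None  # optional, assumed to lie in [0, 1] when present
    orientation: Optional[int] = None   # optional, assumed to be degrees (e.g. 0 or 180)


def top_left(region: TextRegion) -> Tuple[float, float]:
    """Return (min y, min x) of the polygon, used as a simple reading-order key."""
    ys = [p[1] for p in region.box]
    xs = [p[0] for p in region.box]
    return (min(ys), min(xs))


def clean(regions: List[TextRegion], min_conf: float = 0.5) -> List[TextRegion]:
    """Drop low-confidence regions, then sort top-to-bottom, left-to-right."""
    kept = [r for r in regions if r.confidence is None or r.confidence >= min_conf]
    return sorted(kept, key=top_left)


# Made-up sample data for illustration.
regions = [
    TextRegion([(10, 40), (90, 40), (90, 55), (10, 55)], "Total: 12.50", 0.97, 0),
    TextRegion([(10, 10), (70, 10), (70, 25), (10, 25)], "RECEIPT", 0.99, 0),
    TextRegion([(5, 70), (30, 70), (30, 80), (5, 80)], "~~", 0.21),
]
lines = [r.text for r in clean(regions)]
print(lines)  # ['RECEIPT', 'Total: 12.50']
```

Regions without a confidence score are kept rather than dropped, since the score is optional in the output.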
+## License
+- Licensed under [Apache-2.0](https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.7/LICENSE)
+## References
+- GitHub repo: [https://github.com/PaddlePaddle/PaddleOCR](https://github.com/PaddlePaddle/PaddleOCR)
+- Model zoo & documentation: [Models list](https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.7/doc/doc_en/models_list_en.md)