---
pipeline_tag: image-to-text
tags:
- NPU
---

# PaddleOCR v4 (PP-OCRv4) for Android

## Quickstart

See [Documentation](https://docs.nexa.ai/nexa-sdk-android/quickstart)

## Model Description

**PP-OCRv4** is the fourth-generation end-to-end optical character recognition system from the PaddlePaddle team.
It combines a lightweight **text detection → angle classification → text recognition** pipeline with improved training techniques and data augmentation, delivering higher accuracy and robustness while staying efficient for real-time use.

PP-OCRv4 supports multilingual OCR (Latin and non-Latin scripts), irregular layouts (rotated/curved text), and challenging inputs such as noisy or low-resolution images often found in mobile and document-scan scenarios.

## Features

- **End-to-end OCR**: text detection, optional angle classification, and text recognition in one pipeline.
- **Multilingual support**: pretrained models for English, Chinese, and dozens of other languages; easy finetuning for domain text.
- **Robust in real-world conditions**: handles rotation, perspective distortion, blur, low light, and complex backgrounds.
- **Lightweight & fast**: practical for both mobile apps and large-scale server deployments.
- **Flexible I/O**: works with photos, scans, screenshots, receipts, invoices, ID cards, dashboards, and UI text.
- **Extensible**: swap components (detector/recognizer), add language packs, or finetune on domain datasets.

## Use Cases

- Document digitization (invoices, receipts, forms, contracts)
- RPA and back-office automation (screen/OCR flows)
- Mobile scanning apps and camera-based translation/read-aloud
- Industrial and retail analytics (labels, price tags, shelf tags)
- Accessibility (screen-readers and read-aloud applications)

## Inputs and Outputs

**Input**: Image (photo, scan, or screenshot).

**Output**: A list of detected text regions, each with:

- bounding box (rectangular or polygonal)
- recognized text string
- optional confidence score and orientation

## License

This model is released under the **Creative Commons Attribution–NonCommercial 4.0 (CC BY-NC 4.0)** license.
Non-commercial use, modification, and redistribution are permitted with attribution.
For commercial licensing, please contact **dev@nexa.ai**.

## References

- GitHub repo: [https://github.com/PaddlePaddle/PaddleOCR](https://github.com/PaddlePaddle/PaddleOCR)
- Model zoo & documentation: [Models list](https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.7/doc/doc_en/models_list_en.md)
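
## Example: Handling OCR Output

The output described under Inputs and Outputs (a list of regions, each with a bounding box, a text string, and a confidence score) could be post-processed roughly as follows. This is a minimal sketch, not the Nexa SDK API: the `OcrRegions` and `TextRegion` classes, their field names, and the `filterAndSort` helper are all hypothetical, and the sketch assumes every region carries a confidence in [0, 1]. It drops low-confidence regions and orders the rest top-to-bottom, then left-to-right, which is only a naive approximation of reading order.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical container for OCR results; not part of any SDK.
class OcrRegions {

    // One detected text region: polygon corners, recognized text, confidence.
    static final class TextRegion {
        final float[][] polygon;  // corner points as {x, y} pairs, in pixels
        final String text;        // recognized string
        final float confidence;   // assumed to lie in [0, 1]

        TextRegion(float[][] polygon, String text, float confidence) {
            this.polygon = polygon;
            this.text = text;
            this.confidence = confidence;
        }

        // Smallest y coordinate among the corners (top edge of the region).
        double top() {
            double min = Double.POSITIVE_INFINITY;
            for (float[] p : polygon) min = Math.min(min, p[1]);
            return min;
        }

        // Smallest x coordinate among the corners (left edge of the region).
        double left() {
            double min = Double.POSITIVE_INFINITY;
            for (float[] p : polygon) min = Math.min(min, p[0]);
            return min;
        }
    }

    // Keeps regions with confidence >= minConfidence, sorted by top edge,
    // with ties broken by left edge. Returns a new list; the input is untouched.
    static List<TextRegion> filterAndSort(List<TextRegion> regions, float minConfidence) {
        List<TextRegion> kept = new ArrayList<>();
        for (TextRegion r : regions) {
            if (r.confidence >= minConfidence) kept.add(r);
        }
        kept.sort((a, b) -> {
            int byTop = Double.compare(a.top(), b.top());
            return byTop != 0 ? byTop : Double.compare(a.left(), b.left());
        });
        return kept;
    }

    public static void main(String[] args) {
        // Made-up sample regions for illustration only.
        List<TextRegion> regions = new ArrayList<>();
        regions.add(new TextRegion(
                new float[][] {{10, 60}, {120, 60}, {120, 80}, {10, 80}}, "Total: 42.00", 0.97f));
        regions.add(new TextRegion(
                new float[][] {{10, 10}, {90, 10}, {90, 30}, {10, 30}}, "RECEIPT", 0.99f));
        regions.add(new TextRegion(
                new float[][] {{200, 10}, {230, 10}, {230, 30}, {200, 30}}, "~#", 0.21f));

        for (TextRegion r : filterAndSort(regions, 0.5f)) {
            System.out.println(r.text);
        }
    }
}
```

Storing the box as a polygon covers both the rectangular and the polygonal case, since an axis-aligned rectangle is just a four-point polygon. For multi-column or rotated layouts, a sort by top edge alone will interleave lines, so a real application would likely need a layout-aware ordering instead.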