NCUTNLP/CrossLing-OCR-Mini

Safetensors

GOT

custom_code


NCUTNLP committed on Jan 1

Commit

b1ef37e

verified ·

1 Parent(s): a81e021

Update README.md


README.md +103 -60


@@ -1,51 +1,50 @@
 # CrossLing-OCR-Mini
-🚀 **CrossLing-OCR-Mini** is a lightweight yet powerful OCR model designed for **low-resource multilingual and complex-layout document scenarios**.
-The model focuses on accurate text recognition while preserving original document structure, making it suitable for multilingual document understanding research.
 ---
-## 🔍 Model Overview
-CrossLing-OCR-Mini is optimized for **low-resource and structurally complex languages**, achieving strong performance across **11 languages** while remaining deployable on **consumer-grade hardware**.
-**Key features:**
-- Accurate text recognition with layout/format preservation
-- Optimized for low-resource scripts
-- Lightweight (~580MB) and easy to deploy
-- Designed for research and benchmarking purposes
-### Supported & Optimized Languages
-- High-resource: Chinese, English
-- Low-resource (specially optimized):
   **Tibetan, Mongolian, Kazakh, Kyrgyz, Zhuang**
-Experimental results show that CrossLing-OCR-Mini **outperforms or matches mainstream OCR systems** on multiple low-resource languages.
-## 🚀 Usage / Inference
-You can easily perform inference with CrossLing-OCR-Mini using the 🤗 Transformers library.
-The following example demonstrates a simple OCR inference pipeline on a single image.
-🔧 Requirements
-Python ≥ 3.8
-transformers (latest recommended)
-CUDA-enabled GPU (recommended for better performance)
-```
 pip install -U transformers accelerate
-```
-## 🧪 Simple OCR Inference Example
-```
 from transformers import AutoModel, AutoTokenizer
-import os
-# Path or Hugging Face model id
 model_id = "NCUTNLP/CrossLing-OCR-Mini"
 # Load tokenizer and model
@@ -65,7 +64,7 @@ model = AutoModel.from_pretrained(
 model = model.eval().cuda()
-# Input image for OCR
 image_file = "test.png"
 # Perform plain text OCR
@@ -79,67 +78,94 @@ print("Predicted OCR result:\n")
 print(result)
 ```
 ---
-## 🧪 Performance Notes & Limitations
-While CrossLing-OCR-Mini achieves strong overall performance, we note that:
-- **Mongolian and Uyghur** OCR accuracy still has room for improvement
-- Performance may degrade in extremely noisy, handwritten, or out-of-distribution scenarios
-These limitations will be addressed in future iterations of the model.
 ---
-## 📦 Model Variants
-| Version | Purpose | Availability |
-|------|------|------|
-| **CrossLing-OCR-Mini** | Research & academic use | ✅ Open-sourced |
 | **CrossLing-OCR-Pro-Preview** | Commercial / production use | Contact required |
-📩 For access to **CrossLing-OCR-Pro-Preview**, please contact:
-**zhumx@ncut.edu.cn**, The performance differences between the two different versions of the model are shown in the following figure.
-![Mini_Pro-Preview](https://cdn-uploads.huggingface.co/production/uploads/6956446a7ebeda1aa80be895/EcKEhwz-6VzPCmHqszIJy.png)
 ---
-## 🎯 Intended Use
-**This model is intended solely for:**
-- Academic research
-- Scientific experimentation
-- Benchmarking and method comparison
-- Low-resource language OCR studies
 ---
-## 🚫 Prohibited Use & Disclaimer
 This model **must not be used** for:
-- Any illegal or unlawful activities
-- Any applications that violate social ethics, public order, or applicable laws
-- Surveillance, discrimination, or harmful decision-making systems
-⚠️ **Disclaimer**:
-- Any misuse of this model is **strictly the responsibility of the user**
-- The authors and maintainers **do not endorse** and are **not liable for** any consequences arising from improper or malicious use
-- Views or actions enabled by this model **do not reflect the opinions of the authors**
 ---
-## ⚖️ License
-This model is released **for research purposes only**.
 Commercial use is **not permitted** without explicit authorization.
-(Please contact the authors for commercial licensing or extended usage.)
 ---
-## 📖 Citation
 If you use CrossLing-OCR-Mini in your research, please cite:
@@ -150,3 +176,20 @@ If you use CrossLing-OCR-Mini in your research, please cite:
   year      = {2025},
   note      = {Research-only OCR model}
 }

 # CrossLing-OCR-Mini
+🚀 **CrossLing-OCR-Mini** is a lightweight OCR model designed for **low-resource languages and complex document layouts**.
+The model emphasizes accurate text recognition while preserving original document structure, making it particularly suitable for **multilingual OCR research and academic benchmarking**.
 ---
+## 1. Model Overview
+CrossLing-OCR-Mini targets OCR scenarios involving **low-resource scripts, diverse writing directions, and complex layouts**.
+Despite its compact size (~580MB), the model demonstrates strong recognition performance across **11 languages**, while remaining deployable on **consumer-grade GPUs**.
+### Key Features
+- Multilingual OCR with structure-aware text recognition
+- Specialized optimization for low-resource and complex scripts
+- Lightweight (~580MB) and efficient inference
+- Designed exclusively for research and academic benchmarking
+### Supported Languages
+- **High-resource languages**: Chinese, English
+- **Low-resource languages (specially optimized)**:
   **Tibetan, Mongolian, Kazakh, Kyrgyz, Zhuang**
+Experimental results indicate that CrossLing-OCR-Mini **outperforms or matches mainstream OCR systems** on multiple low-resource languages.
+---
+## 2. Usage / Inference
+CrossLing-OCR-Mini can be used directly with the 🤗 **Transformers** library.
+The following example demonstrates **single-image OCR inference** for plain text recognition.
+### Requirements
+- Python ≥ 3.8
+- `transformers` (latest version recommended)
+- CUDA-enabled GPU (recommended for optimal performance)
+```bash
 pip install -U transformers accelerate
+```
+### Simple OCR Inference Example
+```python
 from transformers import AutoModel, AutoTokenizer
+# Hugging Face model id
 model_id = "NCUTNLP/CrossLing-OCR-Mini"
 # Load tokenizer and model
 model = model.eval().cuda()
+# Input image
 image_file = "test.png"
 # Perform plain text OCR
 print(result)
 ```
+### Notes
+* `ocr_type="ocr"` enables plain text OCR mode
+* The model automatically handles multilingual text recognition
+* For best results, input images should be clear and upright
+* Consumer-grade GPUs (e.g., RTX 3060 / 3090) are sufficient for inference
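The diff view elides the model-loading and inference lines of the example. The sketch below shows one plausible way the full sequence fits together, assuming the repository's custom code follows the upstream GOT-OCR2.0 interface. The `run_plain_ocr` helper is hypothetical, and both `trust_remote_code=True` and the `chat(...)` call are assumptions not confirmed by this README; check the model repository before relying on them.

```python
from pathlib import Path

MODEL_ID = "NCUTNLP/CrossLing-OCR-Mini"


def run_plain_ocr(image_file: str, model_id: str = MODEL_ID) -> str:
    """Run plain-text OCR on one image (hypothetical helper, not part of the model repo)."""
    # Fail early, before loading any weights, if the image is missing.
    if not Path(image_file).is_file():
        raise FileNotFoundError(f"Input image not found: {image_file}")

    # Imported lazily so the path check above works without transformers installed.
    from transformers import AutoModel, AutoTokenizer

    # ASSUMPTION: the repo is tagged custom_code, so remote code must be trusted.
    tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
    model = AutoModel.from_pretrained(model_id, trust_remote_code=True)
    model = model.eval().cuda()

    # ASSUMPTION: the remote code exposes a GOT-OCR2.0-style
    # chat(tokenizer, image_path, ocr_type=...) method.
    return model.chat(tokenizer, image_file, ocr_type="ocr")
```

With a CUDA device and the model downloaded, `print(run_plain_ocr("test.png"))` would mirror the README example; the early path check keeps a mistyped filename from costing a full model load.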
 ---
+## 3. Performance Notes & Limitations
+While CrossLing-OCR-Mini achieves strong overall performance, several limitations remain:
+* OCR accuracy on **Mongolian and Uyghur** still has room for improvement
+* Performance may degrade on extremely noisy, handwritten, or out-of-distribution inputs
+These challenges will be addressed in future versions of the model.
 ---
+## 4. Model Variants
+| Version                       | Intended Use                | Availability        |
+| ----------------------------- | --------------------------- | ------------------- |
+| **CrossLing-OCR-Mini**        | Research & academic use     | ✅ Open-sourced      |
 | **CrossLing-OCR-Pro-Preview** | Commercial / production use | Contact required |
+📩 For access to **CrossLing-OCR-Pro-Preview**, please contact:
+**[zhumx@ncut.edu.cn](mailto:zhumx@ncut.edu.cn)**
+The performance differences between the Mini and Pro-Preview versions are illustrated below.
+![Mini\_Pro-Preview](https://cdn-uploads.huggingface.co/production/uploads/6956446a7ebeda1aa80be895/EcKEhwz-6VzPCmHqszIJy.png)
 ---
+## 5. Intended Use
+This model is **strictly intended for**:
+* Academic research
+* Scientific experimentation
+* OCR benchmarking and method comparison
+* Low-resource language OCR studies
 ---
+## 6. Prohibited Use & Disclaimer
 This model **must not be used** for:
+* Any illegal or unlawful activities
+* Applications violating social ethics, public order, or applicable laws
+* Surveillance, discrimination, or harmful automated decision-making
+**Disclaimer**:
+* Any misuse of this model is **solely the responsibility of the user**
+* The authors and maintainers **do not endorse** and **are not liable for** any consequences arising from improper or malicious use
+* Outputs generated by this model **do not represent the views or positions of the authors**
+---
+## 7. Ethical Considerations & Bias
+CrossLing-OCR-Mini is developed to support research on **low-resource and underrepresented languages**.
+However, like all OCR systems, the model may reflect biases present in its training data, including:
+* Uneven performance across languages and scripts
+* Sensitivity to document quality, typography, and layout styles
+Users are encouraged to:
+* Carefully evaluate outputs before downstream use
+* Avoid deploying the model in high-risk or sensitive decision-making scenarios
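One way to act on the first recommendation is to measure character error rate (CER) per language on a small labeled sample before using the model downstream. The sketch below is a plain-Python illustration; the `edit_distance` and `cer` helpers and the sample strings are hypothetical and are not part of the model's tooling or its reported evaluation.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between two strings, counted in Unicode code points."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # delete a reference character
                            curr[j - 1] + 1,      # insert a hypothesis character
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def cer(ref: str, hyp: str) -> float:
    """Character error rate: edit distance divided by reference length."""
    if not ref:
        return 0.0 if not hyp else 1.0
    return edit_distance(ref, hyp) / len(ref)


# Made-up (reference, prediction) pairs grouped by language code.
samples = {
    "en": [("hello world", "hello wor1d")],
    "zh": [("你好世界", "你好世界")],
}
for lang, pairs in samples.items():
    scores = [cer(ref, hyp) for ref, hyp in pairs]
    print(lang, round(sum(scores) / len(scores), 3))
```

This counts code points rather than grapheme clusters, so scores for scripts that rely on combining marks may differ from a grapheme-level CER. Reporting the score per language makes uneven performance visible instead of averaging it away.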
 ---
+## 8. License
+This model is released **for research purposes only**.
 Commercial use is **not permitted** without explicit authorization.
+For commercial licensing or extended usage, please contact the authors.
 ---
+## 9. Citation
 If you use CrossLing-OCR-Mini in your research, please cite:
   year      = {2025},
   note      = {Research-only OCR model}
 }
+```
+---
+## 10. Contact
+For questions, collaboration, or commercial inquiries:
+📧 **[zhumx@ncut.edu.cn](mailto:zhumx@ncut.edu.cn)**
+---
+## 11. Acknowledgement
+This project aims to advance **low-resource multilingual OCR research** and contribute to the accessibility of underrepresented languages in the global AI ecosystem.