NCUTNLP committed on
Commit b1ef37e · verified · 1 Parent(s): a81e021

Update README.md

  1. README.md +103 -60
README.md CHANGED
@@ -1,51 +1,50 @@
1
  # CrossLing-OCR-Mini
2
 
3
- 🚀 **CrossLing-OCR-Mini** is a lightweight yet powerful OCR model designed for **low-resource multilingual and complex-layout document scenarios**.
4
- The model focuses on accurate text recognition while preserving original document structure, making it suitable for multilingual document understanding research.
5
 
6
  ---
7
 
8
- ## 🔍 Model Overview
9
 
10
- CrossLing-OCR-Mini is optimized for **low-resource and structurally complex languages**, achieving strong performance across **11 languages** while remaining deployable on **consumer-grade hardware**.
 
11
 
12
- **Key features:**
13
- - Accurate text recognition with layout/format preservation
14
- - Optimized for low-resource scripts
15
- - Lightweight (~580MB) and easy to deploy
16
- - Designed for research and benchmarking purposes
17
 
18
- ### Supported & Optimized Languages
19
- - High-resource: Chinese, English
20
- - Low-resource (specially optimized):
21
  **Tibetan, Mongolian, Kazakh, Kyrgyz, Zhuang**
22
-
23
- Experimental results show that CrossLing-OCR-Mini **outperforms or matches mainstream OCR systems** on multiple low-resource languages.
24
 
 
25
 
26
- ## 🚀 Usage / Inference
27
-
28
- You can easily perform inference with CrossLing-OCR-Mini using the 🤗 Transformers library.
29
- The following example demonstrates a simple OCR inference pipeline on a single image.
30
-
31
- 🔧 Requirements
32
 
33
- Python 3.8
34
 
35
- transformers (latest recommended)
 
36
 
37
- CUDA-enabled GPU (recommended for better performance)
 
 
 
38
 
39
- ```
40
  pip install -U transformers accelerate
41
- ```
42
 
43
- ## 🧪 Simple OCR Inference Example
44
- ```
 
45
  from transformers import AutoModel, AutoTokenizer
46
- import os
47
 
48
- # Path or Hugging Face model id
49
  model_id = "NCUTNLP/CrossLing-OCR-Mini"
50
 
51
  # Load tokenizer and model
@@ -65,7 +64,7 @@ model = AutoModel.from_pretrained(
65
 
66
  model = model.eval().cuda()
67
 
68
- # Input image for OCR
69
  image_file = "test.png"
70
 
71
  # Perform plain text OCR
@@ -79,67 +78,94 @@ print("Predicted OCR result:\n")
79
  print(result)
80
  ```
81
 
 
 
 
 
 
 
 
82
  ---
83
 
84
- ## 🧪 Performance Notes & Limitations
 
 
85
 
86
- While CrossLing-OCR-Mini achieves strong overall performance, we note that:
87
- - **Mongolian and Uyghur** OCR accuracy still has room for improvement
88
- - Performance may degrade in extremely noisy, handwritten, or out-of-distribution scenarios
89
 
90
- These limitations will be addressed in future iterations of the model.
91
 
92
  ---
93
 
94
- ## 📦 Model Variants
95
 
96
- | Version | Purpose | Availability |
97
- |------|------|------|
98
- | **CrossLing-OCR-Mini** | Research & academic use | ✅ Open-sourced |
99
  | **CrossLing-OCR-Pro-Preview** | Commercial / production use | Contact required |
100
 
101
- 📩 For access to **CrossLing-OCR-Pro-Preview**, please contact:
102
- **zhumx@ncut.edu.cn**, The performance differences between the two different versions of the model are shown in the following figure.
103
 
 
104
 
105
- ![Mini_Pro-Preview](https://cdn-uploads.huggingface.co/production/uploads/6956446a7ebeda1aa80be895/EcKEhwz-6VzPCmHqszIJy.png)
106
 
107
  ---
108
 
109
- ## 🎯 Intended Use
110
 
111
- **This model is intended solely for:**
112
- - Academic research
113
- - Scientific experimentation
114
- - Benchmarking and method comparison
115
- - Low-resource language OCR studies
 
116
 
117
  ---
118
 
119
- ## 🚫 Prohibited Use & Disclaimer
120
 
121
  This model **must not be used** for:
122
- - Any illegal or unlawful activities
123
- - Any applications that violate social ethics, public order, or applicable laws
124
- - Surveillance, discrimination, or harmful decision-making systems
125
 
126
- ⚠️ **Disclaimer**:
127
- - Any misuse of this model is **strictly the responsibility of the user**
128
- - The authors and maintainers **do not endorse** and are **not liable for** any consequences arising from improper or malicious use
129
- - Views or actions enabled by this model **do not reflect the opinions of the authors**
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
130
 
131
  ---
132
 
133
- ## ⚖️ License
134
 
135
- This model is released **for research purposes only**.
136
  Commercial use is **not permitted** without explicit authorization.
137
 
138
- (Please contact the authors for commercial licensing or extended usage.)
139
 
140
  ---
141
 
142
- ## 📖 Citation
143
 
144
  If you use CrossLing-OCR-Mini in your research, please cite:
145
 
@@ -150,3 +176,20 @@ If you use CrossLing-OCR-Mini in your research, please cite:
150
  year = {2025},
151
  note = {Research-only OCR model}
152
  }
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
  # CrossLing-OCR-Mini
2
 
3
+ 🚀 **CrossLing-OCR-Mini** is a lightweight OCR model designed for **low-resource languages and complex document layouts**.
4
+ The model emphasizes accurate text recognition while preserving original document structure, making it particularly suitable for **multilingual OCR research and academic benchmarking**.
5
 
6
  ---
7
 
8
+ ## 1. Model Overview
9
 
10
+ CrossLing-OCR-Mini targets OCR scenarios involving **low-resource scripts, diverse writing directions, and complex layouts**.
11
+ Despite its compact size (~580MB), the model demonstrates strong recognition performance across **11 languages**, while remaining deployable on **consumer-grade GPUs**.
12
 
13
+ ### Key Features
14
+ - Multilingual OCR with structure-aware text recognition
15
+ - Specialized optimization for low-resource and complex scripts
16
+ - Lightweight (~580MB) and efficient inference
17
+ - Designed exclusively for research and academic benchmarking
18
 
19
+ ### Supported Languages
20
+ - **High-resource languages**: Chinese, English
21
+ - **Low-resource languages (specially optimized)**:
22
  **Tibetan, Mongolian, Kazakh, Kyrgyz, Zhuang**
 
 
23
 
24
+ Experimental results indicate that CrossLing-OCR-Mini **outperforms or matches mainstream OCR systems** on multiple low-resource languages.
25
 
26
+ ---
 
 
 
 
 
27
 
28
+ ## 2. Usage / Inference
29
 
30
+ CrossLing-OCR-Mini can be used directly with the 🤗 **Transformers** library.
31
+ The following example demonstrates **single-image OCR inference** for plain text recognition.
32
 
33
+ ### Requirements
34
+ - Python ≥ 3.8
35
+ - `transformers` (latest version recommended)
36
+ - CUDA-enabled GPU (recommended for optimal performance)
37
 
38
+ ```bash
39
  pip install -U transformers accelerate
40
+ ```
41
 
42
+ ### Simple OCR Inference Example
43
+
44
+ ```python
45
  from transformers import AutoModel, AutoTokenizer
 
46
 
47
+ # Hugging Face model id
48
  model_id = "NCUTNLP/CrossLing-OCR-Mini"
49
 
50
  # Load tokenizer and model
 
64
 
65
  model = model.eval().cuda()
66
 
67
+ # Input image
68
  image_file = "test.png"
69
 
70
  # Perform plain text OCR
 
78
  print(result)
79
  ```
80
 
81
+ ### Notes
82
+
83
+ * `ocr_type="ocr"` enables plain text OCR mode
84
+ * The model automatically handles multilingual text recognition
85
+ * For best results, input images should be clear and upright
86
+ * Consumer-grade GPUs (e.g., RTX 3060 / 3090) are sufficient for inference
87
+
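The example above handles a single image. A minimal sketch of extending it to a folder of images follows. Because the model's inference call is not fully shown in this diff, the sketch takes it as a user-supplied callable, `ocr_fn` (a hypothetical name). The `IMAGE_SUFFIXES` set and the `run_folder` helper are likewise illustrative assumptions, not part of the released code.

```python
from pathlib import Path
from typing import Callable, Dict

# Illustrative set of image extensions (assumption); adjust to your data.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".bmp", ".tif", ".tiff"}


def run_folder(folder: str, ocr_fn: Callable[[str], str]) -> Dict[str, str]:
    """Apply ocr_fn to every image file in folder; map file name -> text.

    ocr_fn is a hypothetical wrapper around the model's single-image call:
    it receives an image path and returns the recognized text.
    """
    results: Dict[str, str] = {}
    # Sort for a deterministic processing order.
    for path in sorted(Path(folder).iterdir()):
        if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES:
            results[path.name] = ocr_fn(str(path))
    return results
```

Passing the OCR call in as a function keeps the loop independent of how the model is loaded, so the same helper can be exercised with a stub before a GPU is available.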
88
  ---
89
 
90
+ ## 3. Performance Notes & Limitations
91
+
92
+ While CrossLing-OCR-Mini achieves strong overall performance, several limitations remain:
93
 
94
+ * OCR accuracy on **Mongolian and Uyghur** still has room for improvement
95
+ * Performance may degrade on extremely noisy, handwritten, or out-of-distribution inputs
 
96
 
97
+ These challenges will be addressed in future versions of the model.
98
 
99
  ---
100
 
101
+ ## 4. Model Variants
102
 
103
+ | Version | Intended Use | Availability |
104
+ | ----------------------------- | --------------------------- | ------------------- |
105
+ | **CrossLing-OCR-Mini** | Research & academic use | ✅ Open-sourced |
106
  | **CrossLing-OCR-Pro-Preview** | Commercial / production use | Contact required |
107
 
108
+ 📩 For access to **CrossLing-OCR-Pro-Preview**, please contact:
109
+ **[zhumx@ncut.edu.cn](mailto:zhumx@ncut.edu.cn)**
110
 
111
+ The performance differences between the Mini and Pro-Preview versions are illustrated below.
112
 
113
+ ![Mini\_Pro-Preview](https://cdn-uploads.huggingface.co/production/uploads/6956446a7ebeda1aa80be895/EcKEhwz-6VzPCmHqszIJy.png)
114
 
115
  ---
116
 
117
+ ## 5. Intended Use
118
 
119
+ This model is **strictly intended for**:
120
+
121
+ * Academic research
122
+ * Scientific experimentation
123
+ * OCR benchmarking and method comparison
124
+ * Low-resource language OCR studies
125
 
126
  ---
127
 
128
+ ## 6. Prohibited Use & Disclaimer
129
 
130
  This model **must not be used** for:
 
 
 
131
 
132
+ * Any illegal or unlawful activities
133
+ * Applications violating social ethics, public order, or applicable laws
134
+ * Surveillance, discrimination, or harmful automated decision-making
135
+
136
+ **Disclaimer**:
137
+
138
+ * Any misuse of this model is **solely the responsibility of the user**
139
+ * The authors and maintainers **do not endorse** and **are not liable for** any consequences arising from improper or malicious use
140
+ * Outputs generated by this model **do not represent the views or positions of the authors**
141
+
142
+ ---
143
+
144
+ ## 7. Ethical Considerations & Bias
145
+
146
+ CrossLing-OCR-Mini is developed to support research on **low-resource and underrepresented languages**.
147
+ However, like all OCR systems, the model may reflect biases present in its training data, including:
148
+
149
+ * Uneven performance across languages and scripts
150
+ * Sensitivity to document quality, typography, and layout styles
151
+
152
+ Users are encouraged to:
153
+
154
+ * Carefully evaluate outputs before downstream use
155
+ * Avoid deploying the model in high-risk or sensitive decision-making scenarios
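
One common way to evaluate OCR output before downstream use is the character error rate (CER): the Levenshtein edit distance between the prediction and a reference transcription, divided by the reference length. The sketch below is a plain-Python illustration. The function names are hypothetical, and this is not necessarily the metric behind the results reported for this model.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between two strings (insert/delete/substitute)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(ref: str, hyp: str) -> float:
    """Character error rate: edit distance divided by reference length."""
    if not ref:
        raise ValueError("reference text must be non-empty")
    return edit_distance(ref, hyp) / len(ref)
```

Python compares strings per Unicode code point, so for scripts that use combining marks the score depends on how both texts are normalized. Computing CER separately per language makes uneven performance across scripts visible.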
156
 
157
  ---
158
 
159
+ ## 8. License
160
 
161
+ This model is released **for research purposes only**.
162
  Commercial use is **not permitted** without explicit authorization.
163
 
164
+ For commercial licensing or extended usage, please contact the authors.
165
 
166
  ---
167
 
168
+ ## 9. Citation
169
 
170
  If you use CrossLing-OCR-Mini in your research, please cite:
171
 
 
176
  year = {2025},
177
  note = {Research-only OCR model}
178
  }
179
+ ```
180
+
181
+ ---
182
+
183
+ ## 10. Contact
184
+
185
+ For questions, collaboration, or commercial inquiries:
186
+
187
+ 📧 **[zhumx@ncut.edu.cn](mailto:zhumx@ncut.edu.cn)**
188
+
189
+ ---
190
+
191
+ ## 11. Acknowledgement
192
+
193
+ This project aims to advance **low-resource multilingual OCR research** and contribute to the accessibility of underrepresented languages in the global AI ecosystem.
194
+