0cve0 committed on
Commit c4103f0 · verified · 1 Parent(s): 4f70d4b

Update README.md

Files changed (1)
  1. README.md +118 -5
README.md CHANGED
@@ -1,6 +1,119 @@
- **Disclaimer:**
- The model weights in this repository are extracted from Google's official GMS (Google Play Services) /
- Google Translate packages and are the sole property of Google LLC. They are hosted here for research,
- educational, and non-commercial purposes only. Any commercial application of these weights is subject to
- Google's own terms of service.
+ ---
+ license: apache-2.0
+ tags:
+ - ocr
+ - text-detection
+ - text-recognition
+ - tflite
+ - mlkit
+ - google-mlkit
+ - on-device
+ - image-to-text
+ pipeline_tag: image-to-text
+ language:
+ - en
+ - ru
+ - zh
+ - ja
+ - ko
+ - ar
+ - he
+ - bn
+ - gu
+ - kn
+ - ml
+ - ta
+ - te
+ - ka
+ - vi
+ ---
 
+ # OpenMLkit OCR Models
+
+ This repository hosts lightweight on-device OCR (Optical Character Recognition) models extracted from Google ML Kit APK components. The models run fully offline and are compatible with [OpenMLkitOCR](https://github.com/0cve0/OpenMLkitOCR), an offline OCR engine for Python.
+
+ The repository includes:
+ * **Text Detection Model**: A Region Proposal Network (RPN) detector (`rpn_detector.tflite`) that predicts text bounding boxes in image tiles.
+ * **Text Recognition Models**: Lightweight CRNN + CTC recognizers for 15+ languages and scripts.
+
+ ---
+
+ ## Supported Languages and Scripts
+
+ The table below lists the models, label maps, and language-model resources stored in this repository:
+
+ | Code | Script / Language | Model File | Label Map File | Language Model / Priors |
+ | :--- | :--- | :--- | :--- | :--- |
+ | `detector` | **Text Detection (All)** | [rpn_detector.tflite](./detector/rpn_detector.tflite) | - | - |
+ | `en` | **Latin / English** | [line_recognizer.fb](./en/line_recognizer.fb) | [LabelMap.pb](./en/LabelMap.pb) | - |
+ | `ru` | **Cyrillic / Russian** | [recognizer_cyrl.tflite](./ru/recognizer_cyrl.tflite) | [LabelMap_cyrl.pb](./ru/LabelMap_cyrl.pb) | FST LM + Priors |
+ | `zh` | **Chinese / Han (Hani)** | [recognizer_hani.tflite](./zh/recognizer_hani.tflite) | [recognizer_hani_label_map.pb](./zh/recognizer_hani_label_map.pb) | FST LM + Priors |
+ | `ja` | **Japanese (Jpan)** | [recognizer_jpan.tflite](./ja/recognizer_jpan.tflite) | [recognizer_jpan_label_map.pb](./ja/recognizer_jpan_label_map.pb) | FST LM + Priors |
+ | `ko` | **Korean (Kore)** | [recognizer_kore.tflite](./ko/recognizer_kore.tflite) | [recognizer_kore_label_map.pb](./ko/recognizer_kore_label_map.pb) | FST LM + Priors |
+ | `ar` | **Arabic (Arab)** | [recognizer_arab_retrained.tflite](./ar/recognizer_arab_retrained.tflite) | [recognizer_arab_label_map.pb](./ar/recognizer_arab_label_map.pb) | FST LM + Priors |
+ | `he` | **Hebrew (Hebr)** | [hebr.tflite](./he/hebr.tflite) | [hebr_label_map.pb](./he/hebr_label_map.pb) | Priors |
+ | `ka` | **Georgian (Geor)** | [geor.tflite](./ka/geor.tflite) | [geor_label_map.pb](./ka/geor_label_map.pb) | Priors |
+ | `bn` | **Bengali & Devanagari (Bede)** | [bede.tflite](./bn/bede.tflite) | [bede_label_map.pb](./bn/bede_label_map.pb) | Priors |
+ | `gu` | **Gujarati (Gujr)** | [gocr_tflite_recognizer_gujr.tflite](./gu/gocr_tflite_recognizer_gujr.tflite) | [gocr_tflite_recognizer_gujr_label_map.pb](./gu/gocr_tflite_recognizer_gujr_label_map.pb) | Priors |
+ | `kn` | **Kannada (Knda)** | [recognizer_knda.tflite](./kn/recognizer_knda.tflite) | [recognizer_knda_label_map.pb](./kn/recognizer_knda_label_map.pb) | FST LM + Priors |
+ | `ml` | **Malayalam (Mlym)** | [recognizer_mlym.tflite](./ml/recognizer_mlym.tflite) | [recognizer_mlym_label_map.pb](./ml/recognizer_mlym_label_map.pb) | FST LM + Priors |
+ | `ta` | **Tamil (Taml)** | [recognizer_taml.tflite](./ta/recognizer_taml.tflite) | [recognizer_taml_label_map.pb](./ta/recognizer_taml_label_map.pb) | FST LM + Priors |
+ | `te` | **Telugu (Telu)** | [recognizer_telu.tflite](./te/recognizer_telu.tflite) | [recognizer_telu_label_map.pb](./te/recognizer_telu_label_map.pb) | FST LM + Priors |
+ | `vi` | **Vietnamese / Latin** | [gocr_tflite_recognizer_latn_vi.tflite](./vi/gocr_tflite_recognizer_latn_vi.tflite) | [gocr_tflite_recognizer_latn_vi_label_map.pb](./vi/gocr_tflite_recognizer_latn_vi_label_map.pb) | Priors |
+
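Editorial sketch (not part of the commit): the table above can be read as a small registry keyed by language code. The snippet below copies three rows into a dictionary and resolves the repo-relative paths a pipeline would need. The names `MODEL_FILES`, `files_for`, and `download` are hypothetical helpers, not part of `openmlkitOCR`; the repository id is the one used in the README's usage example, and `hf_hub_download` is the standard `huggingface_hub` call for fetching a single file.

```python
# Repo-relative paths copied from three rows of the table above.
MODEL_FILES = {
    "detector": {"model": "detector/rpn_detector.tflite"},
    "en": {"model": "en/line_recognizer.fb", "label_map": "en/LabelMap.pb"},
    "ru": {"model": "ru/recognizer_cyrl.tflite", "label_map": "ru/LabelMap_cyrl.pb"},
}


def files_for(lang):
    """Return the detector path followed by the recognizer files for `lang`."""
    if lang == "detector" or lang not in MODEL_FILES:
        raise KeyError(f"no recognizer registered for {lang!r}")
    return [MODEL_FILES["detector"]["model"], *MODEL_FILES[lang].values()]


def download(lang, repo_id="0cve0/OpenMLKitOCR"):
    """Fetch the files into the local Hugging Face cache (needs network access)."""
    # Imported lazily so the lookup above works without huggingface_hub installed.
    from huggingface_hub import hf_hub_download

    return [hf_hub_download(repo_id=repo_id, filename=name) for name in files_for(lang)]


print(files_for("ru"))
```

Only the path lookup runs offline; `download` is shown for completeness and assumes the files exist at those paths in the repository.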
+ ---
+
+ ## 📂 File Types Explained
+
+ 1. **`*.tflite` / `*.fb` (Neural Network weights)**:
+    * `detector/rpn_detector.tflite` is a Convolutional Neural Network (CNN) that processes `256x256` tiles of the image and predicts text bounding boxes.
+    * The recognizer `*.tflite` files (for example `recognizer_cyrl.tflite`) and `line_recognizer.fb` are CRNN (Convolutional Recurrent Neural Network) models that predict CTC logits for cropped text-line images.
+ 2. **`*_label_map.pb` / `LabelMap.pb` (Vocabulary)**:
+    * Binary Protobuf files that map character indices to Unicode symbols for decoding CTC outputs.
+ 3. **`*_lm.compact_fst.gz` & `*.syms` (Language Models)**:
+    * Compact Finite State Transducer (FST) language models and symbol tables. They are used during beam search decoding to correct spelling and character sequences based on word frequencies.
+ 4. **`*_prior.pb` / `*_config.pb` (Priors & Config)**:
+    * Character prior probabilities and model configurations used to calibrate neural network outputs before the language model is applied.
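Editorial sketch (not part of the commit): the detector is described as working on `256x256` tiles. One simple way to cover an arbitrary image with such tiles is a non-overlapping row-major grid, shown below. This is an assumption for illustration only; the README does not say whether the real pipeline overlaps, pads, or rescales tiles, and `tile_boxes` is a hypothetical helper.

```python
def tile_boxes(width, height, tile=256):
    """Return (x, y, w, h) boxes that cover the image in row-major order."""
    if width <= 0 or height <= 0 or tile <= 0:
        raise ValueError("width, height and tile must be positive")
    boxes = []
    for y in range(0, height, tile):
        for x in range(0, width, tile):
            # Edge tiles are clipped to the image bounds; a caller could
            # pad them back up to tile x tile before running the detector.
            boxes.append((x, y, min(tile, width - x), min(tile, height - y)))
    return boxes


boxes = tile_boxes(600, 300)
print(len(boxes), boxes[-1])
```

For a 600x300 image this yields a 3x2 grid, with narrower tiles on the right and bottom edges.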
+
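Editorial sketch (not part of the commit): the recognizers emit CTC logits, and the label maps turn class indices into characters. The simplest decoder is greedy (best-path) CTC decoding: take the top class at each time step, merge consecutive repeats, then drop the blank symbol. The snippet uses a toy three-class vocabulary; the blank index of the actual models is not stated in the README, so it is a parameter here, and the beam search with FST language model mentioned above is not reproduced.

```python
from itertools import groupby


def ctc_greedy_decode(logits, labels, blank=0):
    """Collapse repeated best-path indices, then drop blanks."""
    # Best class per time step (argmax over the class axis).
    best_path = [max(range(len(step)), key=step.__getitem__) for step in logits]
    # Merge consecutive duplicates, then remove the blank symbol.
    collapsed = [idx for idx, _ in groupby(best_path)]
    return "".join(labels[idx] for idx in collapsed if idx != blank)


# Toy example: 3 classes (blank, "a", "b") over 6 time steps.
labels = ["", "a", "b"]
logits = [
    [0.1, 2.0, 0.3],  # a
    [0.2, 1.5, 0.1],  # a (repeat, merged with the previous step)
    [3.0, 0.1, 0.2],  # blank
    [0.1, 2.2, 0.4],  # a (kept, because a blank separates it)
    [0.0, 0.3, 1.9],  # b
    [2.5, 0.2, 0.1],  # blank
]
print(ctc_greedy_decode(logits, labels))  # prints "aab"
```

The blank between the two runs of `a` is what lets CTC represent doubled letters; without it, both runs would merge into a single character.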
+ ---
+
+ ## 🚀 How to Use with `openmlkitOCR`
+
+ The Python package `openmlkitOCR` automatically downloads these models from Hugging Face and caches them if they are not present locally.
+
+ ### 1. Installation
+ Install the library directly using pip:
+ ```bash
+ pip install openmlkitOCR
+ ```
+
+ ### 2. Python Usage Example
+ ```python
+ import os
+ import cv2
+ from openmlkit import OpenMLKitOCR
+
+ # Configure the pipeline to pull models from this Hugging Face repository
+ os.environ["OPENMLKIT_MODEL_REPO"] = "0cve0/OpenMLKitOCR"
+
+ # Initialize the pipeline for a specific language (e.g., 'ru' for Russian/Cyrillic)
+ # This will automatically download and cache the detector and recognizer files.
+ ocr = OpenMLKitOCR(lang='ru')
+
+ # Load an image; cv2.imread returns None if the file is missing or unreadable
+ image = cv2.imread("test_image.jpg")
+ if image is None:
+     raise FileNotFoundError("test_image.jpg could not be read")
+
+ # Run OCR (detection & recognition)
+ results = ocr.run(image, score_threshold=0.35)
+
+ # Output the localized text bounding boxes and recognized characters
+ for item in results:
+     print(f"Box: {item['box']} -> Text: {item['text']}")
+ ```
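Editorial sketch (not part of the commit): the usage example prints items in whatever order the pipeline returns them. If reading order matters, the results can be sorted afterwards. The helper below is hypothetical and assumes each `box` starts with `(x_min, y_min, ...)` in pixel coordinates, which the README does not confirm; only the `box` and `text` keys come from the example itself.

```python
def reading_order(results, line_tolerance=10):
    """Sort OCR items top-to-bottom, then left-to-right within a line."""

    def key(item):
        x_min, y_min = item["box"][0], item["box"][1]
        # Quantise y so boxes on roughly the same line sort by x.
        return (round(y_min / line_tolerance), x_min)

    return sorted(results, key=key)


# Stand-in data shaped like the items printed in the usage example.
sample = [
    {"box": (120, 12, 200, 40), "text": "world"},
    {"box": (10, 60, 90, 88), "text": "next"},
    {"box": (10, 10, 100, 40), "text": "hello"},
]
print(" ".join(item["text"] for item in reading_order(sample)))  # hello world next
```

Quantising by a fixed tolerance can split a line whose boxes straddle a bucket boundary, and the left-to-right rule does not suit right-to-left scripts such as Arabic or Hebrew, so treat this as a starting point only.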
+
+ ---
+
+ ## ⚖️ License and Disclaimer
+
+ * **Software**: The Python library [OpenMLkitOCR](https://github.com/0cve0/OpenMLkitOCR) is licensed under the **Apache 2.0 License**.
+ * **Model Weights**: The model weights and configurations in this repository are extracted from Google ML Kit APK components and are subject to Google's terms of service and license agreements. These models are intended for educational, research, and non-commercial local testing purposes.