---
license: apache-2.0
datasets:
- Yaredoffice/geez-characters
language:
- am
- ti
pipeline_tag: image-classification
library_name: keras
tags:
- geez
- characters
- ocr
- geez ocr
- amharic
- tigrinya
- onnx
---

# Geez Character OCR (Geez-Net)

This model is a high-performance Optical Character Recognition (OCR) system designed specifically for the **Geez script** (Amharic, Tigrinya). It uses a Convolutional Neural Network (CNN) to classify individual handwritten Geez characters from images with high accuracy.

## Model Details

### Model Description

This model addresses the challenge of digital recognition for the Geez script with a deep CNN architecture. It accepts a single-character image and outputs one of 287 possible character classes. The model has been optimized for web deployment via ONNX Runtime.

- **Developed by:** Yared Kassa
- **Shared by:** Yared Kassa
- **Model type:** Convolutional Neural Network (CNN) for image classification
- **Language(s):** Amharic, Tigrinya (Geez script)
- **License:** apache-2.0
- **Finetuned from model:** N/A (trained from scratch)

### Model Sources

- **Repository:** [Yaredoffice/geez-characters](https://huggingface.co/Yaredoffice/geez-characters)

## Uses

### Direct Use

The model is intended for digitizing handwritten Geez documents, educational language-learning tools, and automated data-entry systems. Users supply a cropped image of a single handwritten character, and the model returns the predicted character class and a confidence score.

### Downstream Use

N/A (this is a standalone classification model).


### Out-of-Scope Use

The model is **not** designed for:
- Full-document OCR (it does not perform word segmentation or layout analysis).
- Recognition of non-Geez scripts (Latin, Arabic, etc.).
- Recognition of cursive or heavily stylized fonts not present in the training data.


## Bias, Risks, and Limitations

### Limitations

1. **Single-character input:** The model requires pre-segmented single-character images; it cannot process whole words or sentences directly.
2. **Input quality:** Performance may degrade on low-resolution or very noisy images without pre-processing.
3. **Data bias:** Although trained on ~400k augmented images, the model may be biased toward the handwriting styles present in the original 13k-image source dataset.


### Recommendations

Users should implement a pre-processing pipeline that segments words into individual characters before feeding them to this model. Each character image should be resized to 128x128 pixels, converted to grayscale, and normalized.
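As a starting point for the segmentation step, character column boundaries in a binarized line image can be found with a simple vertical-projection scan. The `segment_characters` helper below is an illustrative sketch, not part of this model; the minimum-width filter is an assumption to suppress stray ink specks.

```python
import numpy as np

def segment_characters(ink_mask, min_width=3):
    """Split a binarized text-line image into character column spans.

    ink_mask: 2D boolean array (H, W), True where ink is present.
    Returns a list of (start_col, end_col) spans, end exclusive.
    """
    # A column belongs to a character if any row in it contains ink.
    ink_cols = ink_mask.any(axis=0)
    spans, start = [], None
    for x, has_ink in enumerate(ink_cols):
        if has_ink and start is None:
            start = x                     # entering an ink run
        elif not has_ink and start is not None:
            if x - start >= min_width:    # drop runs narrower than min_width
                spans.append((start, x))
            start = None
    if start is not None and len(ink_cols) - start >= min_width:
        spans.append((start, len(ink_cols)))  # run reaching the right edge
    return spans
```

Each returned span can then be cropped, padded to square, and resized to 128x128 before classification. Real handwriting may need a smarter splitter (e.g. connected components) for touching characters.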

## Evaluation

### Testing Data, Factors & Metrics

#### Metrics

- **Accuracy:** the primary evaluation metric.

#### Inference Performance

- **Single-image inference:** 81% baseline accuracy.
- **Test-Time Augmentation (TTA):**
  - Configuration: 10 augmentations with majority voting.
  - Result: approximately **90% classification accuracy**.
  - Impact: significantly reduces errors caused by handwriting variability.
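The TTA scheme above can be sketched as follows. Here `run_model` is a hypothetical callable wrapping the ONNX session, and the small random translation is an assumed stand-in for whatever augmentations the original evaluation used.

```python
import numpy as np

def predict_with_tta(run_model, image, n_aug=10, seed=0):
    """Majority vote over predictions on lightly augmented copies.

    run_model: callable mapping a (1, 1, 128, 128) array to class scores.
    image: preprocessed input of shape (1, 1, 128, 128).
    """
    rng = np.random.default_rng(seed)
    votes = []
    for _ in range(n_aug):
        # Small random translation as a stand-in augmentation.
        dy, dx = rng.integers(-2, 3, size=2)
        shifted = np.roll(image, shift=(int(dy), int(dx)), axis=(2, 3))
        votes.append(int(np.argmax(run_model(shifted))))
    # Return the most frequently predicted class.
    return max(set(votes), key=votes.count)
```

Majority voting trades roughly `n_aug` times the inference cost for robustness to stroke placement, which matches the reported jump from 81% to ~90% accuracy.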

## How to Get Started with the Model

Use the code below to run inference with the ONNX export.

```python
import onnxruntime as ort
import numpy as np
from PIL import Image

# 1. Load the ONNX model
session = ort.InferenceSession("cnn_output.onnx")

# 2. Preprocess the input image
def preprocess_image(image_path):
    img = Image.open(image_path).convert('L')  # convert to grayscale
    img = img.resize((128, 128), Image.Resampling.LANCZOS)
    # Convert to a float array normalized to [0, 1]
    img_array = np.array(img).astype('float32') / 255.0
    # Add batch and channel dimensions -> (1, 1, 128, 128)
    img_array = np.expand_dims(np.expand_dims(img_array, axis=0), axis=0)
    return img_array

input_data = preprocess_image("path/to/geez_char.jpg")

# 3. Run inference
input_name = session.get_inputs()[0].name
output_name = session.get_outputs()[0].name
predictions = session.run([output_name], {input_name: input_data})[0]

# 4. Get the predicted class and a softmax confidence score
logits = predictions.flatten()
probs = np.exp(logits - logits.max()) / np.exp(logits - logits.max()).sum()
predicted_class_index = int(np.argmax(probs))
print(f"Predicted Class ID: {predicted_class_index} "
      f"(confidence: {probs[predicted_class_index]:.2%})")
```