---
license: apache-2.0
datasets:
- Yaredoffice/geez-characters
language:
- am
- ti
pipeline_tag: image-classification
library_name: keras
tags:
- geez
- characters
- ocr
- geez ocr
- amharic
- tigrinya
- onnx
---

# Geez Character OCR (Geez-Net)

This model is a high-performance Optical Character Recognition (OCR) system designed for the **Geez script** (Amharic, Tigrinya). It uses a Convolutional Neural Network (CNN) to classify individual handwritten Geez characters from images with high accuracy.

## Model Details

### Model Description

This model addresses the challenge of digital recognition for the Geez script with a deep CNN architecture. It accepts a single character image as input and outputs one of 287 possible character classes. It has been optimized for web deployment with the ONNX Runtime.

- **Developed by:** Yared Kassa
- **Shared by:** Yared Kassa
- **Model type:** Convolutional Neural Network (CNN) for Image Classification
- **Language(s):** Amharic, Tigrinya (Geez Script)
- **License:** apache-2.0
- **Finetuned from model:** Trained from scratch

### Model Sources

- **Repository:** [Yaredoffice/geez-characters](https://huggingface.co/Yaredoffice/geez-characters)

## Uses

### Direct Use

The model is intended for direct use in digitizing handwritten Geez documents, educational language-learning tools, and automated data-entry systems. Users input a cropped image of a handwritten character, and the model returns the predicted character class and a confidence score.

### Downstream Use

N/A (this is a standalone classification model).

### Out-of-Scope Use

The model is **not** designed for:

- Full document OCR (it does not perform word segmentation or layout analysis).
- Recognition of non-Geez scripts (Latin, Arabic, etc.).
- Recognition of cursive or heavily stylized fonts not present in the training data.

## Bias, Risks, and Limitations

### Limitations
1. **Single Character Input:** The model requires pre-segmented single-character images. It cannot process whole words or sentences directly.
2. **Input Quality:** Performance may degrade on low-resolution or highly noisy images without pre-processing.
3. **Data Bias:** Although trained on ~400k augmented images, the model may be biased toward the specific handwriting styles present in the original 13k source dataset.

### Recommendations

Users should implement a pre-processing pipeline that segments words into individual characters before feeding them to this model. Images should be converted to grayscale and normalized to 128x128 pixels.

## Evaluation

### Testing Data, Factors & Metrics

#### Metrics

- **Accuracy:** the primary metric used for evaluation.

#### Inference Performance

- **Single-image inference:** 81% baseline accuracy.
- **Test-Time Augmentation (TTA):**
  - Configuration: 10 augmentations with majority voting.
  - Result: approximately **90% classification accuracy**.
  - Impact: significantly reduces error rates caused by handwriting variability.

## How to Get Started with the Model

Use the code below to get started with the model.

```python
import onnxruntime as ort
import numpy as np
from PIL import Image

# 1. Load the ONNX model
session = ort.InferenceSession("cnn_output.onnx")

# 2. Preprocess the input image
def preprocess_image(image_path):
    # Load the image and convert it to grayscale
    img = Image.open(image_path).convert("L")
    # Resize to 128x128
    img = img.resize((128, 128), Image.Resampling.LANCZOS)
    # Convert to a numpy array and normalize to [0, 1]
    img_array = np.array(img).astype("float32") / 255.0
    # Add batch and channel dimensions -> (1, 1, 128, 128)
    img_array = np.expand_dims(np.expand_dims(img_array, axis=0), axis=0)
    return img_array

input_data = preprocess_image("path/to/geez_char.jpg")
```
```python
# 3. Run inference
input_name = session.get_inputs()[0].name
output_name = session.get_outputs()[0].name
predictions = session.run([output_name], {input_name: input_data})[0]

# 4. Get the predicted class
predicted_class_index = np.argmax(predictions)
print(f"Predicted Class ID: {predicted_class_index}")
```
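The ~90% TTA figure reported above comes from running several augmented copies of the input through the model and majority-voting the predicted classes. A minimal sketch of that scheme is below; the function names (`random_shift`, `tta_predict`) and the choice of small random shifts as the augmentation are illustrative assumptions, not taken from the model repository, which may use a different augmentation set.

```python
import numpy as np

def random_shift(img, max_shift=4, rng=None):
    """Shift a (1, 1, H, W) image by a few random pixels (wrap-around via np.roll).

    NOTE: an assumed augmentation for illustration; the original TTA setup is unspecified.
    """
    if rng is None:
        rng = np.random.default_rng()
    dy, dx = rng.integers(-max_shift, max_shift + 1, size=2)
    return np.roll(img, shift=(int(dy), int(dx)), axis=(2, 3))

def tta_predict(run_fn, img, n_aug=10, rng=None):
    """Majority-vote the argmax class over n_aug randomly augmented copies.

    run_fn: any callable mapping a (1, 1, 128, 128) float32 array to class scores,
            e.g. lambda x: session.run([output_name], {input_name: x})[0]
    """
    if rng is None:
        rng = np.random.default_rng(0)
    votes = [int(np.argmax(run_fn(random_shift(img, rng=rng)))) for _ in range(n_aug)]
    return int(np.bincount(votes).argmax())
```

With the quickstart code above, `run_fn` would wrap the ONNX session, e.g. `tta_predict(lambda x: session.run([output_name], {input_name: x})[0], input_data)`.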