---
license: apache-2.0
datasets:
- Yaredoffice/geez-characters
language:
- am
- ti
pipeline_tag: image-classification
library_name: keras
tags:
- geez
- characters
- ocr
- geez ocr
- amharic
- tigrinya
- onnx
---

# Geez Character OCR (Geez-Net)

This model is a high-performance Optical Character Recognition (OCR) system designed for the **Geez script** (Amharic, Tigrinya). It uses a Convolutional Neural Network (CNN) to classify individual handwritten Geez characters from images with high accuracy.

## Model Details

### Model Description

This model addresses the challenge of digital recognition for the Geez script with a deep CNN architecture. It accepts a single character image as input and outputs one of 287 possible character classes. It has been optimized for web deployment with the ONNX Runtime.

- **Developed by:** Yared Kassa
- **Shared by:** Yared Kassa
- **Model type:** Convolutional Neural Network (CNN) for Image Classification
- **Language(s):** Amharic, Tigrinya (Geez Script)
- **License:** apache-2.0
- **Finetuned from model:** Trained from scratch

### Model Sources

- **Repository:** [Yaredoffice/geez-characters](https://huggingface.co/Yaredoffice/geez-characters)

## Uses

### Direct Use

The model is intended for direct use in digitizing handwritten Geez documents, educational language-learning tools, and automated data-entry systems. Users input a cropped image of a handwritten character, and the model returns the predicted character class and a confidence score.

### Downstream Use

N/A (this is a standalone classification model).

### Out-of-Scope Use

The model is **not** designed for:

- Full document OCR (it does not perform word segmentation or layout analysis).
- Recognition of non-Geez scripts (Latin, Arabic, etc.).
- Recognition of cursive or heavily stylized fonts not present in the training data.

## Bias, Risks, and Limitations

### Limitations
1. **Single Character Input:** The model requires pre-segmented single-character images. It cannot process whole words or sentences directly.
2. **Input Quality:** Performance may degrade on low-resolution or highly noisy images without pre-processing.
3. **Data Bias:** Although trained on ~400k augmented images, the model may be biased toward the specific handwriting styles present in the original 13k source dataset.

### Recommendations

Users should implement a pre-processing pipeline that segments words into individual characters before feeding them to this model. Images should be converted to grayscale and normalized to 128x128 pixels.

## Evaluation

### Testing Data, Factors & Metrics

#### Metrics

- **Accuracy:** the primary metric used for evaluation.

#### Inference Performance

- **Single-image inference:** 81% baseline accuracy.
- **Test-Time Augmentation (TTA):**
  - Configuration: 10 augmentations with majority voting.
  - Result: approximately **90% classification accuracy**.
  - Impact: significantly reduces error rates caused by handwriting variability.

## How to Get Started with the Model

Use the code below to get started with the model.

```python
import onnxruntime as ort
import numpy as np
from PIL import Image

# 1. Load the ONNX model
session = ort.InferenceSession("cnn_output.onnx")

# 2. Preprocess the input image
def preprocess_image(image_path):
    # Load the image and convert it to grayscale
    img = Image.open(image_path).convert("L")
    # Resize to 128x128
    img = img.resize((128, 128), Image.Resampling.LANCZOS)
    # Convert to a numpy array and normalize to [0, 1]
    img_array = np.array(img).astype("float32") / 255.0
    # Add batch and channel dimensions -> (1, 1, 128, 128)
    img_array = np.expand_dims(np.expand_dims(img_array, axis=0), axis=0)
    return img_array

input_data = preprocess_image("path/to/geez_char.jpg")
```
```python
# 3. Run inference
input_name = session.get_inputs()[0].name
output_name = session.get_outputs()[0].name
predictions = session.run([output_name], {input_name: input_data})[0]

# 4. Get the predicted class
predicted_class_index = np.argmax(predictions)
print(f"Predicted Class ID: {predicted_class_index}")
```
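The ~90% TTA figure reported above comes from running several augmented copies of the input through the model and majority-voting the predicted classes. A minimal sketch of that scheme is below; the function names (`random_shift`, `tta_predict`) and the choice of small random shifts as the augmentation are illustrative assumptions, not taken from the model repository, which may use a different augmentation set.

```python
import numpy as np

def random_shift(img, max_shift=4, rng=None):
    """Shift a (1, 1, H, W) image by a few random pixels (wrap-around via np.roll).

    NOTE: an assumed augmentation for illustration; the original TTA setup is unspecified.
    """
    if rng is None:
        rng = np.random.default_rng()
    dy, dx = rng.integers(-max_shift, max_shift + 1, size=2)
    return np.roll(img, shift=(int(dy), int(dx)), axis=(2, 3))

def tta_predict(run_fn, img, n_aug=10, rng=None):
    """Majority-vote the argmax class over n_aug randomly augmented copies.

    run_fn: any callable mapping a (1, 1, 128, 128) float32 array to class scores,
            e.g. lambda x: session.run([output_name], {input_name: x})[0]
    """
    if rng is None:
        rng = np.random.default_rng(0)
    votes = [int(np.argmax(run_fn(random_shift(img, rng=rng)))) for _ in range(n_aug)]
    return int(np.bincount(votes).argmax())
```

With the quickstart code above, `run_fn` would wrap the ONNX session, e.g. `tta_predict(lambda x: session.run([output_name], {input_name: x})[0], input_data)`.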