🖋️ Qalam-Net (قلم-نت)

High-Performance, Cross-Backend Arabic OCR



🌟 Highlights

  • 🚀 Ultra-Fast Inference: Native JAX/XLA support for accelerated processing.
  • 🧩 Portable Architecture: Patched (v2) to resolve serialization issues across Keras versions.
  • 🎯 Precision Driven: CNN + BiLSTM + Self-Attention pipeline optimized for Arabic script.
  • 🔓 Unified Loading: No custom layers or complex setup required for inference.

📖 How it Works

The model processes Arabic text-line images through the following multi-stage pipeline:

graph LR
    A[Input Image 128x32] --> B[CNN Backbone]
    B --> C[Spatial Features]
    C --> D[Dual BiLSTM]
    D --> E[Self-Attention]
    E --> F[Softmax Output]
    F --> G[NumPy CTC Decoder]
    G --> H[Arabic Text]
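
To make the first stage of the diagram concrete, the sketch below traces the array shapes from a 128x32 grayscale image to the batch tensor the model receives. It uses a synthetic NumPy array in place of a real image, and the helper name `to_model_input` is hypothetical; the transpose and axis order follow the preprocessing in the Quick Start code.

```python
import numpy as np

def to_model_input(gray_image):
    """Turn a (height=32, width=128) grayscale array into a model batch.

    Hypothetical helper for illustration; it mirrors the Quick Start
    preprocessing but skips file loading and resizing.
    """
    if gray_image.shape != (32, 128):
        raise ValueError(f"expected shape (32, 128), got {gray_image.shape}")
    scaled = gray_image.astype(np.float32) / 255.0   # pixel values in [0, 1]
    width_first = scaled.T                            # (128, 32): width becomes the sequence axis
    return width_first[np.newaxis, ..., np.newaxis]   # (1, 128, 32, 1): batch and channel axes

# Synthetic stand-in for an image that was already resized to 128x32.
fake_image = np.random.default_rng(0).integers(0, 256, size=(32, 128), dtype=np.uint8)
batch = to_model_input(fake_image)
print(batch.shape, batch.dtype)
```

The width axis comes first after the transpose because the recurrent layers read the image column by column, left to right in array order.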

🚀 Quick Start (Robust Usage)

Use the following implementation to run inference with any of the supported Keras backends. Decoding is done with a NumPy-based greedy CTC decoder, so it does not depend on backend-specific CTC operations.

View Python Implementation
import os
os.environ["KERAS_BACKEND"] = "jax" # Options: "jax", "tensorflow", "torch"

import keras
import numpy as np
import cv2
from huggingface_hub import hf_hub_download

class QalamNet:
    def __init__(self, repo_id="Ali0044/Qalam-Net"):
        # 1. Download and Load Model
        print(f"Loading Qalam-Net from {repo_id}...")
        model_path = hf_hub_download(repo_id=repo_id, filename="model.keras")
        self.model = keras.saving.load_model(model_path)
        
        # 2. Define the exact 38-character Arabic Vocabulary
        # [ALIF, BA, TA, THA, JEEM, HAA, KHAA, DAL, THAL, RA, ZAY, SEEN, SHEEN, SAD, DAD, TAA, ZAA, AIN, GHAIN, FA, QAF, KAF, LAM, MEEM, NOON, HA, WAW, YA, TEH_MARBUTA, ALEF_MAKSURA, ALEF_HAMZA_ABOVE, ALEF_HAMZA_BELOW, ALEF_MADDA, WAW_HAMZA, YEH_HAMZA, HAMZA, SPACE, TATWEEL]
        self.vocab = ['ا', 'ب', 'ت', 'ث', 'ج', 'ح', 'خ', 'د', 'ذ', 'ر', 'ز', 'س', 'ش', 'ص', 'ض', 'ط', 'ظ', 'ع', 'غ', 'ف', 'ق', 'ك', 'ل', 'م', 'ن', 'ه', 'و', 'ي', 'ة', 'ى', 'أ', 'إ', 'آ', 'ؤ', 'ئ', 'ء', ' ', 'ـ']
        
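    def preprocess(self, image_path):
        img = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
        if img is None:  # cv2.imread returns None instead of raising on failure
            raise FileNotFoundError(f"Could not read image: {image_path}")
        img = cv2.resize(img, (128, 32)) / 255.0  # dsize is (width, height); array shape is (32, 128)
        img = img.T  # Transpose to (128, 32) so width is the sequence axis for the CRNN
        img = np.expand_dims(img, axis=(-1, 0))  # Add batch and channel axes: (1, 128, 32, 1)
        return img.astype(np.float32)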

    def predict(self, image_path):
        batch_img = self.preprocess(image_path)
        preds = self.model.predict(batch_img) # Output shape: (1, 32, 39)
        
        # 3. NumPy-based CTC Greedy Decoding (Cross-Backend)
        argmax_preds = np.argmax(preds, axis=-1)[0]
        
        # Remove consecutive duplicates
        unique_indices = [argmax_preds[i] for i in range(len(argmax_preds)) 
                          if i == 0 or argmax_preds[i] != argmax_preds[i-1]]
        
        # Remove blank index (index 38)
        blank_index = preds.shape[-1] - 1
        final_indices = [idx for idx in unique_indices if idx != blank_index]
        
        # Map to vocabulary
        return "".join([self.vocab[idx] for idx in final_indices if idx < len(self.vocab)])

# Usage
ocr = QalamNet()
print(f"Predicted Arabic Text: {ocr.predict('/content/images.png')}")
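
The decoding step inside `predict` can be tried without the model. The sketch below applies the same greedy CTC rule (take the argmax per time step, collapse consecutive repeats, drop the blank) to a hand-made probability matrix. The three-letter vocabulary and the function name `ctc_greedy_decode` are assumptions made for this example only.

```python
import numpy as np

def ctc_greedy_decode(probs, vocab):
    """Greedy CTC decoding for one sequence.

    probs: array of shape (time_steps, len(vocab) + 1); the last column is the blank.
    """
    blank = probs.shape[-1] - 1
    best = np.argmax(probs, axis=-1)
    # Keep a step only if it differs from the previous step (collapse repeats).
    keep = np.ones(len(best), dtype=bool)
    keep[1:] = best[1:] != best[:-1]
    collapsed = best[keep]
    # Drop blanks after collapsing, so "a, blank, a" still yields two letters.
    return "".join(vocab[i] for i in collapsed if i != blank)

toy_vocab = ["a", "b", "c"]          # toy alphabet; blank is index 3
path = [0, 0, 3, 0, 1, 1, 3, 2]      # per-step winners: a a _ a b b _ c
toy_probs = np.eye(4)[path]          # one-hot rows, shape (8, 4)
print(ctc_greedy_decode(toy_probs, toy_vocab))  # aabc
```

The order of the two steps matters: removing blanks before collapsing repeats would merge the two separate "a" characters into one.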

📊 Performance & Metrics

Training was conducted on the mssqpi/Arabic-OCR-Dataset over 50 epochs.

| Metric | Value |
|---|---|
| Input Shape | 128 x 32 x 1 (Grayscale) |
| Output Classes | 39 (38 Chars + 1 Blank) |
| Final Loss | ~13.13 |
| Validation Loss | ~89.79 |
| Framework | Keras 3.x (Native) |
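
The class count and the sequence length together bound what the model can output. The Quick Start code notes a prediction shape of (1, 32, 39), so each image yields 32 time steps. Under standard CTC, a string needs one time step per character plus one blank between each pair of identical neighbours. The sketch below checks whether a target string fits; `min_ctc_steps` and `fits` are hypothetical helper names, and 32 is taken from that shape comment rather than measured.

```python
def min_ctc_steps(text):
    """Smallest number of time steps a CTC path needs to spell `text`."""
    # Each pair of identical adjacent characters needs a blank between them.
    repeats = sum(1 for prev, cur in zip(text, text[1:]) if prev == cur)
    return len(text) + repeats

def fits(text, time_steps=32):
    """True if `text` can be emitted within `time_steps` CTC frames."""
    return min_ctc_steps(text) <= time_steps

print(min_ctc_steps("abc"))    # 3
print(min_ctc_steps("aab"))    # 4: a, blank, a, b
print(fits("x" * 17))          # False: 17 letters + 16 blanks = 33 > 32
```

In practice this means text lines much longer than about 30 characters cannot be fully transcribed from a single 128x32 input.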

📁 Dataset

This model was trained on the Arabic-OCR-Dataset provided by Muhammad AL-Qurishi (mssqpi).

  • Total Samples: ~2.16 million images.
  • Content: A massive collection of Arabic text lines in various fonts and styles.
  • Usage: Used for training the CRNN architecture to recognize sequential Arabic script.

🤝 Acknowledgments

Developed and maintained by Ali Khalid. This model is part of a comparative research study on Arabic OCR architectures.


Pro Tip: Use the JAX backend for the fastest inference times on modern CPUs and GPUs!
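
Keras 3 reads the `KERAS_BACKEND` environment variable when `keras` is first imported, so the backend has to be chosen before that import. To compare backends on your own hardware, a small timing harness is enough. The sketch below is one way to do it; `select_backend` and `time_call` are hypothetical helpers, and the lambda stands in for a real call such as `ocr.predict(path)`.

```python
import os
import time

def select_backend(name):
    """Set the Keras backend. Call this before `import keras`;
    changing the variable after the import has no effect."""
    if name not in ("jax", "tensorflow", "torch"):  # options listed in the Quick Start
        raise ValueError(f"unknown backend: {name}")
    os.environ["KERAS_BACKEND"] = name

def time_call(fn, repeats=5, warmup=1):
    """Return the best wall-clock time in seconds over `repeats` calls."""
    for _ in range(warmup):          # warm-up absorbs one-time costs such as compilation
        fn()
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        best = min(best, time.perf_counter() - start)
    return best

select_backend("jax")
# Stand-in workload; replace with a real inference call to benchmark the model.
print(f"{time_call(lambda: sum(range(10_000))):.6f} s")
```

Run the script once per backend in a fresh Python process, since a process keeps whichever backend was active when `keras` was imported.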
