🔤 Ancient Manuscript OCR - CRNN Model

OCR system for ancient manuscript word images, built on a CRNN architecture.

Model Description

This model performs Optical Character Recognition (OCR) on ancient manuscript images using a Convolutional Recurrent Neural Network (CRNN) trained with Connectionist Temporal Classification (CTC) loss.

Key Achievements

🎯 98.49% Character Recognition Accuracy (validation set)
📊 0.61% Character Error Rate (CER) on both validation and test sets
📈 1.51% Word Error Rate (WER) (validation set)
⚡ 6.44 ms Average Inference Time per image
🔢 10.8M Parameters

Model Architecture

Input Image → CNN (7 layers) → BiLSTM (2 layers) → CTC Decoder → Text Output

Components:

CNN Backbone: 7 convolutional layers with 64, 128, 256, 256, 512, 512, and 512 channels
RNN: 2-layer Bidirectional LSTM with 256 hidden units
Decoder: CTC (Connectionist Temporal Classification)
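
The card lists CTC as the decoder but does not say whether decoding is greedy or uses beam search. As a minimal sketch, assuming greedy (best-path) decoding and a blank symbol at index 0, the per-frame labels coming out of the BiLSTM could be turned into text as follows. The `charset` and `frames` values are hypothetical and only illustrate the collapse-then-drop-blanks rule.

```python
from itertools import groupby

BLANK = 0  # assumed index of the CTC blank symbol


def ctc_greedy_decode(frame_ids, charset, blank=BLANK):
    """Collapse consecutive repeated labels, then drop blanks."""
    collapsed = [label for label, _ in groupby(frame_ids)]
    return "".join(charset[i] for i in collapsed if i != blank)


# Hypothetical 3-character alphabet; index 0 is reserved for the blank.
charset = ["", "a", "b", "c"]
# Hypothetical per-frame argmax labels (one label per time step).
frames = [0, 1, 1, 0, 1, 2, 2, 0, 0, 3]

decoded = ctc_greedy_decode(frames, charset)
print(decoded)  # aabc
```

The blank between the two runs of `1` is what lets the decoder emit a doubled character; without it, the repeated labels would collapse into a single "a".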

Training Data

Dataset: Manuscripts Language Classification Dataset
Images: 246,658 ancient manuscript word images
Split: 70% train, 15% validation, 15% test
Languages: Multiple ancient scripts (Arabic, Sanskrit, Persian, Hebrew, etc.)
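
The card gives only the split percentages. The sketch below shows one way a 70/15/15 split of 246,658 images could be produced. The shuffling, the seed, and the rounding rule (fractions rounded down, remainder assigned to the test set) are assumptions, not the author's documented procedure.

```python
import random


def split_indices(n_items, train_frac=0.70, val_frac=0.15, seed=42):
    """Shuffle item indices and cut them into train/validation/test parts."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)  # seeded for reproducibility
    n_train = int(n_items * train_frac)
    n_val = int(n_items * val_frac)
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]  # whatever remains
    return train, val, test


train_idx, val_idx, test_idx = split_indices(246_658)
print(len(train_idx), len(val_idx), len(test_idx))  # 172660 36998 37000
```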

Usage

Installation

pip install torch torchvision pillow

Quick Start

# ManuscriptOCR comes from the repository's inference module
from inference import ManuscriptOCR

# Load model
model = ManuscriptOCR(model_path='best_model.pth')

# Predict on image
text = model.predict('path/to/manuscript.jpg')
print(f"Recognized Text: {text}")

Batch Inference

# Process multiple images one at a time (reuses `model` from Quick Start)
images = ['manuscript1.jpg', 'manuscript2.jpg', 'manuscript3.jpg']
results = [model.predict(img) for img in images]

for img, text in zip(images, results):
    print(f"{img}: {text}")

Performance Metrics

Metric	Train	Validation	Test
Loss	0.0234	0.0187	0.0165
CER (%)	0.58	0.61	0.61
WER (%)	1.42	1.51	1.49
Accuracy (%)	98.51	98.49	98.52
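
The card does not say how CER and WER were computed. A common definition, sketched here as an assumption, divides the total Levenshtein edit distance by the total reference length, counted in characters for CER and in whitespace-separated words for WER. The sample strings are made up for illustration.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or lists)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def error_rate(refs, hyps, tokenize):
    """Total edit distance over total reference length, as a percentage."""
    errors = sum(edit_distance(tokenize(r), tokenize(h))
                 for r, h in zip(refs, hyps))
    total = sum(len(tokenize(r)) for r in refs)
    return 100.0 * errors / total


def cer(refs, hyps):
    return error_rate(refs, hyps, list)  # character tokens


def wer(refs, hyps):
    return error_rate(refs, hyps, str.split)  # word tokens


# Made-up ground truth and predictions.
refs = ["ancient text", "old script"]
hyps = ["ancient test", "old script"]
print(f"CER: {cer(refs, hyps):.2f}%  WER: {wer(refs, hyps):.2f}%")
```

Because a single wrong character makes the whole word wrong, WER is normally higher than CER on the same predictions, which matches the pattern in the table.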

Inference Performance:

Average inference time: 6.44 ms per image
Throughput: ~155 images/second
GPU Memory: ~2.1 GB
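
The throughput figure follows from the latency: 1000 ms divided by 6.44 ms per image gives about 155 images per second. The sketch below shows that arithmetic together with a simple way to measure mean latency. The `predict` argument is a stand-in for any callable such as `model.predict`, and the warm-up count is an assumption.

```python
import time


def mean_latency_ms(predict, inputs, warmup=2):
    """Average wall-clock time per call, in milliseconds."""
    for item in inputs[:warmup]:  # warm-up calls are not timed
        predict(item)
    start = time.perf_counter()
    for item in inputs:
        predict(item)
    elapsed = time.perf_counter() - start
    return 1000.0 * elapsed / len(inputs)


def throughput_per_second(latency_ms):
    """Images per second implied by a per-image latency."""
    return 1000.0 / latency_ms


print(round(throughput_per_second(6.44), 1))  # 155.3
```

When timing on a GPU, the device usually needs to be synchronized (for example with `torch.cuda.synchronize()`) before reading the clock; otherwise asynchronous kernel launches can make the measured latency look too small.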

Training Details

Hyperparameters

Optimizer: Adam (lr=0.001)
Scheduler: ReduceLROnPlateau
Batch Size: 64
Dropout: 0.2
Loss Function: CTC Loss
Hardware: NVIDIA Tesla T4 GPU

Data Augmentation

Random rotation (±10°)
Random brightness (±20%)
Random contrast (±20%)
Horizontal padding for variable widths
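
Horizontal padding lets word images of different widths share one batch. The card does not state the padding side or fill value, so the sketch below assumes right-padding with zeros. It also represents a grayscale image as a plain list of pixel rows to stay dependency-free; a real pipeline would more likely operate on tensors.

```python
def pad_to_width(image, target_width, fill=0):
    """Right-pad every row of a 2-D image (list of rows) to target_width."""
    return [row + [fill] * (target_width - len(row)) for row in image]


def pad_batch(images, fill=0):
    """Pad all images in a batch to the width of the widest one."""
    max_width = max(len(img[0]) for img in images)
    return [pad_to_width(img, max_width, fill) for img in images]


# Two hypothetical images of height 2, with widths 3 and 5.
narrow = [[1, 2, 3], [4, 5, 6]]
wide = [[7, 8, 9, 10, 11], [12, 13, 14, 15, 16]]

padded = pad_batch([narrow, wide])
print(padded[0])  # [[1, 2, 3, 0, 0], [4, 5, 6, 0, 0]]
```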

Limitations

Optimized for ancient manuscripts, not modern printed text
Performs best on images at least 32 px tall
Accuracy degrades on severely damaged manuscripts
Works best on scripts represented in the training data

Citation

@misc{manuscript-ocr-2025,
  author = {Shubham Patel},
  title = {Ancient Manuscript OCR using CRNN},
  year = {2025},
  publisher = {Hugging Face},
  url = {https://huggingface.co/cosmicshubham/ancient-manuscript-ocr}
}

License

MIT License

Contact

Author: Shubham Patel
GitHub: @CosmicShubham1
Repository: ancient-manuscript-ocr

Model ID: cosmicshubham/ancient-manuscript-ocr
Framework: PyTorch 2.0+
Created: January 2025
