TrOCR Sanskrit OCR
Fine-tuned TrOCR model for Sanskrit manuscript OCR (Optical Character Recognition).
Model Description
This model extracts Sanskrit text from manuscript images. It outputs text in IAST (International Alphabet of Sanskrit Transliteration) format.
- Base Model: microsoft/trocr-base-printed
- Fine-tuned on: yzk/veda-ocr-ms (~11.7k Sanskrit manuscript images)
- Training: 2 epochs on Apple M1 Pro (MPS)
Usage
from transformers import TrOCRProcessor, VisionEncoderDecoderModel
from PIL import Image
# Load model
processor = TrOCRProcessor.from_pretrained("Piyush3142/trocr-sanskrit-ocr")
model = VisionEncoderDecoderModel.from_pretrained("Piyush3142/trocr-sanskrit-ocr")
# OCR inference
image = Image.open("sanskrit_manuscript.jpg").convert("RGB")
pixel_values = processor(image, return_tensors="pt").pixel_values
outputs = model.generate(pixel_values, max_length=256)
text = processor.batch_decode(outputs, skip_special_tokens=True)[0]
print(text) # IAST output
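For more than one image, the same calls can be wrapped in a small batching loop. The following is a minimal sketch, not part of the original model card: the helper names `chunked` and `ocr_images` are hypothetical, and the default batch size of 4 simply mirrors the training batch size rather than being a tuned value. It assumes `processor` and `model` were loaded as shown above.

```python
from typing import Iterator, List, Sequence


def chunked(items: Sequence[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of `items` with at most `size` elements each."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


def ocr_images(paths, processor, model, batch_size=4, max_length=256):
    """Run OCR over several image files and return one decoded string per file.

    Hypothetical helper: `processor` and `model` are the objects loaded above.
    """
    # Heavy dependencies are imported lazily so `chunked` stays usable alone.
    import torch
    from PIL import Image

    texts = []
    for batch in chunked(paths, batch_size):
        images = [Image.open(p).convert("RGB") for p in batch]
        pixel_values = processor(images=images, return_tensors="pt").pixel_values
        with torch.no_grad():  # inference only, no gradients needed
            outputs = model.generate(
                pixel_values.to(model.device), max_length=max_length
            )
        texts.extend(processor.batch_decode(outputs, skip_special_tokens=True))
    return texts
```

Calling `ocr_images(["page1.jpg", "page2.jpg"], processor, model)` would return a list of two IAST strings in the same order as the input paths; the file names here are placeholders.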
Convert IAST to Devanagari
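# Requires the indic-transliteration package: pip install indic-transliteration
from indic_transliteration.sanscript import transliterate, IAST, DEVANAGARI
# `text` is the IAST string produced by the Usage snippet above
devanagari = transliterate(text, IAST, DEVANAGARI)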
print(devanagari)
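IAST relies on diacritics, which can be encoded either as single precomposed characters or as a base letter plus a combining mark. Whether this model ever emits the decomposed form is not stated here, so the following is only a precautionary sketch: it normalizes the decoded string to Unicode NFC and collapses stray whitespace before transliteration. The function name `normalize_iast` is hypothetical.

```python
import unicodedata


def normalize_iast(text: str) -> str:
    """Compose diacritics (NFC) and collapse runs of whitespace to single spaces."""
    composed = unicodedata.normalize("NFC", text)
    return " ".join(composed.split())
```

It would be applied just before conversion, for example `transliterate(normalize_iast(text), IAST, DEVANAGARI)`.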
Training Details
| Parameter | Value |
|---|---|
| Dataset | yzk/veda-ocr-ms |
| Train samples | 10,560 |
| Test samples | 1,174 |
| Epochs | 2 |
| Batch size | 4 |
| Learning rate | 2e-5 |
| CER (character error rate) | ~68% |
| WER (word error rate) | ~80% |
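The card does not say how the two error rates in the table were computed. As an illustration only, the sketch below uses the standard definitions: Levenshtein edit distance divided by the reference length, counted in characters for CER and in whitespace-separated words for WER. The function names are hypothetical, and the actual evaluation may have used a different tool or normalization.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or lists)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate: character edits / reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word edits / number of reference words."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)
```

Under these definitions both rates can exceed 100% when the hypothesis is much longer than the reference, and a single wrong diacritic counts as one character error but makes the whole word wrong, which is one reason WER is typically higher than CER.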
Limitations
- Trained on printed Vedic manuscripts
- Performs best on texts in a similar style to the training data
- May struggle with handwritten or degraded images
License
MIT