DevGen TrOCR Devanagari LoRA Adapter

This repository contains the DevGen LoRA adapter for Devanagari OCR. It is designed to be loaded on top of paudelanil/trocr-devanagari-2 using the PEFT (Parameter-Efficient Fine-Tuning) library.

Model Details

Developed by: Manish Wagle / DevGen
Base model: paudelanil/trocr-devanagari-2
Adapter type: LoRA
Task: Image-to-text OCR for handwritten Devanagari words and short phrases.
Library: PEFT + Transformers

Intended Use

Use this adapter for high-precision recognition of Devanagari text from cropped handwritten word images. It is particularly optimized for the IIIT-INDIC-HW-WORDS distribution but generalizes well to other handwritten Devanagari styles.

This model is an OCR engine for word-level recognition. It does not perform page layout analysis, table extraction, or full-page segmentation.
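
Because the adapter expects word crops, full pages need to be cut into word images before recognition. The sketch below is a minimal illustration under assumptions: the bounding boxes come from a separate detector or manual annotation (not provided by this adapter), `pad_box` and `crop_words` are hypothetical helper names, and the 4-pixel padding is an arbitrary default. `page` is assumed to be a PIL image.

```python
from typing import Iterable, List, Tuple

# (left, upper, right, lower) in pixels, the box order PIL's Image.crop expects.
Box = Tuple[int, int, int, int]


def pad_box(box: Box, page_size: Tuple[int, int], padding: int = 4) -> Box:
    """Grow a box by `padding` pixels on every side, clamped to the page bounds."""
    left, upper, right, lower = box
    width, height = page_size
    return (
        max(0, left - padding),
        max(0, upper - padding),
        min(width, right + padding),
        min(height, lower + padding),
    )


def crop_words(page, boxes: Iterable[Box], padding: int = 4) -> List:
    """Cut one RGB word image per box out of a page image (hypothetical helper).

    The boxes are assumed to come from an external word detector.
    """
    return [
        page.crop(pad_box(box, page.size, padding)).convert("RGB")
        for box in boxes
    ]
```

Each returned crop can then be passed to the processor in the same way as a single word image.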

Loading and Inference

To use this model, install transformers, peft, torch, and Pillow (the example imports PIL to load the image).

from peft import PeftModel
from transformers import AutoTokenizer, TrOCRProcessor, ViTImageProcessor, VisionEncoderDecoderModel
import torch
from PIL import Image

base_model_id = "paudelanil/trocr-devanagari-2"
adapter_id = "manishw10/devgen-trocr-devanagari-lora"  # Hub repo ID, or a local path to the adapter

# Load processors
image_processor = ViTImageProcessor.from_pretrained(base_model_id)
tokenizer = AutoTokenizer.from_pretrained(base_model_id)
processor = TrOCRProcessor(image_processor=image_processor, tokenizer=tokenizer)

# Load model
device = "cuda" if torch.cuda.is_available() else "cpu"
base_model = VisionEncoderDecoderModel.from_pretrained(base_model_id)
model = PeftModel.from_pretrained(base_model, adapter_id)
model.to(device)

# Inference
image = Image.open("sample_handwritten_word.png").convert("RGB")
pixel_values = processor(image, return_tensors="pt").pixel_values.to(device)

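# Pass pixel_values by keyword; some PEFT wrappers forward only keyword arguments to generate().
generated_ids = model.generate(pixel_values=pixel_values)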
generated_text = processor.batch_decode(generated_ids, skip_special_tokens=True)[0]

print(f"Recognized Text: {generated_text}")
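
For more than one crop, the single-image steps above extend to batches. The sketch below is a minimal illustration, not part of the DevGen codebase: `chunked` and `recognize_words` are hypothetical helper names, and the `batch_size` and `max_new_tokens` defaults are assumed values that should be tuned. It expects the `model`, `processor`, and `device` objects from the loading example and a list of RGB PIL images.

```python
from typing import Iterator, List, Sequence


def chunked(items: Sequence, size: int) -> Iterator[Sequence]:
    """Yield consecutive slices of `items` holding at most `size` elements."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def recognize_words(model, processor, images: Sequence, device: str = "cpu",
                    batch_size: int = 8, max_new_tokens: int = 32) -> List[str]:
    """Recognize a list of word images batch by batch (hypothetical helper).

    batch_size and max_new_tokens are assumed defaults, not values
    published with the adapter.
    """
    texts: List[str] = []
    for batch in chunked(images, batch_size):
        # The image processor resizes and normalizes every crop to the same
        # shape, so a whole batch becomes a single pixel_values tensor.
        encoded = processor(images=list(batch), return_tensors="pt")
        pixel_values = encoded.pixel_values.to(device)
        generated_ids = model.generate(
            pixel_values=pixel_values, max_new_tokens=max_new_tokens
        )
        texts.extend(
            processor.batch_decode(generated_ids, skip_special_tokens=True)
        )
    return texts
```

A call might look like `recognize_words(model, processor, images, device=device)`, where `images` is a list built with `Image.open(path).convert("RGB")`. Smaller batches reduce memory use at the cost of throughput.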

Demo

A hosted Gradio demo and the full DevGen OCR Suite interface are available in the Hugging Face Space and in the DevGen project repository.

Limitations

Rotation: Sensitive to severe text rotation (beyond 15 degrees).
Noise: Performance may degrade on heavily blurred or low-contrast scans.
Script: Optimized for Devanagari; may not work for other Indic scripts without additional tuning.

Framework Versions

PEFT 0.19.1
Transformers 4.40.0+
PyTorch 2.1.0+
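
The versions above can be checked before loading the adapter. The following is a minimal sketch using only the standard library; `parse_release`, `meets_minimum`, and `check_environment` are hypothetical helper names, and treating the PEFT entry as a minimum rather than an exact pin is an assumption. Pre-release and local suffixes (such as `rc1` or `+cu121`) are ignored, which is a simplification.

```python
import re
from importlib.metadata import PackageNotFoundError, version

# Copied from the list above. Reading "PEFT 0.19.1" as a minimum is an assumption.
MINIMUM_VERSIONS = {"peft": "0.19.1", "transformers": "4.40.0", "torch": "2.1.0"}


def parse_release(text):
    """Return the leading numeric release of a version string as a tuple of ints."""
    match = re.match(r"\d+(?:\.\d+)*", text)
    if match is None:
        raise ValueError(f"no numeric release in {text!r}")
    return tuple(int(part) for part in match.group().split("."))


def meets_minimum(installed, minimum):
    """Compare two version strings by numeric release, padding with zeros."""
    a, b = parse_release(installed), parse_release(minimum)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def check_environment(minimums=MINIMUM_VERSIONS):
    """Map each package name to 'missing', 'too old (...)', or 'ok (...)'."""
    report = {}
    for package, minimum in minimums.items():
        try:
            installed = version(package)
        except PackageNotFoundError:
            report[package] = "missing"
            continue
        if meets_minimum(installed, minimum):
            report[package] = f"ok ({installed})"
        else:
            report[package] = f"too old ({installed} < {minimum})"
    return report


if __name__ == "__main__":
    for package, status in check_environment().items():
        print(f"{package}: {status}")
```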
