---
base_model: paudelanil/trocr-devanagari-2
library_name: peft
license: mit
language:
- ne
- hi
pipeline_tag: image-to-text
tags:
- lora
- peft
- transformers
- trocr
- devanagari
- ocr
datasets:
- c3rl/IIIT-INDIC-HW-WORDS-Hindi
sdk: gradio
sdk_version: 3.50.2
python_version: "3.10"
app_file: app.py
pinned: true
---
# DevGen TrOCR Devanagari LoRA Adapter
This repository contains the **DevGen LoRA adapter** for Devanagari OCR. It is designed to be loaded on top of `paudelanil/trocr-devanagari-2` using the PEFT (Parameter-Efficient Fine-Tuning) library.
## Model Details
- **Developed by:** Manish Wagle / DevGen
- **Base model:** `paudelanil/trocr-devanagari-2`
- **Adapter type:** LoRA
- **Task:** Image-to-text OCR for Devanagari handwritten words and short phrases.
- **Library:** PEFT + Transformers
## Intended Use
Use this adapter for high-precision recognition of Devanagari text from cropped handwritten word images. It is particularly optimized for the **IIIT-INDIC-HW-WORDS** distribution but generalizes well to other handwritten Devanagari styles.
> [!IMPORTANT]
> This model is an OCR engine for word-level recognition. It does not perform page layout analysis, table extraction, or full-page segmentation.
## Loading and Inference
To use this adapter, install `transformers`, `peft`, `torch`, and `Pillow` (the example imports `PIL` to read the image).
```python
from peft import PeftModel
from transformers import AutoTokenizer, TrOCRProcessor, ViTImageProcessor, VisionEncoderDecoderModel
import torch
from PIL import Image
base_model_id = "paudelanil/trocr-devanagari-2"
adapter_id = "manishwagle/devgen-trocr-devanagari-lora" # Or local path
# Load processors
image_processor = ViTImageProcessor.from_pretrained(base_model_id)
tokenizer = AutoTokenizer.from_pretrained(base_model_id)
processor = TrOCRProcessor(image_processor=image_processor, tokenizer=tokenizer)
# Load model
device = "cuda" if torch.cuda.is_available() else "cpu"
base_model = VisionEncoderDecoderModel.from_pretrained(base_model_id)
model = PeftModel.from_pretrained(base_model, adapter_id)
model.to(device)
model.eval()  # switch off dropout (including LoRA dropout) for inference
# Inference
image = Image.open("sample_handwritten_word.png").convert("RGB")
pixel_values = processor(image, return_tensors="pt").pixel_values.to(device)
generated_ids = model.generate(pixel_values)
generated_text = processor.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(f"Recognized Text: {generated_text}")
```
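When many cropped words need to be recognized, running them through the model in small batches is usually faster than one call per image. The following is a minimal sketch, not part of the original project: `chunked` and `recognize_batch` are hypothetical helper names, and `batch_size=8` and `max_new_tokens=32` are assumed values that should be tuned for your hardware and word lengths. It expects the `model`, `processor`, and `device` objects created in the example above.

```python
from typing import Iterator, List, Sequence


def chunked(items: Sequence, size: int) -> Iterator[Sequence]:
    """Yield consecutive slices of `items` holding at most `size` elements."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def recognize_batch(model, processor, images, device,
                    batch_size: int = 8, max_new_tokens: int = 32) -> List[str]:
    """Run OCR over a list of RGB PIL images, `batch_size` images at a time.

    `batch_size` and `max_new_tokens` are illustrative defaults (assumptions).
    """
    model.eval()
    texts: List[str] = []
    for batch in chunked(images, batch_size):
        # The processor resizes and normalizes every image in the batch.
        inputs = processor(images=list(batch), return_tensors="pt")
        pixel_values = inputs.pixel_values.to(device)
        generated_ids = model.generate(pixel_values=pixel_values,
                                       max_new_tokens=max_new_tokens)
        texts.extend(processor.batch_decode(generated_ids,
                                            skip_special_tokens=True))
    return texts
```

A call might look like `recognize_batch(model, processor, [Image.open(p).convert("RGB") for p in paths], device)`, where `paths` is your own list of image files.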
## Demo
A hosted Gradio demo and the full DevGen OCR Suite interface are available in the [Hugging Face Space](https://huggingface.co/spaces/manishw10/devgen-devanagari-ocr) and in the project repository: [DevGen Project Repository](https://github.com/manishwagle/DevGen).
## Limitations
- **Rotation:** Accuracy is sensitive to severe text rotation (beyond 15 degrees).
- **Noise:** Performance may degrade on heavily blurred or low-contrast scans.
- **Script:** Optimized for Devanagari; may not work for other Indic scripts without additional tuning.
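The rotation and contrast limitations above can sometimes be softened with light preprocessing before the image reaches the processor. The sketch below is an illustration under several assumptions, not something this repository ships: `correction_angle` and `preprocess_word_image` are hypothetical names, the skew angle must come from your own estimate (no estimator is provided here), the background is assumed to be white, and the `cutoff=1` contrast setting is an arbitrary starting point. It does nothing for blur.

```python
def correction_angle(skew_degrees: float, tolerance: float = 15.0) -> float:
    """Return the rotation to apply in degrees, or 0.0 if the skew is tolerable.

    Pillow's Image.rotate turns the image counter-clockwise for positive
    angles, so a counter-clockwise skew is undone by rotating by its negative.
    The 15 degree default mirrors the rotation limit listed above.
    """
    if abs(skew_degrees) <= tolerance:
        return 0.0
    return -skew_degrees


def preprocess_word_image(image, skew_degrees: float = 0.0):
    """Stretch contrast and undo a known skew on a PIL image of one word."""
    # Imported lazily so correction_angle can be used without Pillow installed.
    from PIL import Image, ImageOps

    # Clip the darkest and brightest 1% of pixels, then rescale the rest.
    image = ImageOps.autocontrast(image.convert("RGB"), cutoff=1)
    angle = correction_angle(skew_degrees)
    if angle != 0.0:
        # expand=True keeps the whole word in frame; corners are filled white.
        image = image.rotate(angle, resample=Image.BICUBIC,
                             expand=True, fillcolor="white")
    return image
```

Skews within the tolerance are left alone on purpose, since resampling during rotation slightly softens strokes; lower `tolerance` if your images benefit from always being straightened.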
## Framework Versions
- PEFT 0.19.1
- Transformers 4.40.0+
- PyTorch 2.1.0+
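
To check an environment against the versions listed above, a small script along these lines may help. It is a sketch using only the standard library: the helper names are hypothetical, the comparison ignores pre-release and local suffixes such as `+cu121`, and treating the PEFT entry as a minimum rather than an exact pin is an assumption.

```python
import re
from importlib.metadata import PackageNotFoundError, version

# Minimums copied from the list above. Reading the PEFT entry as a minimum
# (rather than an exact pin) is an assumption.
MINIMUM_VERSIONS = {"peft": "0.19.1", "transformers": "4.40.0", "torch": "2.1.0"}


def parse_version(text: str) -> tuple:
    """Turn '2.1.0+cu121' into (2, 1, 0), ignoring local and pre-release tags."""
    numbers = []
    for piece in text.split("+")[0].split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        numbers.append(int(match.group()))
    while len(numbers) < 3:  # pad so that '2.1' compares equal to '2.1.0'
        numbers.append(0)
    return tuple(numbers)


def meets_minimum(installed: str, minimum: str) -> bool:
    """Return True if `installed` is at least `minimum`."""
    return parse_version(installed) >= parse_version(minimum)


def check_environment(minimums=MINIMUM_VERSIONS) -> dict:
    """Map each package name to True, False, or None (not installed)."""
    report = {}
    for name, minimum in minimums.items():
        try:
            report[name] = meets_minimum(version(name), minimum)
        except PackageNotFoundError:
            report[name] = None
    return report


if __name__ == "__main__":
    labels = {True: "ok", False: "too old", None: "not installed"}
    for name, status in check_environment().items():
        print(f"{name}: {labels[status]}")
```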