---
base_model: paudelanil/trocr-devanagari-2
library_name: peft
license: mit
language:
- ne
- hi
pipeline_tag: image-to-text
tags:
- lora
- peft
- transformers
- trocr
- devanagari
- ocr
datasets:
- c3rl/IIIT-INDIC-HW-WORDS-Hindi
sdk: gradio
sdk_version: 3.50.2
python_version: "3.10"
app_file: app.py
pinned: true
---

# DevGen TrOCR Devanagari LoRA Adapter

This repository contains the **DevGen LoRA adapter** for Devanagari OCR. It is designed to be loaded on top of `paudelanil/trocr-devanagari-2` using the PEFT (Parameter-Efficient Fine-Tuning) library.

## Model Details

- **Developed by:** Manish Wagle / DevGen
- **Base model:** `paudelanil/trocr-devanagari-2`
- **Adapter type:** LoRA
- **Task:** Image-to-text OCR for Devanagari handwritten words and short phrases.
- **Library:** PEFT + Transformers

## Intended Use

Use this adapter for high-precision recognition of Devanagari text from cropped handwritten word images. It is particularly optimized for the **IIIT-INDIC-HW-WORDS** distribution but generalizes well to other handwritten Devanagari styles.

> [!IMPORTANT]
> This model is an OCR engine for word-level recognition. It does not perform page layout analysis, table extraction, or full-page segmentation.

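Because the adapter expects one cropped word per image, a page has to be split into word crops by some upstream step. The following is a minimal sketch of that hand-off, assuming word bounding boxes are already available from a separate detector that is not part of this repository. The names `pad_and_clamp_box` and `word_crop_boxes`, and the `padding` default, are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom) in pixels


def pad_and_clamp_box(box: Box, image_size: Tuple[int, int], padding: int = 4) -> Box:
    """Grow a word box by `padding` pixels and keep it inside the image."""
    left, top, right, bottom = box
    width, height = image_size
    return (
        max(0, left - padding),
        max(0, top - padding),
        min(width, right + padding),
        min(height, bottom + padding),
    )


def word_crop_boxes(boxes: List[Box], image_size: Tuple[int, int], padding: int = 4) -> List[Box]:
    """Return padded crop boxes, skipping any that are empty after clamping."""
    crops = []
    for box in boxes:
        left, top, right, bottom = pad_and_clamp_box(box, image_size, padding)
        if right > left and bottom > top:
            crops.append((left, top, right, bottom))
    return crops
```

Each returned tuple can be passed to Pillow's `Image.crop`, and the resulting crop can then be converted to RGB and given to the processor as in the Loading and Inference section.
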
## Loading and Inference

To use this adapter, install `transformers`, `peft`, `torch`, and `Pillow` (imported as `PIL` in the example).

```python
from peft import PeftModel
from transformers import AutoTokenizer, TrOCRProcessor, ViTImageProcessor, VisionEncoderDecoderModel
import torch
from PIL import Image

base_model_id = "paudelanil/trocr-devanagari-2"
adapter_id = "manishwagle/devgen-trocr-devanagari-lora" # Or local path

# Load processors
image_processor = ViTImageProcessor.from_pretrained(base_model_id)
tokenizer = AutoTokenizer.from_pretrained(base_model_id)
processor = TrOCRProcessor(image_processor=image_processor, tokenizer=tokenizer)

# Load model
device = "cuda" if torch.cuda.is_available() else "cpu"
base_model = VisionEncoderDecoderModel.from_pretrained(base_model_id)
model = PeftModel.from_pretrained(base_model, adapter_id)
model.to(device)

# Inference
image = Image.open("sample_handwritten_word.png").convert("RGB")
pixel_values = processor(image, return_tensors="pt").pixel_values.to(device)

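# Pass pixel_values by keyword; some PEFT task wrappers accept only keyword arguments in generate()
generated_ids = model.generate(pixel_values=pixel_values)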
generated_text = processor.batch_decode(generated_ids, skip_special_tokens=True)[0]

print(f"Recognized Text: {generated_text}")
```

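For more than one image, the single-image example above can be extended to batches. The sketch below assumes `model`, `processor`, and `device` were created as in that example. The helper names `chunked` and `recognize_batch`, and the `batch_size` and `max_new_tokens` defaults, are illustrative choices and not values taken from this repository.

```python
from typing import Any, Iterator, List, Sequence


def chunked(items: Sequence[Any], size: int) -> Iterator[Sequence[Any]]:
    """Yield consecutive slices of `items` with at most `size` elements."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def recognize_batch(
    images: Sequence[Any],
    model: Any,
    processor: Any,
    device: str = "cpu",
    batch_size: int = 8,        # assumed default, tune for available memory
    max_new_tokens: int = 32,   # assumed cap for word-level output
) -> List[str]:
    """Run OCR over a list of RGB word images, one batch at a time."""
    texts: List[str] = []
    for batch in chunked(images, batch_size):
        # The processor resizes and normalizes every image in the batch.
        inputs = processor(images=list(batch), return_tensors="pt")
        pixel_values = inputs.pixel_values.to(device)
        generated_ids = model.generate(
            pixel_values=pixel_values, max_new_tokens=max_new_tokens
        )
        texts.extend(processor.batch_decode(generated_ids, skip_special_tokens=True))
    return texts
```

A call such as `recognize_batch(images, model, processor, device=device)` returns one string per input image, in input order.
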
## Demo

A hosted Gradio demo and the full DevGen OCR Suite interface are available in the [Hugging Face Space](https://huggingface.co/spaces/manishw10/devgen-devanagari-ocr) and in the [DevGen Project Repository](https://github.com/manishwagle/DevGen).

## Limitations

- **Rotation:** The model is sensitive to severe text rotation (beyond 15 degrees).
- **Noise:** Performance may degrade on heavily blurred or low-contrast scans.
- **Script:** Optimized for Devanagari; may not work for other Indic scripts without additional tuning.

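One way to act on the noise limitation is to flag low-contrast inputs before running OCR. The following is a rough sketch under stated assumptions: it looks only at contrast (not blur or rotation), the `threshold` of 0.1 is an arbitrary illustrative value that has not been validated against this model, and `percentile` and `is_low_contrast` are hypothetical helper names.

```python
from typing import Sequence


def percentile(sorted_values: Sequence[int], fraction: float) -> int:
    """Approximate percentile: pick the value at the rounded index."""
    index = round(fraction * (len(sorted_values) - 1))
    return sorted_values[index]


def is_low_contrast(gray_pixels: Sequence[int], threshold: float = 0.1) -> bool:
    """Flag images whose 1st-to-99th percentile spread covers little of 0..255."""
    if not gray_pixels:
        raise ValueError("gray_pixels must not be empty")
    ordered = sorted(gray_pixels)
    spread = percentile(ordered, 0.99) - percentile(ordered, 0.01)
    return spread / 255 < threshold
```

With Pillow, 8-bit grayscale values can be obtained with `image.convert("L").tobytes()`, which can be passed directly as `gray_pixels`. Flagged images could be rescanned or contrast-enhanced before recognition.
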
## Framework Versions

- PEFT 0.19.1
- Transformers 4.40.0+
- PyTorch 2.1.0+