DevGen TrOCR Devanagari LoRA Adapter
This repository contains the DevGen LoRA adapter for Devanagari OCR. It is designed to be loaded on top of paudelanil/trocr-devanagari-2 using the PEFT (Parameter-Efficient Fine-Tuning) library.
Model Details
- Developed by: Manish Wagle / DevGen
- Base model: paudelanil/trocr-devanagari-2
- Adapter type: LoRA
- Task: Image-to-text OCR for Devanagari handwritten words and short phrases.
- Library: PEFT + Transformers
Intended Use
Use this adapter for high-precision recognition of Devanagari text from cropped handwritten word images. It is particularly optimized for the IIIT-INDIC-HW-WORDS distribution but generalizes well to other handwritten Devanagari styles.
This model is an OCR engine for word-level recognition. It does not perform page layout analysis, table extraction, or full-page segmentation.
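Because recognition is word-level, a full page has to be split into word crops by some other tool before it reaches this model. The following is a minimal sketch of that hand-off, assuming word bounding boxes are already available from an external detector or manual annotation. The names `pad_box` and `crop_words` and the 4-pixel margin are illustrative and not part of this repository.

```python
from typing import Iterable, List, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom) in pixels


def pad_box(box: Box, margin: int, width: int, height: int) -> Box:
    """Grow a box by `margin` pixels on every side, clamped to the page bounds."""
    left, top, right, bottom = box
    return (
        max(0, left - margin),
        max(0, top - margin),
        min(width, right + margin),
        min(height, bottom + margin),
    )


def crop_words(page, boxes: Iterable[Box], margin: int = 4) -> List:
    """Return one cropped image per bounding box.

    `page` is expected to be a PIL.Image.Image (any object exposing `.size`
    and `.crop((left, top, right, bottom))` works). The boxes must come from
    an external source; this adapter does not produce them.
    """
    width, height = page.size
    return [page.crop(pad_box(box, margin, width, height)) for box in boxes]
```

Each returned crop can then be converted to RGB and passed through the processor and `model.generate` exactly like the single image in the Loading and Inference section.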
Loading and Inference
To use this model, you need transformers, peft, torch, and Pillow installed (the example imports PIL to open the image).
from peft import PeftModel
from transformers import AutoTokenizer, TrOCRProcessor, ViTImageProcessor, VisionEncoderDecoderModel
import torch
from PIL import Image
base_model_id = "paudelanil/trocr-devanagari-2"
adapter_id = "manishw10/devgen-trocr-devanagari-lora"  # Or a local path to the adapter
# Load processors
image_processor = ViTImageProcessor.from_pretrained(base_model_id)
tokenizer = AutoTokenizer.from_pretrained(base_model_id)
processor = TrOCRProcessor(image_processor=image_processor, tokenizer=tokenizer)
# Load model
device = "cuda" if torch.cuda.is_available() else "cpu"
base_model = VisionEncoderDecoderModel.from_pretrained(base_model_id)
model = PeftModel.from_pretrained(base_model, adapter_id)
model.to(device)
# Inference
image = Image.open("sample_handwritten_word.png").convert("RGB")
pixel_values = processor(image, return_tensors="pt").pixel_values.to(device)
generated_ids = model.generate(pixel_values=pixel_values)
generated_text = processor.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(f"Recognized Text: {generated_text}")
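For more than one word image, the same steps can be run in batches. The following is a sketch rather than the author's pipeline: it assumes `model` and `processor` are the objects created above, and the helper names `chunks` and `recognize_batch`, the batch size of 8, and the `max_new_tokens` cap of 32 are illustrative choices.

```python
from typing import Iterator, List, Sequence


def chunks(items: Sequence, size: int) -> Iterator[Sequence]:
    """Yield consecutive slices of `items` with at most `size` elements each."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def recognize_batch(
    model,
    processor,
    images: Sequence,
    device: str = "cpu",
    batch_size: int = 8,       # assumed value; tune to available memory
    max_new_tokens: int = 32,  # assumed cap for a single word or short phrase
) -> List[str]:
    """Run OCR over a sequence of RGB PIL images and return one string per image."""
    texts: List[str] = []
    for batch in chunks(list(images), batch_size):
        # The processor resizes and normalizes every image in the batch.
        pixel_values = processor(images=batch, return_tensors="pt").pixel_values.to(device)
        generated_ids = model.generate(
            pixel_values=pixel_values, max_new_tokens=max_new_tokens
        )
        texts.extend(processor.batch_decode(generated_ids, skip_special_tokens=True))
    return texts
```

A call such as `recognize_batch(model, processor, [Image.open(p).convert("RGB") for p in paths], device=device)` returns the texts in the same order as the input paths, where `paths` is a hypothetical list of image files.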
Demo
A hosted Gradio demo and the full DevGen OCR Suite interface are available in the Hugging Face Space and in the DevGen Project Repository.
Limitations
- Rotation: Accuracy is sensitive to severe text rotation (beyond 15 degrees).
- Noise: Performance may degrade on heavily blurred or low-contrast scans.
- Script: Optimized for Devanagari; may not work for other Indic scripts without additional tuning.
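Low-contrast inputs can be flagged before they reach the model. The following is a rough screening heuristic, not something shipped with this adapter: it measures the spread between the 5th and 95th percentile gray levels of a crop. The function names and the `min_spread` threshold of 40 are assumptions that would need tuning on real scans.

```python
from typing import Sequence


def percentile_level(hist: Sequence[int], fraction: float) -> int:
    """Return the lowest gray level whose cumulative count reaches `fraction` of all pixels."""
    target = fraction * sum(hist)
    running = 0
    for level, count in enumerate(hist):
        running += count
        if running >= target:
            return level
    return len(hist) - 1


def contrast_spread(image, low: float = 0.05, high: float = 0.95) -> int:
    """Gray-level distance between the `low` and `high` percentiles.

    `image` is expected to be a PIL.Image.Image; `convert("L")` and
    `histogram()` are standard Pillow methods.
    """
    hist = image.convert("L").histogram()
    return percentile_level(hist, high) - percentile_level(hist, low)


def is_low_contrast(image, min_spread: int = 40) -> bool:
    """Flag crops whose gray-level spread falls below an assumed threshold."""
    return contrast_spread(image) < min_spread
```

Flagged crops could be routed to manual review or contrast enhancement instead of being sent straight to OCR.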
Framework Versions
- PEFT 0.19.1
- Transformers 4.40.0+
- PyTorch 2.1.0+
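The versions listed above can be checked against the local environment before loading the adapter. The following is a small sketch using only the standard library. It treats the PEFT entry as a minimum rather than an exact pin, which is an assumption, and `parse_version` is a deliberately simple helper that reads only the leading numeric release parts.

```python
from importlib import metadata
from typing import Dict, Optional, Tuple

# Minimums copied from the Framework Versions list; the PEFT entry is listed
# without a "+", so treating it as a minimum is an assumption.
MINIMUMS = {"peft": "0.19.1", "transformers": "4.40.0", "torch": "2.1.0"}


def parse_version(text: str) -> Tuple[int, ...]:
    """Parse leading numeric release parts, e.g. '2.1.0+cu121' -> (2, 1, 0)."""
    parts = []
    for piece in text.split("+")[0].split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def check_versions(
    minimums: Dict[str, str] = MINIMUMS,
) -> Dict[str, Tuple[Optional[str], bool]]:
    """Map each package to (installed version or None, meets minimum)."""
    report = {}
    for name, minimum in minimums.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = (None, False)
            continue
        report[name] = (installed, parse_version(installed) >= parse_version(minimum))
    return report
```

Calling `check_versions()` returns a dictionary that can be printed or asserted on at the start of a script.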