Sanskrit OCR LoRA

LoRA adapter for DeepSeek-OCR, fine-tuned on Sanskrit/Devanagari text recognition.

Training

  • Dataset: snskrt/Sanskrit_OCR_Parallel_Corpus (6,265 image-text pairs)
  • Epochs: 2
  • Final train loss: 4.70
  • LoRA rank: 16, alpha: 16
  • Target modules: q/k/v/o/gate/up/down projections
  • Precision: bf16
  • Hardware: NVIDIA A100 80GB (~2.5 hours)
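The hyperparameters above map onto peft-style LoRA settings; a minimal sketch using peft's `LoraConfig` field names (the dict below is illustrative, assembled from the bullets above, and is not a copy of the shipped adapter_config.json):

```python
# LoRA hyperparameters from the Training section, expressed with the
# field names peft's LoraConfig uses. Illustrative only -- not read
# from the actual adapter_config.json in this repo.
lora_config = {
    "r": 16,                 # LoRA rank
    "lora_alpha": 16,        # scaling alpha (alpha / r = 1.0 here)
    "target_modules": [      # attention + MLP projections
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
}
print(lora_config["r"], lora_config["lora_alpha"])
```

With alpha equal to rank, the adapter updates are applied at a scale of 1.0, a common default when fine-tuning at rank 16.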

Usage

pip install torch transformers==4.45.0 peft accelerate easydict addict matplotlib einops

from transformers import AutoModel, AutoProcessor
from peft import PeftModel

base_model_path = "deepseek_ocr"  # local path to DeepSeek-OCR
lora_path = "arpitingle/sanskrit-ocr-lora"

model = AutoModel.from_pretrained(base_model_path, trust_remote_code=True, torch_dtype="auto", device_map="auto")
model = PeftModel.from_pretrained(model, lora_path)
model = model.merge_and_unload()
model.eval()

processor = AutoProcessor.from_pretrained(base_model_path, trust_remote_code=True)

result = model.infer(processor, prompt="<image>\nFree OCR. ",
    image_file="path/to/image.jpg", output_path="./output",
    base_size=1024, image_size=640, crop_mode=True,
    save_results=False, test_compress=False, eval_mode=True)

print(result)

Note: crop_mode=True is required for good results; see Known Issues below.

Files

  • adapter_model.safetensors — LoRA weights
  • adapter_config.json — LoRA config
  • inference.py — inference script
  • train.py — training script

Known Issues

  • The base model requires transformers==4.45.0; newer versions break its custom (trust_remote_code) model code
  • crop_mode=False produces poor output; always use crop_mode=True