# Sanskrit OCR LoRA

A LoRA adapter for DeepSeek-OCR, fine-tuned for Sanskrit/Devanagari text recognition.

## Training
- Dataset: snskrt/Sanskrit_OCR_Parallel_Corpus (6,265 image-text pairs)
- Epochs: 2
- Final train loss: 4.70
- LoRA rank: 16, alpha: 16
- Target modules: q/k/v/o/gate/up/down projections
- Precision: bf16
- Hardware: NVIDIA A100 80GB (~2.5 hours)
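The hyperparameters above correspond to a PEFT `LoraConfig` along these lines (a sketch, not the exact contents of `train.py`; the dropout value and task type are assumptions):

```python
from peft import LoraConfig

# Sketch of the adapter configuration implied by the settings above.
# lora_dropout and task_type are assumptions, not taken from train.py.
lora_config = LoraConfig(
    r=16,            # LoRA rank
    lora_alpha=16,   # scaling factor (alpha)
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",  # attention projections
        "gate_proj", "up_proj", "down_proj",     # MLP projections
    ],
    lora_dropout=0.05,        # assumed; a common default
    task_type="CAUSAL_LM",
)
```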
## Usage

```bash
pip install torch transformers==4.45.0 peft accelerate easydict addict matplotlib einops
```
```python
from transformers import AutoModel, AutoProcessor
from peft import PeftModel

base_model_path = "deepseek_ocr"  # local path to DeepSeek-OCR
lora_path = "arpitingle/sanskrit-ocr-lora"

# Load the base model, then attach the LoRA adapter and merge its weights
model = AutoModel.from_pretrained(
    base_model_path, trust_remote_code=True, torch_dtype="auto", device_map="auto"
)
model = PeftModel.from_pretrained(model, lora_path)
model = model.merge_and_unload()
model.eval()

processor = AutoProcessor.from_pretrained(base_model_path, trust_remote_code=True)

# Run OCR on a single image
result = model.infer(
    processor,
    prompt="<image>\nFree OCR. ",
    image_file="path/to/image.jpg",
    output_path="./output",
    base_size=1024,
    image_size=640,
    crop_mode=True,
    save_results=False,
    test_compress=False,
    eval_mode=True,
)
print(result)
```
Note: `crop_mode=True` is required for good results.

## Files

- `adapter_model.safetensors` — LoRA weights
- `adapter_config.json` — LoRA config
- `inference.py` — inference script
- `train.py` — training script
## Known Issues

- The base model requires `transformers==4.45.0` (newer versions break the custom model code).
- `crop_mode=False` produces poor output; always use `crop_mode=True`.