lmms-lab/DocVQA
Viewer • Updated • 16.6k • 33.5k • 79
Fine-tuned vision-language model for Indic languages based on Qwen3-VL-4B-Instruct. This is the LoRA adapter that needs to be merged with the base model.
Trained on 4 datasets covering:
Assamese, Bengali, Bodo, Dogri, Gujarati, Hindi, Kannada, Kashmiri, Konkani, Maithili, Malayalam, Marathi, Manipuri, Nepali, Odia, Punjabi, Sanskrit, Santali, Sindhi, Tamil, Telugu, Urdu, English
from unsloth import FastVisionModel
model, tokenizer = FastVisionModel.from_pretrained(
"Qwen/Qwen3-VL-4B-Instruct",
load_in_4bit=True,
)
# Load LoRA adapter
model.load_adapter("mashriram/Sarvam-1-VL-4B-Instruct")
# Use for inference
Apache 2.0
If you use this model, please cite the original Qwen3-VL paper and the datasets used.
Base model
Qwen/Qwen3-VL-4B-Instruct