@pankajpandey-dev on Hugging Face: "🇮🇳 Qwen3-4B Hindi Instruct v2 — a Hindi LLM that runs on your own machine…"

posted an update May 30

🇮🇳 Qwen3-4B Hindi Instruct v2 — a Hindi LLM that runs on your own machine
Most strong Hindi-capable models are either huge or cloud-only. I wanted one that's small enough to run locally but actually follows instructions in Hindi — so I fine-tuned Qwen3-4B on 10K Hindi instruction pairs and shipped it with a full GGUF quant ladder.
✅ Fine-tune (16-bit): huggingface.co/pankajpandey-dev/Qwen3-4B-Hindi-Instruct-v2
✅ GGUF (Q4/Q5/Q8): huggingface.co/pankajpandey-dev/Qwen3-4B-Hindi-Instruct-v2-GGUF
Runs in Ollama, llama.cpp, and LM Studio. The Q4_K_M is just 2.5 GB — fits comfortably on a laptop, CPU or GPU.
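As a rough sanity check on the 2.5 GB figure, here is a back-of-the-envelope size estimate. The parameter count (about 4.0 billion) and the bits-per-weight values are assumptions for illustration, not numbers from the post; real GGUF files also carry metadata and keep some tensors at higher precision, so actual sizes will differ somewhat.

```python
def gguf_size_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate weight size in GB (10^9 bytes), ignoring metadata and tokenizer."""
    return n_params * bits_per_weight / 8 / 1e9


# Assumed values (not from the post):
N_PARAMS = 4.0e9  # Qwen3-4B, rounded
BITS_PER_WEIGHT = {
    "Q4_K_M": 4.85,  # commonly quoted average for this mixed quant; approximate
    "Q8_0": 8.5,     # 8-bit weights plus one fp16 scale per 32-weight block
    "F16": 16.0,
}

for name, bpw in BITS_PER_WEIGHT.items():
    print(f"{name}: ~{gguf_size_gb(N_PARAMS, bpw):.1f} GB")
```

Under these assumptions the Q4_K_M estimate lands a little under the stated 2.5 GB, which is consistent with some overhead on top of the raw quantized weights.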
Part of my Hindi LLM Series — building openly-licensed Indic models for local and edge use. More coming (Gemma next). Feedback welcome 🙏
#Hindi #IndicNLP #GGUF #LocalLLM #Qwen

vedanshu93

May 31

How good is the tool calling?

pankajpandey-dev

May 31

Good question! v2 is fine-tuned on Hindi instruction pairs, so there's no tool-calling data in the training set — the focus is Hindi instruction-following. That said, it's built on Qwen3-4B, which has native function-calling support, and since this is a LoRA fine-tune (base weights frozen) that capability should largely carry through. I haven't benchmarked tool calling specifically yet though, so I won't make hard claims. Tool calling is something I plan to evaluate in future iterations, and I'd be happy to hear feedback if anyone tests it.
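For anyone who wants to test this, a minimal sketch of a checker for tool calls in the model's raw output. It assumes the model keeps the Hermes-style `<tool_call>{...}</tool_call>` JSON format used by the Qwen3 chat template; whether the fine-tune preserves that format is exactly what would need verifying. The function name `extract_tool_calls` and the `get_weather` tool are hypothetical.

```python
import json
import re

# Assumption: tool calls appear as JSON wrapped in <tool_call> tags.
TOOL_CALL_RE = re.compile(r"<tool_call>\s*(.*?)\s*</tool_call>", re.DOTALL)


def extract_tool_calls(text: str, known_tools: set) -> list:
    """Return well-formed tool calls found in raw model output.

    A call counts as well-formed if it parses as a JSON object, names a
    known tool, and carries its arguments as a JSON object.
    """
    calls = []
    for raw in TOOL_CALL_RE.findall(text):
        try:
            obj = json.loads(raw)
        except json.JSONDecodeError:
            continue  # malformed JSON: skip this candidate
        if (
            isinstance(obj, dict)
            and obj.get("name") in known_tools
            and isinstance(obj.get("arguments"), dict)
        ):
            calls.append(obj)
    return calls


# Hypothetical model output for a Hindi weather question.
sample = (
    "ठीक है, मैं मौसम देखता हूँ।\n"
    '<tool_call>\n{"name": "get_weather", "arguments": {"city": "Delhi"}}\n</tool_call>'
)
print(extract_tool_calls(sample, {"get_weather"}))
```

Running a set of Hindi prompts through both the base model and the fine-tune and comparing how often this returns a valid call would give a first rough signal of how much of the capability carried through.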
