Join the conversation

Join the community of Machine Learners and AI enthusiasts.

Sign Up
pankajpandey-devย 
posted an update 1 day ago
Post
11352
๐Ÿ‡ฎ๐Ÿ‡ณ Qwen3-4B Hindi Instruct v2 โ€” a Hindi LLM that runs on your own machine
Most strong Hindi-capable models are either huge or cloud-only. I wanted one that's small enough to run locally but actually follows instructions in Hindi โ€” so I fine-tuned Qwen3-4B on 10K Hindi instruction pairs and shipped it with a full GGUF quant ladder.
โœ… Fine-tune (16-bit): huggingface.co/pankajpandey-dev/Qwen3-4B-Hindi-Instruct-v2
โœ… GGUF (Q4/Q5/Q8): huggingface.co/pankajpandey-dev/Qwen3-4B-Hindi-Instruct-v2-GGUF
Runs in Ollama, llama.cpp, and LM Studio. The Q4_K_M is just 2.5 GB โ€” fits comfortably on a laptop, CPU or GPU.
Part of my Hindi LLM Series โ€” building openly-licensed Indic models for local and edge use. More coming (Gemma next). Feedback welcome ๐Ÿ™
#Hindi #IndicNLP #GGUF #LocalLLM #Qwen

How good is the tool calling ?

ยท

Good question! v2 is fine-tuned on Hindi instruction pairs, so there's no tool-calling data in the training set โ€” the focus is Hindi instruction-following. That said, it's built on Qwen3-4B, which has native function-calling support, and since this is a LoRA fine-tune (base weights frozen) that capability should largely carry through. I haven't benchmarked tool calling specifically yet though, so I won't make hard claims. Tool calling is something I plan to evaluate in future iterations, and I'd be happy to hear feedback if anyone tests it.