# 🇮🇳 Hinglish Qwen2.5-7B Instruct — LoRA Fine-Tune
A bilingual (Hindi-English / Hinglish) fine-tune of Qwen2.5-7B-Instruct using QLoRA + Unsloth, designed to fluently converse in Hinglish — the natural code-mixed language spoken by 350M+ people across India.
## ⚡ Key Results
| Metric | Baseline | Fine-tuned | Change |
|---|---|---|---|
| Eval Loss | 2.57 | 0.90 | ↓ 65% |
| Training Loss | 1.63 | 0.25 | — |
| Hinglish Response Rate | 10% | 80% | ↑ 8× |
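The headline numbers in the table are easy to sanity-check; the snippet below just redoes the arithmetic from the reported values:

```python
# Eval loss: 2.57 -> 0.90
baseline_eval, finetuned_eval = 2.57, 0.90
drop = (baseline_eval - finetuned_eval) / baseline_eval
print(f"Eval loss reduction: {drop:.0%}")  # ≈ 65%

# Hinglish response rate: 10% -> 80%
baseline_rate, finetuned_rate = 0.10, 0.80
print(f"Hinglish response rate: {finetuned_rate / baseline_rate:.0f}x")  # 8x
```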
## 🔍 Before vs After

| Prompt | Baseline (Qwen2.5-7B) | Fine-tuned (Ours) |
|---|---|---|
| "Yaar, Python seekhni hai" | Pure English bullet points, broken numbering | Natural Hinglish advice like a friend would give |
| "Ghar pe pizza banana sikhao" | Textbook English recipe format | Casual Hinglish with Indian context ("Domino's se bhi better!") |
| "Stock market invest karna chahiye?" | Generic English financial advice | Practical Hinglish with real talk ("Loan lekar mat karna!") |
## Model Details
| Field | Value |
|---|---|
| Base Model | Qwen2.5-7B-Instruct (4-bit NF4) |
| Method | QLoRA — LoRA adapters on quantized base |
| Framework | Unsloth + HuggingFace TRL SFTTrainer |
| Hardware | NVIDIA RTX 3060 (12GB VRAM) |
| Precision | BF16 mixed precision |
| LoRA Config | r=32, α=64, RSLoRA, 7 target modules |
| Trainable Params | 80.7M / 7.7B (1.05%) |
| Training Time | ~2 hours, 675 steps, 3 epochs |
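The 80.7M trainable-parameter figure can be reproduced from the architecture. The sketch below assumes the seven target modules are the standard attention and MLP projections (q/k/v/o/gate/up/down) and uses the published Qwen2.5-7B dimensions; both are assumptions, as the card does not list them explicitly:

```python
# Qwen2.5-7B dimensions (from the public model config; assumed here)
hidden = 3584   # hidden_size
kv = 512        # num_key_value_heads (4) * head_dim (128), GQA
inter = 18944   # intermediate_size
layers = 28     # num_hidden_layers
r = 32          # LoRA rank, as in the table above

# Assumed 7 target modules, as (in_features, out_features)
modules = {
    "q_proj": (hidden, hidden),
    "k_proj": (hidden, kv),
    "v_proj": (hidden, kv),
    "o_proj": (hidden, hidden),
    "gate_proj": (hidden, inter),
    "up_proj": (hidden, inter),
    "down_proj": (inter, hidden),
}

# Each LoRA adapter adds an A (r x in) and a B (out x r) matrix
per_layer = sum(r * (d_in + d_out) for d_in, d_out in modules.values())
total = per_layer * layers
print(f"Trainable LoRA params: {total / 1e6:.1f}M")  # 80.7M, matching the card
```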
## Training
| Parameter | Value |
|---|---|
| Epochs | 3 (675 total steps) |
| Effective Batch Size | 16 (1 × 16 grad accum) |
| Learning Rate | 2e-4 → cosine decay |
| Warmup | 50 steps |
| Optimizer | AdamW 8-bit |
| Eval Strategy | Every 100 steps |
| Best Checkpoint | Step 400 (eval loss 0.899) |
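The schedule above (peak 2e-4, 50 warmup steps, cosine decay over 675 steps) can be sketched as a plain function; decaying all the way to zero is an assumption, since the card does not state a learning-rate floor:

```python
import math

def lr_at(step, peak_lr=2e-4, warmup=50, total=675):
    """Linear warmup to peak_lr, then cosine decay toward zero."""
    if step < warmup:
        return peak_lr * step / warmup
    progress = (step - warmup) / (total - warmup)
    return peak_lr * 0.5 * (1 + math.cos(math.pi * progress))

print(lr_at(25))   # mid-warmup: 1e-4
print(lr_at(50))   # peak: 2e-4
print(lr_at(675))  # end of training: ~0
```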
## Loss Curve

*(Loss-curve figure not reproduced here; the best eval loss of 0.899 was reached at step 400.)*
## Dataset
4,000 curated samples from three sources:
| Source | Samples |
|---|---|
| HuggingFace (HydraLM, NebulaByte, findnitai) | ~2,000 |
| Gemini Synthetic (5 categories) | ~1,000 |
| Alpaca → Hinglish Translation | ~1,000 |
Format: ChatML with Hinglish system prompt. Split: 90% train / 10% eval.
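For reference, ChatML wraps each turn in `<|im_start|>`/`<|im_end|>` markers. A minimal sketch of how a training sample would be rendered (the exact system prompt used in training is an assumption):

```python
def to_chatml(messages):
    """Render a message list in ChatML, the chat format used by Qwen models."""
    return "".join(
        f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n" for m in messages
    )

sample = [
    {"role": "system", "content": "Aap ek helpful AI assistant hain jo Hinglish mein baat kar sakta hai."},
    {"role": "user", "content": "Yaar, Python seekhni hai"},
]
print(to_chatml(sample))
```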
## Usage

```python
from unsloth import FastLanguageModel
from peft import PeftModel

# Load the 4-bit quantized base model
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/Qwen2.5-7B-Instruct-bnb-4bit",
    max_seq_length=2048,
    load_in_4bit=True,
)

# Attach the Hinglish LoRA adapters
model = PeftModel.from_pretrained(model, "psyka-101/hinglish-qwen2.5-7b-instruct")
FastLanguageModel.for_inference(model)  # enable Unsloth's fast inference path

messages = [
    {"role": "system", "content": "Aap ek helpful AI assistant hain jo Hinglish mein baat kar sakta hai."},
    {"role": "user", "content": "Yaar, mujhe Python seekhni hai. Kahan se start karun?"},
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to("cuda")
outputs = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
## Limitations
- Trained on ~4K samples — may not generalize to all dialects or specialized topics
- Checkpoint-400 recommended (eval loss rises in epoch 3)
- May still default to pure English on some prompts
- Roman-script Hinglish only — no Devanagari
- Not tested for safety/toxicity
## License
Inherits the Qwen2.5 license.
Built with 🦥 Unsloth • 🤗 Transformers • 🇮🇳 Hinglish
