# 🇮🇳 Hinglish Qwen2.5-7B Instruct — LoRA Fine-Tune

A bilingual (Hindi-English / Hinglish) fine-tune of Qwen2.5-7B-Instruct using QLoRA + Unsloth, designed to fluently converse in Hinglish — the natural code-mixed language spoken by 350M+ people across India.

## ⚡ Key Results

| Metric | Baseline | Fine-tuned | Change |
|---|---|---|---|
| Eval Loss | 2.57 | 0.90 | ↓ 65% |
| Training Loss | 1.63 | 0.25 | |
| Hinglish Response Rate | 10% | 80% | ↑ 8× |
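The "Change" column follows directly from the raw numbers; a quick arithmetic check (plain Python, values copied from the table):

```python
# Sanity-check the "Change" column from the results table.
baseline_loss, finetuned_loss = 2.57, 0.90
loss_drop_pct = (baseline_loss - finetuned_loss) / baseline_loss * 100

baseline_rate, finetuned_rate = 0.10, 0.80
rate_multiplier = finetuned_rate / baseline_rate

print(f"Eval loss down {loss_drop_pct:.0f}%, Hinglish rate up {rate_multiplier:.0f}x")
# -> Eval loss down 65%, Hinglish rate up 8x
```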

## 🔍 Before vs After

| Prompt | Baseline (Qwen2.5-7B) | Fine-tuned (Ours) |
|---|---|---|
| "Yaar, Python seekhni hai" | Pure English bullet points, broken numbering | Natural Hinglish advice, like a friend would give |
| "Ghar pe pizza banana sikhao" | Textbook English recipe format | Casual Hinglish with Indian context ("Domino's se bhi better!") |
| "Stock market invest karna chahiye?" | Generic English financial advice | Practical Hinglish with real talk ("Loan lekar mat karna!") |

## Model Details

| Field | Value |
|---|---|
| Base Model | Qwen2.5-7B-Instruct (4-bit NF4) |
| Method | QLoRA — LoRA adapters on a quantized base |
| Framework | Unsloth + Hugging Face TRL `SFTTrainer` |
| Hardware | NVIDIA RTX 3060 (12 GB VRAM) |
| Precision | BF16 mixed precision |
| LoRA Config | r=32, α=64, RSLoRA, 7 target modules |
| Trainable Params | 80.7M / 7.7B (1.05%) |
| Training Time | ~2 hours (675 steps, 3 epochs) |
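A sketch of what the LoRA row above translates to in code. This is a reconstruction, not the repo's actual training script: the field names mirror `peft.LoraConfig`, and the seven target modules are assumed to be Qwen2.5's standard attention and MLP projections.

```python
# Hypothetical reconstruction of the LoRA setup described in the table above;
# the target_modules list is an assumption, not confirmed by the repo.
lora_cfg = dict(
    r=32,
    lora_alpha=64,
    use_rslora=True,  # rank-stabilized LoRA scaling
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",  # attention projections
        "gate_proj", "up_proj", "down_proj",     # MLP projections
    ],
)

# Cross-check the reported trainable fraction: 80.7M of 7.7B parameters.
trainable_pct = 80.7e6 / 7.7e9 * 100
print(f"{trainable_pct:.2f}% trainable")  # -> 1.05% trainable
```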

## Training

| Parameter | Value |
|---|---|
| Epochs | 3 (675 total steps) |
| Effective Batch Size | 16 (1 per device × 16 grad accumulation) |
| Learning Rate | 2e-4 with cosine decay |
| Warmup | 50 steps |
| Optimizer | AdamW (8-bit) |
| Eval Strategy | every 100 steps |
| Best Checkpoint | step 400 (eval loss 0.899) |
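The 675-step figure is consistent with the batch settings and the dataset split; a small arithmetic sketch (variable names are illustrative):

```python
# Reconstructing the total step count from the hyperparameters above.
train_samples = int(4000 * 0.90)  # 90% train split of 4,000 samples -> 3600
effective_batch = 1 * 16          # per-device batch x gradient accumulation
steps_per_epoch = train_samples // effective_batch  # 225
total_steps = steps_per_epoch * 3                   # 3 epochs
print(total_steps)  # -> 675, matching the table
```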

## Loss Curve

*(Figure: training loss curve; image not reproduced here.)*


## Dataset

4,000 curated samples drawn from three sources:

| Source | Samples |
|---|---|
| HuggingFace (HydraLM, NebulaByte, findnitai) | ~2,000 |
| Gemini Synthetic (5 categories) | ~1,000 |
| Alpaca → Hinglish Translation | ~1,000 |

**Format:** ChatML with a Hinglish system prompt. **Split:** 90% train / 10% eval.
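A minimal sketch of what one ChatML-formatted sample looks like, assuming Qwen's standard `<|im_start|>` / `<|im_end|>` markers (the actual preprocessing script may differ; the example texts are illustrative):

```python
# Sketch of ChatML formatting with a Hinglish system prompt (illustrative).
def to_chatml(system: str, user: str, assistant: str) -> str:
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{user}<|im_end|>\n"
        f"<|im_start|>assistant\n{assistant}<|im_end|>\n"
    )

sample = to_chatml(
    "Aap ek helpful AI assistant hain jo Hinglish mein baat kar sakta hai.",
    "Yaar, mujhe Python seekhni hai.",
    "Bhai, Python ke liye basics se start karo!",
)

# 90/10 train/eval split over the 4,000 samples
n_train = int(4000 * 0.9)  # 3600
n_eval = 4000 - n_train    # 400
print(n_train, n_eval)  # -> 3600 400
```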


## Usage

```python
from unsloth import FastLanguageModel
from peft import PeftModel

# Load the 4-bit quantized base model
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/Qwen2.5-7B-Instruct-bnb-4bit",
    max_seq_length=2048,
    load_in_4bit=True,
)
# Attach the Hinglish LoRA adapter
model = PeftModel.from_pretrained(model, "psyka-101/hinglish-qwen2.5-7b-instruct")
FastLanguageModel.for_inference(model)  # enable Unsloth's fast inference mode

messages = [
    {"role": "system", "content": "Aap ek helpful AI assistant hain jo Hinglish mein baat kar sakta hai."},
    {"role": "user", "content": "Yaar, mujhe Python seekhni hai. Kahan se start karun?"},
]
# add_generation_prompt=True appends the assistant header so the model replies
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to("cuda")
outputs = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

## Limitations

- Trained on ~4K samples — may not generalize to all dialects or specialized topics
- Use checkpoint 400: eval loss begins rising during epoch 3
- May still default to pure English on some prompts
- Roman-script Hinglish only; no Devanagari
- Not evaluated for safety or toxicity

## License

Inherits the Qwen2.5 license of the base model.


Built with 🦥 Unsloth • 🤗 Transformers • 🇮🇳 Hinglish
