language:
  - mr
  - en
base_model: microsoft/Phi-3-mini-4k-instruct
tags:
  - marathi
  - indian-language
  - lora
  - peft
  - fine-tuned
  - education
  - kids
license: mit

🌸 Marathi Mitra — माझा मराठी मित्र

Fine-tuned Phi-3 Mini for Marathi vocabulary learning, built as a personalized tool to help my daughter learn Marathi.

Model Details

| Property | Value |
|---|---|
| Base Model | microsoft/Phi-3-mini-4k-instruct |
| Fine-tuning Method | QLoRA (SFT) |
| LoRA Rank | r=32, alpha=64 |
| Training Examples | 30 Marathi vocabulary items |
| Best Experiment | exp4_lr2e4_epochs25_r32 |
| Format Score | 36.4% |
| Training Hardware | Google Colab T4 GPU |
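
To give a feel for what `r=32, alpha=64` means in practice, here is a small arithmetic sketch. The rank and alpha come from the table above; the hidden size of 3072 is an assumption about Phi-3-mini that is not stated in this README, and which weight matrices were actually adapted is not stated either, so the parameter count is per matrix only.

```python
# Values taken from the Model Details table.
LORA_RANK = 32
LORA_ALPHA = 64

# Assumption: Phi-3-mini hidden size (not stated in this README).
HIDDEN_SIZE = 3072


def lora_scaling(rank: int, alpha: int) -> float:
    """Standard LoRA scales the low-rank update B @ A by alpha / rank."""
    return alpha / rank


def lora_param_count(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters added to one (d_out x d_in) weight matrix.

    A has shape (rank x d_in) and B has shape (d_out x rank).
    """
    return rank * (d_in + d_out)


scaling = lora_scaling(LORA_RANK, LORA_ALPHA)
per_matrix = lora_param_count(HIDDEN_SIZE, HIDDEN_SIZE, LORA_RANK)
print(f"update scaling: {scaling}")
print(f"adapter params for one square projection: {per_matrix:,}")
```

With alpha set to twice the rank, the adapter update is applied at a scale of 2, a common choice when the rank is raised.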

What It Does

Given an English word, the model generates a short Marathi lesson with:

  • Marathi word in Devanagari script
  • Pronunciation guide
  • Example sentence
  • Fun fact for kids
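
Since the first item of every lesson should be written in Devanagari, one cheap sanity check on a generated lesson is whether it contains any Devanagari characters at all. The sketch below is only an illustration; the helper name `contains_devanagari` is hypothetical, and this is not the scoring method used for this model. It relies on the Unicode Devanagari block, U+0900 to U+097F.

```python
def contains_devanagari(text: str) -> bool:
    """Return True if any character falls in the Devanagari block (U+0900-U+097F)."""
    return any("\u0900" <= ch <= "\u097F" for ch in text)


# A lesson that stays entirely in Latin script fails the check.
print(contains_devanagari("मराठी मित्र"))
print(contains_devanagari("butterfly"))
```

A check like this catches only the crudest failure (no Marathi script at all); it says nothing about whether the word is correct.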

How to Use

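from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch

# Use the GPU when available; float16 is only a safe default on GPU.
device = "cuda" if torch.cuda.is_available() else "cpu"
dtype = torch.float16 if device == "cuda" else torch.float32

# Load the base model, then attach the LoRA adapter weights on top of it.
base = AutoModelForCausalLM.from_pretrained(
    "microsoft/Phi-3-mini-4k-instruct",
    torch_dtype=dtype,
    trust_remote_code=True,
)
model = PeftModel.from_pretrained(base, "ninadp/marathi-mitra-phi3").to(device)
model.eval()
tokenizer = AutoTokenizer.from_pretrained(
    "microsoft/Phi-3-mini-4k-instruct",
    trust_remote_code=True,
)

# Prompt layout: instruction, input, then an empty response to complete.
prompt = """### Instruction:
You are Marathi Mitra, a friendly Marathi teacher for kids.

### Input:
Teach me the Marathi word for: butterfly

### Response:
"""
# Input tensors must be on the same device as the model.
inputs = tokenizer(prompt, return_tensors="pt").to(device)
output = model.generate(**inputs, max_new_tokens=150)
# The decoded text contains the prompt followed by the generated lesson.
print(tokenizer.decode(output[0], skip_special_tokens=True))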
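
To ask about words other than "butterfly", the prompt can be built from a template, and the lesson can be separated from the echoed prompt afterwards. This is a minimal sketch: the template text is copied from the example above, while the helper names `build_prompt` and `extract_response` are hypothetical and not part of this repository.

```python
PROMPT_TEMPLATE = (
    "### Instruction:\n"
    "You are Marathi Mitra, a friendly Marathi teacher for kids.\n"
    "\n"
    "### Input:\n"
    "Teach me the Marathi word for: {word}\n"
    "\n"
    "### Response:\n"
)

RESPONSE_MARKER = "### Response:"


def build_prompt(word: str) -> str:
    """Fill the instruction template with a single English word."""
    return PROMPT_TEMPLATE.format(word=word.strip())


def extract_response(decoded: str) -> str:
    """Return the text after the first response marker.

    If the marker is missing, the whole decoded string is returned.
    """
    _, sep, tail = decoded.partition(RESPONSE_MARKER)
    return (tail if sep else decoded).strip()


print(build_prompt("elephant"))
```

The string returned by `build_prompt` can be passed to the tokenizer in place of the hard-coded prompt, and `extract_response` can be applied to the decoded output.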

Training Details

The model was trained with supervised fine-tuning (SFT) using QLoRA on 30 Marathi vocabulary examples. Four hyperparameter experiments were run and compared against the untuned baseline.

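| Experiment | LR | Epochs | Loss | Format Score |
|---|---|---|---|---|
| Baseline | N/A | N/A | N/A | 11.2% |
| Exp1 | 2e-4 | 5 | 1.29 | 12.8% |
| Exp2 | 2e-4 | 25 | 0.20 | 28.8% |
| Exp3 | 1e-4 | 25 | 0.37 | 16.0% |
| Exp4 | 2e-4 | 25 | 0.22 | 36.4% ✅ |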
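
The comparison in the table can be reproduced with a few lines of Python. The numbers below are copied from the table; the tuple layout and variable names are illustrative only.

```python
# (name, learning rate, epochs, loss, score in %), copied from the table.
experiments = [
    ("Baseline", None, None, None, 11.2),
    ("Exp1", 2e-4, 5, 1.29, 12.8),
    ("Exp2", 2e-4, 25, 0.20, 28.8),
    ("Exp3", 1e-4, 25, 0.37, 16.0),
    ("Exp4", 2e-4, 25, 0.22, 36.4),
]

baseline_score = experiments[0][4]

# Pick the run with the highest score, not the lowest loss.
best = max(experiments, key=lambda row: row[4])
gain = round(best[4] - baseline_score, 1)

# For contrast, the fine-tuned run with the lowest loss.
lowest_loss = min(experiments[1:], key=lambda row: row[3])

print(f"best run: {best[0]} (+{gain} points over baseline)")
print(f"lowest loss: {lowest_loss[0]}")
```

The run with the lowest loss (Exp2) is not the run with the highest score (Exp4), which is why the best experiment is selected by score.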

Limitations

  • Trained on only 30 examples, so vocabulary coverage is limited
  • May generate incorrect Marathi words for unseen vocabulary
  • Format adherence improved over the baseline (11.2% → 36.4% format score) but is still inconsistent; accuracy is expected to improve with more data
  • Retraining with 200+ examples is planned

Live Demo

🚀 Try it on HF Spaces

GitHub

📦 Full project code