--- language: - mr - en base_model: microsoft/Phi-3-mini-4k-instruct tags: - marathi - indian-language - lora - peft - fine-tuned - education - kids license: mit --- # 🌸 Marathi Mitra — माझा मराठी मित्र Fine-tuned Phi-3 Mini for Marathi vocabulary learning, built as a personalized tool to help my daughter learn Marathi. ## Model Details | Property | Value | |----------|-------| | Base Model | microsoft/Phi-3-mini-4k-instruct | | Fine-tuning Method | QLoRA (SFT) | | LoRA Rank | r=32, alpha=64 | | Training Examples | 30 Marathi vocabulary items | | Best Experiment | exp4_lr2e4_epochs25_r32 | | Format Score | 36.4% | | Training Hardware | Google Colab T4 GPU | ## What It Does Given an English word, generates a Marathi lesson with: - Marathi word in Devanagari script - Pronunciation guide - Example sentence - Fun fact for kids ## How to Use ```python from transformers import AutoModelForCausalLM, AutoTokenizer from peft import PeftModel import torch base = AutoModelForCausalLM.from_pretrained( "microsoft/Phi-3-mini-4k-instruct", torch_dtype=torch.float16, trust_remote_code=True, ) model = PeftModel.from_pretrained(base, "ninadp/marathi-mitra-phi3") tokenizer = AutoTokenizer.from_pretrained( "microsoft/Phi-3-mini-4k-instruct", trust_remote_code=True, ) prompt = """### Instruction: You are Marathi Mitra, a friendly Marathi teacher for kids. ### Input: Teach me the Marathi word for: butterfly ### Response: """ inputs = tokenizer(prompt, return_tensors="pt") output = model.generate(**inputs, max_new_tokens=150) print(tokenizer.decode(output[0], skip_special_tokens=True)) ``` ## Training Details Fine-tuned using Supervised Fine-Tuning (SFT) with QLoRA on 30 Marathi vocabulary examples across 4 hyperparameter experiments. | Experiment | LR | Epochs | Loss | Score | |------------|-----|--------|------|-------| | Baseline | N/A | N/A | N/A | 11.2% | | Exp1 | 2e-4 | 5 | 1.29 | 12.8% | | Exp2 | 2e-4 | 25 | 0.20 | 28.8% | | Exp3 | 1e-4 | 25 | 0.37 | 16.0% | | Exp4 | 2e-4 | 25 | 0.22 | 36.4% ✅ | ## Limitations - Trained on only 30 examples — vocabulary coverage is limited - May generate incorrect Marathi words for unseen vocabulary - Format learned well; accuracy improves with more data - Retraining with 200+ examples planned ## Live Demo [🚀 Try it on HF Spaces](https://huggingface.co/spaces/ninadp/marathi-mitra) ## GitHub [📦 Full project code](https://github.com/ninadparab/marathi-mitra)