---
language:
- en
license: apache-2.0
base_model: google/gemma-2b-it
tags:
- peft
- qlora
- tutor
- python
- sql
- dsa
---

# AI Programming Tutor (Gemma 2B - Fine-Tuned)

This model is a fine-tuned version of `google/gemma-2b-it` designed to act as an expert agentic programming tutor. It was developed as part of an AI/ML Engineer assessment for Purple Merit Technologies.

Rather than simply giving users the answer, this model is trained to teach concepts using a strict pedagogical structure.

## 🧠 Pedagogical Structure

The model enforces the following flow for every response:

1. **Goal**: States what the student will learn.
2. **Concept**: Explains the intuition behind the topic.
3. **Worked Example**: Provides step-by-step code with comments.
4. **Common Mistakes**: Highlights typical errors students make.
5. **Checkpoint**: Asks a guiding question to verify understanding.

It is also trained to gracefully redirect out-of-scope requests (e.g., calculus or diet advice) back to programming topics.

## 🛠️ Training Details

* **Base Model:** `google/gemma-2b-it`
* **Technique:** QLoRA (4-bit NF4 quantization)
* **Dataset:** A curated mix of synthetic tutoring conversations and a filtered Python/SQL subset of CodeAlpaca-20k.
* **Infrastructure:** Trained on a single NVIDIA T4 GPU.

## 📊 Evaluation

The model was evaluated on a held-out set of 30 prompts (25 in-domain, 5 out-of-domain).

* **Perplexity Improvement:** The fine-tuned model achieved a perplexity of **2.09**, a **95.1% reduction** relative to the base model's 43.05 on the same tutoring dataset ((43.05 - 2.09) / 43.05 ≈ 0.951).

## 💻 How to Use (Inference)

You can load this model using `transformers` and `peft`.

*Note: When running on a T4 GPU, load the model with `float16` as the compute dtype to prevent memory access errors.*

```python
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig
from peft import PeftModel
import torch

BASE = "google/gemma-2b-it"
ADAPTER = "Imrozkhan007/programming-tutor-gemma-2b"

# Use float16 for T4 compatibility
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type='nf4',
    bnb_4bit_compute_dtype=torch.float16
)

# Load the 4-bit base model on GPU 0, then attach the fine-tuned adapter
tokenizer = AutoTokenizer.from_pretrained(BASE)
base_model = AutoModelForCausalLM.from_pretrained(
    BASE,
    quantization_config=bnb_config,
    device_map={"": 0}
)
model = PeftModel.from_pretrained(base_model, ADAPTER)
model.eval()

# Example Prompt
prompt = "Explain binary search step by step"
messages = [
    {"role": "user", "content": f"You are an expert programming tutor...\n\nStudent Question: {prompt}"}
]
formatted_prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

# The chat template already prepends <bos>, so skip special tokens here to avoid a duplicate
inputs = tokenizer(formatted_prompt, return_tensors='pt', add_special_tokens=False).to(model.device)

# temperature only takes effect when sampling is enabled
out = model.generate(**inputs, max_new_tokens=512, do_sample=True, temperature=0.3)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```
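
As a rough way to check that a generated response follows the five-part structure listed under Pedagogical Structure, the sketch below looks for each section label at the start of a line, in order. The helper name `missing_sections` is hypothetical, and the assumption that the model writes each label verbatim at the start of a line (optionally preceded by list numbering or Markdown emphasis) is ours; neither is part of the released model or its evaluation.

```python
import re
from typing import List

# Section labels taken from the Pedagogical Structure list in this card.
SECTIONS = ["Goal", "Concept", "Worked Example", "Common Mistakes", "Checkpoint"]


def missing_sections(response: str) -> List[str]:
    """Return the labels that are absent or appear out of order.

    Hypothetical helper: assumes each label starts a line, optionally
    preceded by numbering or Markdown markers such as "1. **".
    """
    missing = []
    pos = 0  # each label must be found after the previous one
    for label in SECTIONS:
        pattern = re.compile(
            rf"^[\W\d]*{re.escape(label)}\b",
            re.IGNORECASE | re.MULTILINE,
        )
        match = pattern.search(response, pos)
        if match is None:
            missing.append(label)
        else:
            pos = match.end()
    return missing
```

An empty list means all five labels were found in order. For example, `missing_sections("Goal: ...\nCheckpoint: ...")` returns `['Concept', 'Worked Example', 'Common Mistakes']`. When applying this to real output, decode only the newly generated tokens so that the prompt text is not scanned.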