---
language:
- en
license: apache-2.0
base_model: google/gemma-2b-it
tags:
- peft
- qlora
- tutor
- python
- sql
- dsa
---

# AI Programming Tutor (Gemma 2B - Fine-Tuned)

This model is a fine-tuned version of `google/gemma-2b-it` designed to act as an expert agentic programming tutor. It was developed as part of an AI/ML Engineer assessment for Purple Merit Technologies.

Rather than simply giving users the answer, this model is trained to teach concepts using a strict pedagogical structure.

## 🧠 Pedagogical Structure
The model is trained to follow this flow in every tutoring response:
1. **Goal**: States what the student will learn.
2. **Concept**: Explains the intuition behind the topic.
3. **Worked Example**: Provides step-by-step code with comments.
4. **Common Mistakes**: Highlights typical errors students make.
5. **Checkpoint**: Asks a guiding question to verify understanding.

It is also trained to gracefully redirect out-of-scope requests (e.g., calculus or diet advice) back to programming topics.
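
The five-part flow above can be spot-checked mechanically. The following is a minimal sketch, not part of the released model: the helper name `missing_or_misordered` is hypothetical, and it assumes the model writes each section name literally (for example as `**Goal**:`), which this card does not specify. Out-of-scope redirects are not expected to pass this check.

```python
import re

# Section names come from the list above. The exact surface form the model
# emits (bold label, heading, plain text) is an assumption.
SECTIONS = ["Goal", "Concept", "Worked Example", "Common Mistakes", "Checkpoint"]


def missing_or_misordered(response: str) -> list[str]:
    """Return the section names that are absent or appear out of order."""
    problems = []
    cursor = 0  # only search after the previously matched section
    for name in SECTIONS:
        pattern = re.compile(rf"\b{re.escape(name)}\b")
        match = pattern.search(response, cursor)
        if match is None:
            problems.append(name)
        else:
            cursor = match.end()
    return problems
```

An empty list means all five labels were found in the expected order. This is a heuristic only: it checks for the labels, not for the quality of the content under them.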

## 🛠️ Training Details
* **Base Model:** `google/gemma-2b-it`
* **Technique:** QLoRA (4-bit NF4 quantization)
* **Dataset:** A curated mix of synthetic tutoring conversations and a filtered Python/SQL subset of CodeAlpaca-20k.
* **Infrastructure:** Trained on a single NVIDIA T4 GPU.

## 📊 Evaluation
The model was evaluated on a held-out set of 30 prompts (25 in-domain, 5 out-of-domain). 
* **Perplexity Improvement:** The fine-tuned model achieved a perplexity of **2.09**, a **95.1% reduction** from the base model's 43.05 on the same tutoring dataset (lower is better).
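
For reference, the 95.1% figure is the relative drop between the two reported perplexities. The sketch below reproduces that arithmetic and shows the standard definition of perplexity as the exponential of the mean per-token negative log-likelihood. The function names are illustrative, and the evaluation script used for this model is not described here, so this is not necessarily how the numbers were produced.

```python
import math


def perplexity(token_nlls: list[float]) -> float:
    """Perplexity = exp(mean negative log-likelihood per token, in nats)."""
    return math.exp(sum(token_nlls) / len(token_nlls))


def relative_reduction(base: float, tuned: float) -> float:
    """Percentage by which `tuned` is lower than `base`."""
    return (base - tuned) / base * 100


# Perplexities reported above: base 43.05, fine-tuned 2.09.
print(f"{relative_reduction(43.05, 2.09):.1f}%")  # 95.1%
```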

## 💻 How to Use (Inference)

You can load the adapter on top of the 4-bit quantized base model using `transformers`, `peft`, and `bitsandbytes`. *Note: On a T4 GPU, use `float16` as the compute dtype (the T4 has no native `bfloat16` support) to prevent memory access errors.*

```python
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig
from peft import PeftModel
import torch

BASE = "google/gemma-2b-it"
ADAPTER = "Imrozkhan007/programming-tutor-gemma-2b"

# Use float16 for T4 compatibility
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True, 
    bnb_4bit_quant_type='nf4',
    bnb_4bit_compute_dtype=torch.float16
)

tokenizer = AutoTokenizer.from_pretrained(BASE)
base_model = AutoModelForCausalLM.from_pretrained(BASE, quantization_config=bnb_config, device_map={"": 0})
model = PeftModel.from_pretrained(base_model, ADAPTER)
model.eval()

# Example Prompt
prompt = "Explain binary search step by step"
messages = [
    {"role": "user", "content": f"You are an expert programming tutor...\n\nStudent Question: {prompt}"}
]
formatted_prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

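# The chat template already inserts <bos>, so skip adding special tokens again
inputs = tokenizer(formatted_prompt, return_tensors='pt', add_special_tokens=False).to(model.device)
# temperature only takes effect when sampling is enabled
with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=512, do_sample=True, temperature=0.3)
print(tokenizer.decode(out[0], skip_special_tokens=True))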