Ravi โ Math Tutor (Llama 3.1 8B Instruct)
A fine-tuned version of Llama 3.1 8B Instruct trained to function as Ravi, a math tutoring assistant specializing in algebra, calculus, and word problems. Trained with QLoRA via Unsloth + TRL SFTTrainer on Google Colab T4.
Ravi teaches rather than just answers โ it scaffolds understanding, asks checkpoint questions, handles student misconceptions with a 4-tier escalation protocol, and redirects out-of-domain queries.
Model Details
| Property | Value |
|---|---|
| Base model | unsloth/llama-3.1-8b-instruct-bnb-4bit |
| Fine-tuning method | QLoRA (4-bit NF4 quantization) |
| LoRA rank | 16 |
| LoRA alpha | 16 |
| LoRA dropout | 0 |
| Target modules | q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj |
| Training steps | 200 |
| Learning rate | 2e-4 (cosine schedule, 10 warmup steps) |
| Effective batch size | 8 (2 per device ร 4 gradient accumulation) |
| Optimizer | AdamW 8-bit |
| Max sequence length | 2,048 tokens |
| Packing | Disabled |
| Loss masking | Student turns masked (-100); model trains on teacher responses only |
| Platform | Google Colab T4 (15GB VRAM, ~13.6GB used) |
| Framework | Unsloth + PEFT + TRL SFTTrainer |
Dataset
Training data: Sai345/math-tutor-sft-dataset
| Source | Examples | Description |
|---|---|---|
MathDial (eth-nlped/mathdial) |
1,696 | Multi-turn math tutoring dialogues. Filtered to self_correctness == "Yes" only. Converted from pipe-delimited format to Llama 3.1 chat template. Teacher tags stripped. License: CC-BY-SA 4.0 |
| Synthetic (Groq Llama 4 Scout 17B) | 455 | 5 typed categories: algebra scaffolding (150), word problem scaffolding (80), misconception correction (120), difficulty adaptation (80), OOD refusal (25). All follow the Ravi persona and 4-tier escalation protocol. |
| Total | 2,151 | 1,935 train / 216 test (90/10 split) |
Each example is a full multi-turn conversation formatted as a single training instance with the system prompt embedded.
Usage
Requirements
pip install transformers peft bitsandbytes accelerate
- Downloads last month
- 39
Inference Providers NEW
This model isn't deployed by any Inference Provider. ๐ Ask for provider support