Ravi — Math Tutor (Llama 3.1 8B Instruct)

A fine-tuned version of Llama 3.1 8B Instruct trained to function as Ravi, a math tutoring assistant specializing in algebra, calculus, and word problems. Trained with QLoRA via Unsloth + TRL SFTTrainer on Google Colab T4.

Ravi teaches rather than just answers — it scaffolds understanding, asks checkpoint questions, handles student misconceptions with a 4-tier escalation protocol, and redirects out-of-domain queries.

Model Details

Property	Value
Base model	`unsloth/llama-3.1-8b-instruct-bnb-4bit`
Fine-tuning method	QLoRA (4-bit NF4 quantization)
LoRA rank	16
LoRA alpha	16
LoRA dropout	0
Target modules	q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
Training steps	200
Learning rate	2e-4 (cosine schedule, 10 warmup steps)
Effective batch size	8 (2 per device × 4 gradient accumulation)
Optimizer	AdamW 8-bit
Max sequence length	2,048 tokens
Packing	Disabled
Loss masking	Student turns masked (-100); model trains on teacher responses only
Platform	Google Colab T4 (15GB VRAM, ~13.6GB used)
Framework	Unsloth + PEFT + TRL SFTTrainer

Dataset

Training data: Sai345/math-tutor-sft-dataset

Source	Examples	Description
MathDial (`eth-nlped/mathdial`)	1,696	Multi-turn math tutoring dialogues. Filtered to `self_correctness == "Yes"` only. Converted from pipe-delimited format to Llama 3.1 chat template. Teacher tags stripped. License: CC-BY-SA 4.0
Synthetic (Groq Llama 4 Scout 17B)	455	5 typed categories: algebra scaffolding (150), word problem scaffolding (80), misconception correction (120), difficulty adaptation (80), OOD refusal (25). All follow the Ravi persona and 4-tier escalation protocol.
Total	2,151	1,935 train / 216 test (90/10 split)

Each example is a full multi-turn conversation formatted as a single training instance with the system prompt embedded.

Usage

Requirements

pip install transformers peft bitsandbytes accelerate

Downloads last month: 1

Inference Providers NEW

This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Sai345
/

llama-3.1-8b-math-tutor

Ravi — Math Tutor (Llama 3.1 8B Instruct)

Model Details

Dataset

Usage

Requirements

Dataset used to train Sai345/llama-3.1-8b-math-tutor