MathTutor-Qwen3-8B-QLoRA

A QLoRA adapter that turns Qwen3-8B into a K-12 math tutor for algebra, word problems, and arithmetic reasoning.

What it does

  • Teaches step-by-step with Goal / Key Idea / Steps / Worked Example / Checkpoint structure
  • Handles student misconceptions, confusion, and frustration
  • Redirects off-topic questions back to math
  • 41% more concise than base Qwen3-8B while scoring higher on all evaluation metrics

Training

Parameter           Value
Base model          Qwen3-8B (4-bit QLoRA via Unsloth)
LoRA rank / alpha   16 / 16
NEFTune             noise α = 5
Dataset             612 examples (426 seed-based tutoring + 100 OOD refusals + 86 context-free)
Data sources        GSM8K (MIT), Orca-Math (MIT), synthetic data generated with Claude Sonnet 4
Epochs              2
Batch size          16 (effective)
Learning rate       2e-4, cosine schedule
Training time       ~4 min on an NVIDIA L40S
Trainable params    43.6M (0.83% of total)
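The hyperparameters above can be collected into a single config dict. This is an illustrative sketch, not the actual training script: the key names mirror common TRL/Unsloth argument names, and the 4 × 4 split of the effective batch size is an assumption (the table only states "16 (effective)").

```python
# Illustrative hyperparameter sketch; key names follow common TRL/Unsloth
# conventions and the batch split is an ASSUMED decomposition of 16.
config = {
    "lora_r": 16,
    "lora_alpha": 16,
    "neftune_noise_alpha": 5,
    "num_train_epochs": 2,
    "per_device_train_batch_size": 4,   # assumption: 4 x 4 accumulation
    "gradient_accumulation_steps": 4,   # = 16 effective
    "learning_rate": 2e-4,
    "lr_scheduler_type": "cosine",
    "load_in_4bit": True,
}

# Effective batch size = per-device batch x gradient accumulation steps.
effective_batch = (
    config["per_device_train_batch_size"]
    * config["gradient_accumulation_steps"]
)
print(effective_batch)  # 16
```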

Evaluation (LLM-as-Judge, Claude Sonnet 4)

Metric                   Base Qwen3-8B   MathTutor (this model)   Delta
Correctness (in-domain)  4.79            4.88                     +0.09
Pedagogy (in-domain)     4.52            4.56                     +0.04
Structure                4.80            4.86                     +0.06
Faithfulness             4.62            4.80                     +0.18
Refusal (OOD)            3.14            4.12                     +0.98
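The scores above come from an LLM judge. A minimal sketch of how per-metric scores might be extracted from judge replies and averaged (the `Score: N/5` format, the helper name, and the sample replies are assumptions for illustration, not the actual evaluation harness):

```python
import re
from statistics import mean

def extract_score(judge_reply: str) -> int:
    """Pull a 1-5 rubric score out of a judge model's free-text reply.

    Assumes the judge was prompted to end with a line like 'Score: 4/5'.
    """
    match = re.search(r"Score:\s*([1-5])\s*/\s*5", judge_reply)
    if match is None:
        raise ValueError(f"no score found in: {judge_reply!r}")
    return int(match.group(1))

# Hypothetical judge replies for one metric across three test items.
replies = [
    "The tutor isolates the misconception clearly. Score: 5/5",
    "Correct answer but skips the checkpoint step. Score: 4/5",
    "Good structure, minor arithmetic slip. Score: 4/5",
]
avg = mean(extract_score(r) for r in replies)
print(round(avg, 2))  # 4.33
```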

Full evaluation results and methodology: MathTutor-Evaluation

How to use

from unsloth import FastLanguageModel

# Load the 4-bit base model with the adapter applied.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="Yash0707/MathTutor-Qwen3-8B-QLoRA",
    max_seq_length=2048,
    dtype=None,
    load_in_4bit=True,
)
FastLanguageModel.for_inference(model)  # enable Unsloth's fast inference path

messages = [
    {"role": "system", "content": "You are MathTutor, an expert K-12 math tutor specializing in algebra, word problems, and arithmetic reasoning."},
    {"role": "user", "content": "I think 2(x+3) = 2x+3. Is that right?"},
]

text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=True,  # Qwen3 thinking mode
)
inputs = tokenizer(text, return_tensors="pt").to("cuda")
outputs = model.generate(
    **inputs,
    max_new_tokens=1024,
    do_sample=True,   # required for temperature to take effect
    temperature=0.6,
)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
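With `enable_thinking=True`, Qwen3 emits its reasoning inside a `<think>…</think>` block before the final answer. A small helper can separate the two for display (an illustrative sketch, not part of the model's API):

```python
def split_thinking(decoded: str) -> tuple[str, str]:
    """Split a Qwen3 reply into (thinking, answer).

    If no closing </think> tag is present, everything is treated as answer.
    """
    marker = "</think>"
    head, sep, tail = decoded.partition(marker)
    if not sep:
        return "", decoded.strip()
    thinking = head.replace("<think>", "").strip()
    return thinking, tail.strip()

# Hypothetical decoded output for the misconception prompt above.
raw = (
    "<think>Distribute: 2(x+3) = 2x + 6, so 2x+3 is wrong.</think>"
    "Not quite. Let's check by distributing the 2 to both terms."
)
thinking, answer = split_thinking(raw)
print(answer)  # Not quite. Let's check by distributing the 2 to both terms.
```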
