---
library_name: peft
base_model: Qwen/Qwen3-8B
license: apache-2.0
language:
  - en
tags:
  - math
  - tutoring
  - education
  - lora
  - qlora
  - qwen3
  - unsloth
datasets:
  - Yash0707/mathtutorqwen3-8b-data
pipeline_tag: text-generation
---

# MathTutor-Qwen3-8B-QLoRA

A QLoRA adapter that turns Qwen3-8B into a K-12 math tutor for algebra, word problems, and arithmetic reasoning.

## What it does

- Teaches step-by-step using a Goal / Key Idea / Steps / Worked Example / Checkpoint structure
- Handles student misconceptions, confusion, and frustration
- Redirects off-topic questions back to math
- Responds 41% more concisely than base Qwen3-8B while scoring higher on every evaluation metric

## Training

| Parameter | Value |
|---|---|
| Base model | Qwen3-8B (4-bit QLoRA via Unsloth) |
| LoRA rank / alpha | 16 / 16 |
| NEFTune noise | α = 5 |
| Dataset | 612 examples (426 seed-based tutoring + 100 OOD refusals + 86 context-free) |
| Data sources | GSM8K (MIT), Orca-Math (MIT), synthetic via Claude Sonnet 4 |
| Epochs | 2 |
| Batch size | 16 (effective) |
| Learning rate | 2e-4, cosine schedule |
| Training time | ~4 min on L40S |
| Trainable params | 43.6M (0.83%) |
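The hyperparameters above can be sketched as an Unsloth + TRL training setup. This is an illustrative reconstruction, not the exact script used: the dataset loading, target modules, and the 4 × 4 split of the effective batch size are assumptions.

```python
from unsloth import FastLanguageModel
from trl import SFTConfig, SFTTrainer

# Load Qwen3-8B in 4-bit for QLoRA fine-tuning
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="Qwen/Qwen3-8B",
    max_seq_length=2048,
    load_in_4bit=True,
)

# Attach LoRA adapters: rank 16, alpha 16 (as in the table above);
# target_modules shown here are Unsloth's usual choice, assumed not confirmed
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,  # the 612-example tutoring dataset (loading omitted)
    args=SFTConfig(
        num_train_epochs=2,
        per_device_train_batch_size=4,   # 4 x 4 accumulation = 16 effective (assumed split)
        gradient_accumulation_steps=4,
        learning_rate=2e-4,
        lr_scheduler_type="cosine",
        neftune_noise_alpha=5,           # NEFTune noise, α = 5
        output_dir="outputs",
    ),
)
trainer.train()
```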

## Evaluation (LLM-as-Judge, Claude Sonnet 4)

| Metric | Base Qwen3-8B | MathTutor (this model) | Delta |
|---|---|---|---|
| Correctness (in-domain) | 4.79 | 4.88 | +0.09 |
| Pedagogy (in-domain) | 4.52 | 4.56 | +0.04 |
| Structure | 4.80 | 4.86 | +0.06 |
| Faithfulness | 4.62 | 4.80 | +0.18 |
| Refusal (OOD) | 3.14 | 4.12 | +0.98 |

Full evaluation results and methodology: MathTutor-Evaluation
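Each Delta is simply the tuned score minus the base score per metric; a quick sanity check over the table:

```python
# (base, tuned) judge scores per metric, copied from the evaluation table
scores = {
    "Correctness (in-domain)": (4.79, 4.88),
    "Pedagogy (in-domain)": (4.52, 4.56),
    "Structure": (4.80, 4.86),
    "Faithfulness": (4.62, 4.80),
    "Refusal (OOD)": (3.14, 4.12),
}
deltas = {m: round(tuned - base, 2) for m, (base, tuned) in scores.items()}
print(deltas)  # the largest gain is on OOD refusal, +0.98
```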

## How to use

```python
from unsloth import FastLanguageModel

# Load the base model plus the QLoRA adapter in 4-bit
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="Yash0707/MathTutor-Qwen3-8B-QLoRA",
    max_seq_length=2048,
    dtype=None,
    load_in_4bit=True,
)
FastLanguageModel.for_inference(model)  # enable Unsloth's fast inference path

messages = [
    {"role": "system", "content": "You are MathTutor, an expert K-12 math tutor specializing in algebra, word problems, and arithmetic reasoning."},
    {"role": "user", "content": "I think 2(x+3) = 2x+3. Is that right?"},
]

text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=True,  # Qwen3 reasoning mode
)
inputs = tokenizer(text, return_tensors="pt").to("cuda")
outputs = model.generate(
    **inputs,
    max_new_tokens=1024,
    temperature=0.6,
    do_sample=True,  # required for temperature to take effect
)
# Decode only the newly generated tokens
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```
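With `enable_thinking=True`, Qwen3 wraps its chain-of-thought in a `<think>...</think>` block before the final answer. A small helper to separate the two (a sketch assuming that tag format; the sample strings below are illustrative, not actual model output):

```python
import re

def split_thinking(text: str) -> tuple[str, str]:
    """Split a Qwen3-style response into (reasoning, answer)."""
    match = re.search(r"<think>(.*?)</think>", text, flags=re.DOTALL)
    if not match:
        return "", text.strip()  # no thinking block present
    reasoning = match.group(1).strip()
    answer = text[match.end():].strip()
    return reasoning, answer

reasoning, answer = split_thinking(
    "<think>The student dropped the distribution over 3.</think>"
    "Not quite! 2(x+3) means 2*x + 2*3 = 2x + 6."
)
```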

## Links