# KTT Math Tutor – Models
Companion model artefacts for the AIMS KTT Hackathon Tier-3 submission S2.T3.1, "AI Math Tutor for Early Learners". Source code and training scripts: https://github.com/DrUkachi/ktt-math-tutor.
## What's here
| Subfolder / file | Size | Role |
|---|---|---|
| `whisper-tiny-child-lora-ct2int8/` | 44 MB | child-voice LoRA-tuned Whisper-tiny, merged, CTranslate2 int8 for CPU |
| `tinyllama-numeracy-qlora-adapter/` | 21 MB | QLoRA adapter (r=16, NF4 base) trained on 200 synthetic numeracy instructions |
| `tinyllama-numeracy-Q4_K_M.gguf` | 637 MB | the adapter merged into TinyLlama-1.1B and quantised to Q4_K_M |
## How to use

### ASR (child-voice Whisper)
```python
from faster_whisper import WhisperModel

# int8 CTranslate2 weights run on CPU; the first call downloads from the Hub
model = WhisperModel("DrUkachi/ktt-math-tutor-models",
                     device="cpu", compute_type="int8",
                     local_files_only=False)

# wav: path to a 16 kHz mono WAV file
segments, _ = model.transcribe(wav, language="en", beam_size=1)
text = " ".join(s.text.strip() for s in segments)
```
Or, via the tutor's wrapper (which auto-picks `tutor/asr_model/` from the repo):
```bash
git clone https://github.com/DrUkachi/ktt-math-tutor
cd ktt-math-tutor && pip install -r requirements.txt
python demo.py
```
Evaluation on the in-distribution child-voice corpus (36 clips, pitch-shifted +3/+4.5/+6 semitones):
- Baseline vanilla Whisper-tiny int8: WER 0.7048
- This LoRA-tuned model: WER 0.0000
See `scripts/eval_wer.py` and `metrics/wer_*.json` in the code repo.
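WER here is the standard word-level edit distance (substitutions + insertions + deletions, divided by the reference length). A minimal illustration of the metric itself, not the repo's `scripts/eval_wer.py`:

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate via Levenshtein distance over word tokens."""
    ref, hyp = reference.split(), hypothesis.split()
    # d[i][j] = edit distance between ref[:i] and hyp[:j]
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            sub = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,        # deletion
                          d[i][j - 1] + 1,        # insertion
                          d[i - 1][j - 1] + sub)  # substitution or match
    return d[len(ref)][len(hyp)] / max(len(ref), 1)
```

For example, `wer("one two three four", "one too three")` is 0.5: one substitution plus one deletion over four reference words.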
### LLM head (weekly parent summary)
```python
from llama_cpp import Llama

llm = Llama(
    model_path="tinyllama-numeracy-Q4_K_M.gguf",
    n_ctx=512, n_threads=4, verbose=False,
)
r = llm.create_chat_completion(messages=[
    {"role": "system", "content": "You are a warm math tutor. One short sentence."},
    {"role": "user", "content": "The child is strong at addition; needs practice on number sense."},
])
print(r["choices"][0]["message"]["content"])
```
Or via the tutor's wrapper (`tutor/llm_head.py`): the model is resolved
in order `$TUTOR_LLM_GGUF` → this tuned Q4_K_M → community TinyLlama
base → deterministic fallback. The LLM is never in the inference hot
path; it runs once per learner per week to produce the voiced parent
summary.
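The resolution order above amounts to a first-existing-path lookup. A sketch of that logic (a hypothetical helper, not the actual code in `tutor/llm_head.py`; the candidate file names are placeholders):

```python
import os

def resolve_llm_path(candidates):
    """Return the first GGUF file that exists, or None.

    Checks $TUTOR_LLM_GGUF first, then each candidate path in order.
    A None result means the caller uses the deterministic (non-LLM) fallback.
    """
    env = os.environ.get("TUTOR_LLM_GGUF")
    for path in ([env] if env else []) + list(candidates):
        if path and os.path.isfile(path):
            return path
    return None
```

A call might look like `resolve_llm_path(["tinyllama-numeracy-Q4_K_M.gguf", "tinyllama-base.gguf"])`, with the environment variable winning whenever it points at a real file.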
## Training recipes
- ASR LoRA: `scripts/train_whisper_lora.py` → 4 epochs on an L4 GPU, LoRA r=16 on q_proj/v_proj, merge, export to CT2 int8.
- LLM QLoRA: `scripts/train_llm_qlora.py` → 2 epochs on an L4 GPU, NF4 4-bit base, LoRA r=16 on q/k/v/o_proj, merge, convert to GGUF via the pinned llama.cpp b4400 convert script, quantise to Q4_K_M via the `llama_cpp.llama_model_quantize` Python binding.
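Both recipes end with a merge step, which is just adding the low-rank update back into the base weight: W' = W + (α/r)·B·A, where B is d_out×r and A is r×d_in. A toy numerical illustration of that arithmetic (shapes and scaling only, not the training scripts):

```python
def lora_merge(W, A, B, alpha, r):
    """Merge a LoRA update into base weight W: W' = W + (alpha/r) * B @ A.

    W: d_out x d_in, B: d_out x r, A: r x d_in (plain nested lists).
    """
    scale = alpha / r
    d_out, d_in = len(W), len(W[0])
    merged = [row[:] for row in W]  # copy so W is untouched
    for i in range(d_out):
        for j in range(d_in):
            merged[i][j] += scale * sum(B[i][k] * A[k][j] for k in range(r))
    return merged
```

After merging, the adapter weights are no longer needed at inference time, which is why the GGUF artefact above ships as a single self-contained file.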
## License
MIT. Attribution welcomed; not required.