Math tutor — QLoRA merged weights (Modal training)

This directory holds merged full-precision (bf16) weights after a QLoRA fine-tune run on Modal, produced by scripts/train_qlora_modal.py. The base model is TinyLlama/TinyLlama-1.1B-Chat-v1.0 unless you overrode --base-model.

Training method (QLoRA)

Training on Modal uses the standard QLoRA stack:

  • 4-bit quantization of the base model (NF4, double quantization, bf16 compute).
  • prepare_model_for_kbit_training then LoRA (PEFT) on attention projections q_proj, k_proj, v_proj, o_proj (defaults: r=8, alpha=16, dropout 0.05).
  • 8-bit paged AdamW optimizer during SFT.
  • After training, the adapter is merged into the base and saved as a normal causal LM checkpoint (this folder’s config.json, tokenizer files, and weight shards if present).

So: yes — the Modal job is QLoRA, not full fine-tuning of all base weights.

Data

Instruction rows are built inside the training image from the project curriculum via build_instruction_set in scripts/train_qlora.py: synthetic tutor-style turns in English, French, and Kinyarwanda derived from the numeracy items (on the order of ~684 JSONL records for the default seed).

How these files got here

  1. Run on Modal (example):

    modal run scripts/train_qlora_modal.py
    
  2. Pull the merged checkpoint from the math-tutor-checkpoints volume:

    modal volume get math-tutor-checkpoints /math_tutor_merged ./checkpoints/math_tutor_merged
    

    If your local layout matches this repo, the merged weights and tokenizer should end up under checkpoints/ (or checkpoints/math_tutor_merged/ — copy or symlink so this README.md sits next to the Hub upload).

Loading (Transformers)

Replace paths with your actual folder or Hub repo id.

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

path = "."  # or "your-username/your-repo"
tok = AutoTokenizer.from_pretrained(path, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    path,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
)

Pushing to Hugging Face Hub

Either pass --push-to your-username/repo-name with HF_TOKEN set when running Modal, or upload this folder after training:

huggingface-cli upload your-username/your-repo . --repo-type model

Use this file as the repo README.md on the Hub (same content is valid as the model card).

Limits

  • Intended as a small numeracy / feedback-style language head, not general chat.
  • Merged weights are not int4 GGUF; GGUF export is a separate step (llama.cpp convert/quantize) if you need that format.
  • Base model and dataset licenses apply in addition to this project’s MIT license for the training code and generated adapter/merge recipe.

Citation

If you use this checkpoint, cite the TinyLlama base model and link your Hub repo or this project’s repository as appropriate.

Downloads last month
-
Safetensors
Model size
1B params
Tensor type
BF16
·
Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for Ahmed5/AIMS_KTT_Day3

Adapter
(1497)
this model