---
license: cc-by-nc-4.0
datasets:
- openai/gsm8k
language:
- en
base_model:
- Qwen/Qwen2.5-Math-1.5B
pipeline_tag: text-generation
library_name: transformers
tags:
- math
- qwen
- lora
- mathematics
- gsm8k
---

# OpenMath

Fine-tuning a Small Language Model (SLM) for Step-by-Step Math Reasoning

## Overview

OpenMath is an open-source project focused on fine-tuning a small language model for math reasoning using QLoRA (4-bit LoRA). This repository contains only a LoRA adapter trained on GSM8K; users must load the base model separately and attach the adapter.

The latest version of this model was trained on an AMD MI300X GPU using ROCm, showing that modern non-NVIDIA accelerators can support large-scale fine-tuning with Hugging Face and PyTorch.

---

## Base Model

Qwen/Qwen2.5-Math-1.5B

This repository does not contain the base model weights — they must be loaded from Hugging Face.

---

## Hardware Used (Latest Training Run)

- GPU: AMD MI300X (ROCm 7.0)
- VRAM: 192 GB
- Operating System: Ubuntu 24.04
- Framework: PyTorch + Hugging Face
- Backend: ROCm

---

## Dataset

GSM8K (Grade School Math 8K)

- Training samples: 1,000
- Evaluation: full GSM8K test split (1,319 problems)

Only the solution portion of each example was used for loss computation, via loss masking.

---

## Training Configuration

- Method: QLoRA (4-bit)
- Quantization: NF4 with float16 compute
- LoRA rank: 16
- LoRA alpha: 32
- LoRA dropout: 0.05
- Target modules: q_proj, k_proj, v_proj, o_proj
- Max sequence length: 1024
- Batch size: 1
- Gradient accumulation: 16
- Effective batch size: 16
- Learning rate: 1e-4
- Optimizer: paged_adamw_8bit
- Scheduler: cosine
- Warmup: 5 percent
- Epochs: 6

---

## Results

GSM8K accuracy (full test set): 750 / 1,319 correct = 56.86%.

This is significantly stronger than the earlier Colab T4 run and a strong result for a 1.5B model trained with LoRA.
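The training configuration above can be sketched with `bitsandbytes` and `peft` roughly as follows. This is a config fragment only, not the project's actual training script; trainer wiring, dataset preparation, and loss masking are omitted.

```python
import torch
from transformers import BitsAndBytesConfig
from peft import LoraConfig

# 4-bit NF4 quantization with float16 compute, per the QLoRA settings above.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

# LoRA hyperparameters as listed: rank 16, alpha 32, dropout 0.05,
# applied to the attention projection modules.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)

# Per-device batch size 1 with 16 gradient-accumulation steps
# gives the effective batch size of 16 reported above.
effective_batch_size = 1 * 16
```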
---

## What This Repository Contains

- `adapter_model.safetensors` — LoRA weights
- `adapter_config.json` — LoRA configuration
- `chat_template.jinja` — chat formatting template
- `tokenizer.json` — tokenizer file
- `tokenizer_config.json` — tokenizer settings
- `README.md` — documentation

This repository does not include checkpoints, optimizer states, or the full base model weights.

---

## How to Use This Model

Load the base model Qwen/Qwen2.5-Math-1.5B from Hugging Face, then attach this LoRA adapter using PEFT. Generate answers using a prompt that includes an instruction, problem, and solution section.

---

## Why This Matters

- Demonstrates that the AMD MI300X can train modern language models with Hugging Face and QLoRA.
- Shows that high-quality math reasoning is possible at 1.5B parameters using efficient fine-tuning.
- Provides a lightweight adapter instead of requiring users to download a massive full model.

---

## Limitations

- The model can make reasoning mistakes.
- It should not be used for exams, assignments, or professional decisions.
- Performance depends heavily on prompt formatting.

---

## Future Work

- Train on 3,000 to 5,000 GSM8K samples.
- Add the SVAMP and ASDiv datasets.
- Improve decoding to reduce repetition.
- Experiment with multi-GPU scaling on the MI300X.
- Add a Streamlit demo for interactive use.

---

## License

cc-by-nc-4.0
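## Appendix: Usage Sketch

The loading and prompting steps described under "How to Use This Model" can be sketched as follows. The adapter path and the exact prompt wording are illustrative assumptions — the shipped `chat_template.jinja` defines the canonical format, and generation requires downloading the base model.

```python
def build_prompt(problem: str) -> str:
    """Assemble an instruction/problem/solution prompt.

    The section wording here is an assumption for illustration;
    the repository's chat_template.jinja is authoritative.
    """
    return (
        "Instruction: Solve the math problem step by step.\n"
        f"Problem: {problem}\n"
        "Solution:"
    )


def load_model(adapter_path: str):
    """Load the base model and attach this LoRA adapter via PEFT.

    Requires a GPU and downloads Qwen/Qwen2.5-Math-1.5B from Hugging Face;
    adapter_path should point at this repository's adapter files.
    """
    from transformers import AutoModelForCausalLM, AutoTokenizer
    from peft import PeftModel

    tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-Math-1.5B")
    base = AutoModelForCausalLM.from_pretrained(
        "Qwen/Qwen2.5-Math-1.5B", device_map="auto"
    )
    model = PeftModel.from_pretrained(base, adapter_path)
    return model, tokenizer
```

With a model loaded, tokenize `build_prompt(question)`, call `model.generate(...)`, and decode the completion after the `Solution:` marker.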