---
license: cc-by-nc-4.0
datasets:
- openai/gsm8k
language:
- en
base_model:
- Qwen/Qwen2.5-Math-1.5B
pipeline_tag: text-generation
library_name: transformers
tags:
- math
- qwen
- lora
- mathematics
- gsm8k
---

# OpenMath
Fine-tuning a Small Language Model (SLM) for Step-by-Step Math Reasoning
## Overview
OpenMath is an open-source project focused on fine-tuning a small language model for step-by-step math reasoning using parameter-efficient LoRA fine-tuning.

This repository contains only a LoRA adapter trained on GSM8K. Users must load the base model separately and attach the adapter.

The latest version of this model was trained on an AMD MI300X GPU using ROCm, showing that modern non-NVIDIA accelerators can support large-scale fine-tuning with Hugging Face and PyTorch.

---
## Base Model
[Qwen/Qwen2.5-Math-1.5B](https://huggingface.co/Qwen/Qwen2.5-Math-1.5B)

This repository does not contain the base model weights; they must be loaded separately from Hugging Face.

---
## Hardware Used (Latest Training Run)

- GPU: AMD MI300X (ROCm 7.0)
- VRAM: 192 GB
- Operating system: Ubuntu 24.04
- Framework: PyTorch + Hugging Face
- Backend: ROCm
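
The ROCm build of PyTorch exposes the MI300X through the standard `torch.cuda` API, so existing CUDA-style training code runs unchanged. A quick sanity check, assuming a ROCm build of PyTorch is installed:

```python
import torch

# On ROCm builds, the HIP backend is surfaced through the familiar torch.cuda API,
# so the same scripts work on MI300X and NVIDIA GPUs.
print(torch.cuda.is_available())      # True when the GPU is visible
print(torch.version.hip)              # ROCm/HIP version string (None on CUDA builds)
print(torch.cuda.get_device_name(0))  # e.g. "AMD Instinct MI300X"
```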

---

## Dataset

- GSM8K (Grade School Math 8K)
- Training samples: 1,000
- Evaluation: full GSM8K test split (1,319 problems)
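
As an illustration, loading the dataset with the `datasets` library might look like this; the exact 1,000-example sampling strategy is not documented here, so a simple shuffled selection is shown as an assumption:

```python
from datasets import load_dataset

# "main" config: natural-language question plus a worked, step-by-step solution.
gsm8k = load_dataset("openai/gsm8k", "main")

# Illustrative 1,000-example training subset; evaluation uses the full test split.
train_subset = gsm8k["train"].shuffle(seed=42).select(range(1000))
test_split = gsm8k["test"]  # 1,319 problems

print(len(train_subset), len(test_split))
```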

Only the solution portion of each example was used for loss computation; prompt tokens were excluded via loss masking.
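
A minimal sketch of how such loss masking is typically implemented with the Hugging Face tokenizer; the prompt/solution formatting and helper name below are illustrative, not the exact training code:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-Math-1.5B")

def tokenize_with_loss_mask(prompt: str, solution: str, max_length: int = 1024):
    # Tokenize prompt and solution separately so the boundary between them is known.
    prompt_ids = tokenizer(prompt, add_special_tokens=False)["input_ids"]
    solution_ids = tokenizer(solution + tokenizer.eos_token, add_special_tokens=False)["input_ids"]

    input_ids = (prompt_ids + solution_ids)[:max_length]
    # Label -100 is ignored by the cross-entropy loss, so only solution tokens are trained on.
    labels = ([-100] * len(prompt_ids) + solution_ids)[:max_length]
    return {"input_ids": input_ids, "labels": labels}
```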

---

## Training Configuration

- Method: LoRA in bfloat16 (no 4-bit quantization)
- Precision: bfloat16
- LoRA rank: 16
- LoRA alpha: 32
- LoRA dropout: 0.05
- Target modules: q_proj, k_proj, v_proj, o_proj
- Max sequence length: 1024
- Per-device batch size: 2
- Gradient accumulation steps: 8
- Effective batch size: 16
- Learning rate: 1e-4
- Optimizer: adamw_torch
- Scheduler: cosine
- Warmup ratio: 5%
- Epochs: 3

These settings map onto `peft` and `transformers` roughly as sketched below.
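
A rough, illustrative mapping of the settings above (not the exact training script used for this adapter; the output directory name is hypothetical):

```python
from peft import LoraConfig
from transformers import TrainingArguments

lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)

training_args = TrainingArguments(
    output_dir="openmath-lora",     # hypothetical output directory
    per_device_train_batch_size=2,
    gradient_accumulation_steps=8,  # effective batch size 16
    learning_rate=1e-4,
    lr_scheduler_type="cosine",
    warmup_ratio=0.05,
    num_train_epochs=3,
    optim="adamw_torch",
    bf16=True,
    logging_steps=10,
)
```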

---

## Results

GSM8K accuracy (full test set): 750 / 1,319 correct (56.86%).

This is a substantial improvement over the earlier Colab T4 run and a strong result for a 1.5B-parameter model fine-tuned with LoRA.
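
GSM8K is typically graded by exact match on the final numeric answer. A sketch of the kind of answer extraction commonly used for this grading (the exact evaluation script is not part of this repository):

```python
import re

def extract_final_answer(text: str):
    """Pull the final number from a completion, GSM8K-style."""
    # Gold GSM8K solutions end with "#### <number>"; model output may not,
    # so fall back to the last number that appears in the text.
    match = re.search(r"####\s*([-\d.,]+)", text)
    candidate = match.group(1) if match else None
    if candidate is None:
        numbers = re.findall(r"-?\d[\d,]*(?:\.\d+)?", text)
        candidate = numbers[-1] if numbers else None
    return candidate.replace(",", "").rstrip(".") if candidate else None

# Example: both the gold answer and the model output normalize to "72".
print(extract_final_answer("Natalia sold 48 + 24 = 72 clips. #### 72"))
```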

---

## What This Repository Contains

- `adapter_model.safetensors`: LoRA adapter weights
- `adapter_config.json`: LoRA configuration
- `chat_template.jinja`: chat formatting template
- `tokenizer.json`: tokenizer file
- `tokenizer_config.json`: tokenizer settings
- `README.md`: documentation

This repository does not include checkpoints, optimizer states, or the full base model weights.

---

## How to Use This Model

Load the base model Qwen/Qwen2.5-Math-1.5B from Hugging Face, then attach this LoRA adapter with PEFT. Generate answers using a prompt that includes an instruction, the problem, and a solution section, as sketched below.
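
A minimal loading and generation example. The adapter repo ID is a placeholder, and the prompt layout simply follows the description above; see `chat_template.jinja` for the chat template shipped with this adapter:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "Qwen/Qwen2.5-Math-1.5B"
adapter_id = "<this-adapter-repo-id>"  # replace with this repository's Hub ID

tokenizer = AutoTokenizer.from_pretrained(adapter_id)
base_model = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.bfloat16, device_map="auto"
)
model = PeftModel.from_pretrained(base_model, adapter_id)

prompt = (
    "Instruction: Solve the problem step by step.\n"
    "Problem: Natalia sold clips to 48 of her friends in April, and half as many in May. "
    "How many clips did she sell altogether?\n"
    "Solution:"
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256, do_sample=False)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```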

---

## Why This Matters

This project demonstrates that the AMD MI300X can train modern language models with Hugging Face and LoRA. It shows that high-quality math reasoning is possible at 1.5B parameters using efficient fine-tuning, and it provides a lightweight adapter instead of requiring users to download a massive full model.

---
## Limitations

- The model can make reasoning mistakes.
- It should not be used for exams, assignments, or professional decisions.
- Performance depends heavily on prompt formatting.

---
## Future Work

- Train on 3,000 to 5,000 GSM8K samples.
- Add the SVAMP and ASDiv datasets.
- Improve decoding to reduce repetition.
- Experiment with multi-GPU scaling on the MI300X.
- Add a Streamlit demo for interactive use.

---
## License

CC BY-NC 4.0 (`cc-by-nc-4.0`)