---
license: cc-by-nc-4.0
datasets:
- openai/gsm8k
language:
- en
base_model:
- Qwen/Qwen2.5-Math-1.5B
pipeline_tag: text-generation
library_name: transformers
tags:
- math
- qwen
- lora
- mathematics
- gsm8k
---

# OpenMath

Fine-tuning a Small Language Model (SLM) for Step-by-Step Math Reasoning

## Overview

OpenMath is an open-source project focused on fine-tuning a small language model for math reasoning using QLoRA (4-bit LoRA). This repository contains only a LoRA adapter trained on GSM8K; users must load the base model separately and attach the adapter.

The latest version of this model was trained on an AMD MI300X GPU using ROCm, showing that modern non-NVIDIA accelerators can support large-scale fine-tuning with Hugging Face and PyTorch.

---

## Base Model

Qwen/Qwen2.5-Math-1.5B

This repository does not contain the base model weights — they must be loaded from Hugging Face.

---

## Hardware Used (Latest Training Run)

- GPU: AMD MI300X (ROCm 7.0)
- VRAM: 192 GB
- Operating System: Ubuntu 24.04
- Framework: PyTorch + Hugging Face
- Backend: ROCm

---

## Dataset

GSM8K (Grade School Math 8K)

- Training samples: 1,000
- Evaluation: full GSM8K test split (1,319 problems)

Only the solution portion of each example was used for loss computation, via loss masking.

---

## Training Configuration

- Method: QLoRA (4-bit)
- Quantization: NF4 with float16 compute
- LoRA rank: 16
- LoRA alpha: 32
- LoRA dropout: 0.05
- Target modules: q_proj, k_proj, v_proj, o_proj
- Max sequence length: 1024
- Batch size: 1
- Gradient accumulation: 16
- Effective batch size: 16
- Learning rate: 1e-4
- Optimizer: paged_adamw_8bit
- Scheduler: cosine
- Warmup: 5 percent
- Epochs: 6

---

## Results

GSM8K accuracy (full test set): 750 / 1,319 correct = 56.86%.

This is significantly stronger than the earlier Colab T4 run and a strong result for a 1.5B model trained with LoRA.
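The training configuration above can be sketched with `bitsandbytes` and `peft` roughly as follows. This is a config fragment only, not the project's actual training script; trainer wiring, dataset preparation, and loss masking are omitted.

```python
import torch
from transformers import BitsAndBytesConfig
from peft import LoraConfig

# 4-bit NF4 quantization with float16 compute, per the QLoRA settings above.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

# LoRA hyperparameters as listed: rank 16, alpha 32, dropout 0.05,
# applied to the attention projection modules.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)

# Per-device batch size 1 with 16 gradient-accumulation steps
# gives the effective batch size of 16 reported above.
effective_batch_size = 1 * 16
```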
---

## What This Repository Contains

- `adapter_model.safetensors` — LoRA weights
- `adapter_config.json` — LoRA configuration
- `chat_template.jinja` — chat formatting template
- `tokenizer.json` — tokenizer file
- `tokenizer_config.json` — tokenizer settings
- `README.md` — documentation

This repository does not include checkpoints, optimizer states, or the full base model weights.

---

## How to Use This Model

Load the base model Qwen/Qwen2.5-Math-1.5B from Hugging Face, then attach this LoRA adapter using PEFT. Generate answers using a prompt that includes an instruction, problem, and solution section.

---

## Why This Matters

- Demonstrates that the AMD MI300X can train modern language models with Hugging Face and QLoRA.
- Shows that high-quality math reasoning is possible at 1.5B parameters using efficient fine-tuning.
- Provides a lightweight adapter instead of requiring users to download a massive full model.

---

## Limitations

- The model can make reasoning mistakes.
- It should not be used for exams, assignments, or professional decisions.
- Performance depends heavily on prompt formatting.

---

## Future Work

- Train on 3,000 to 5,000 GSM8K samples.
- Add the SVAMP and ASDiv datasets.
- Improve decoding to reduce repetition.
- Experiment with multi-GPU scaling on the MI300X.
- Add a Streamlit demo for interactive use.

---

## License

cc-by-nc-4.0
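## Appendix: Usage Sketch

The loading and prompting steps described under "How to Use This Model" can be sketched as follows. The adapter path and the exact prompt wording are illustrative assumptions — the shipped `chat_template.jinja` defines the canonical format, and generation requires downloading the base model.

```python
def build_prompt(problem: str) -> str:
    """Assemble an instruction/problem/solution prompt.

    The section wording here is an assumption for illustration;
    the repository's chat_template.jinja is authoritative.
    """
    return (
        "Instruction: Solve the math problem step by step.\n"
        f"Problem: {problem}\n"
        "Solution:"
    )


def load_model(adapter_path: str):
    """Load the base model and attach this LoRA adapter via PEFT.

    Requires a GPU and downloads Qwen/Qwen2.5-Math-1.5B from Hugging Face;
    adapter_path should point at this repository's adapter files.
    """
    from transformers import AutoModelForCausalLM, AutoTokenizer
    from peft import PeftModel

    tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-Math-1.5B")
    base = AutoModelForCausalLM.from_pretrained(
        "Qwen/Qwen2.5-Math-1.5B", device_map="auto"
    )
    model = PeftModel.from_pretrained(base, adapter_path)
    return model, tokenizer
```

With a model loaded, tokenize `build_prompt(question)`, call `model.generate(...)`, and decode the completion after the `Solution:` marker.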