---
license: cc-by-nc-4.0
datasets:
- openai/gsm8k
language:
- en
base_model:
- Qwen/Qwen2.5-Math-1.5B
pipeline_tag: text-generation
library_name: transformers
tags:
- math
- qwen
- lora
- mathematics
- gsm8k
---
|
|
|
|
|
# OpenMath |
|
|
Fine-tuning a Small Language Model (SLM) for Step-by-Step Math Reasoning |
|
|
|
|
|
## Overview |
|
|
OpenMath is an open-source project focused on fine-tuning a small language model for step-by-step math reasoning using QLoRA (LoRA on a 4-bit quantized base model).
|
|
|
|
|
This repository contains only a LoRA adapter trained on GSM8K. Users must load the base model separately and attach the adapter. |
|
|
|
|
|
The latest version of this model was trained on an AMD MI300X GPU using ROCm, showing that modern non-NVIDIA accelerators can handle this kind of fine-tuning with the standard Hugging Face and PyTorch stack.
|
|
|
|
|
--- |
|
|
|
|
|
## Base Model |
|
|
[`Qwen/Qwen2.5-Math-1.5B`](https://huggingface.co/Qwen/Qwen2.5-Math-1.5B)
|
|
|
|
|
This repository does not contain the base model weights; download them from the Hugging Face Hub.
|
|
|
|
|
--- |
|
|
|
|
|
## Hardware Used (Latest Training Run) |
|
|
|
|
|
- **GPU:** AMD MI300X (ROCm 7.0)
- **VRAM:** 192 GB
- **Operating system:** Ubuntu 24.04
- **Framework:** PyTorch + Hugging Face
- **Backend:** ROCm
|
|
|
|
|
--- |
|
|
|
|
|
## Dataset |
|
|
|
|
|
- **Dataset:** GSM8K (Grade School Math 8K)
- **Training samples:** 1,000
- **Evaluation:** full GSM8K test split (1,319 problems)
|
|
|
|
|
The loss was computed only on the solution portion of each example; the prompt tokens were masked out of the loss.
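As an illustration, here is a minimal sketch of this kind of label masking. The function and field names are assumptions for illustration, not the exact training code:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-Math-1.5B")

def build_example(prompt: str, solution: str, max_len: int = 1024):
    # Tokenize prompt and solution separately so the boundary is known.
    prompt_ids = tokenizer(prompt, add_special_tokens=False)["input_ids"]
    solution_ids = tokenizer(solution + tokenizer.eos_token,
                             add_special_tokens=False)["input_ids"]

    input_ids = (prompt_ids + solution_ids)[:max_len]
    # Labels of -100 are ignored by the cross-entropy loss, so only the
    # solution tokens contribute to the training signal.
    labels = ([-100] * len(prompt_ids) + solution_ids)[:max_len]
    return {"input_ids": input_ids, "labels": labels}
```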
|
|
|
|
|
--- |
|
|
|
|
|
## Training Configuration |
|
|
|
|
|
- **Method:** QLoRA (4-bit)
- **Quantization:** NF4 with float16 compute
- **LoRA rank:** 16
- **LoRA alpha:** 32
- **LoRA dropout:** 0.05
- **Target modules:** `q_proj`, `k_proj`, `v_proj`, `o_proj`
- **Max sequence length:** 1024
- **Batch size:** 1
- **Gradient accumulation:** 16
- **Effective batch size:** 16
- **Learning rate:** 1e-4
- **Optimizer:** `paged_adamw_8bit`
- **Scheduler:** cosine
- **Warmup:** 5%
- **Epochs:** 6
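For reference, this configuration maps roughly to the following sketch with `transformers`, `peft`, and `bitsandbytes`. The hyperparameters are taken from the list above; everything else, including `output_dir`, is an illustrative assumption:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig, TrainingArguments
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

# 4-bit NF4 quantization with float16 compute (QLoRA).
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2.5-Math-1.5B",
    quantization_config=bnb_config,
    device_map="auto",
)
model = prepare_model_for_kbit_training(model)

# LoRA on the attention projections only.
model = get_peft_model(model, LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
))

training_args = TrainingArguments(
    output_dir="openmath-lora",      # assumed output path
    per_device_train_batch_size=1,
    gradient_accumulation_steps=16,  # effective batch size 16
    learning_rate=1e-4,
    optim="paged_adamw_8bit",
    lr_scheduler_type="cosine",
    warmup_ratio=0.05,
    num_train_epochs=6,
)
```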
|
|
|
|
|
--- |
|
|
|
|
|
## Results |
|
|
|
|
|
**GSM8K accuracy (full test set):** 750 / 1,319 correct (**56.86%**)
|
|
|
|
|
This is a significant improvement over the earlier Colab T4 run and a strong result for a 1.5B-parameter model fine-tuned with LoRA.
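GSM8K reference solutions end with a final line of the form `#### <answer>`, and accuracy numbers like the one above are typically computed by exact match on the extracted final answer. A sketch of such an extraction follows; it is illustrative, not this project's actual evaluation script:

```python
import re

def extract_final_answer(text: str) -> str | None:
    # GSM8K references end with "#### <number>"; fall back to the last
    # number in the text for free-form model outputs.
    match = re.search(r"####\s*([-\d,\.]+)", text)
    if match:
        return match.group(1).replace(",", "").rstrip(".")
    numbers = re.findall(r"-?\d[\d,]*(?:\.\d+)?", text)
    return numbers[-1].replace(",", "") if numbers else None
```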
|
|
|
|
|
--- |
|
|
|
|
|
## What This Repository Contains |
|
|
|
|
|
- `adapter_model.safetensors` — LoRA adapter weights
- `adapter_config.json` — LoRA configuration
- `chat_template.jinja` — chat formatting template
- `tokenizer.json` — tokenizer file
- `tokenizer_config.json` — tokenizer settings
- `README.md` — this documentation
|
|
|
|
|
This repository does not include checkpoints, optimizer states, or full base model weights. |
|
|
|
|
|
--- |
|
|
|
|
|
## How to Use This Model |
|
|
|
|
|
Load the base model [`Qwen/Qwen2.5-Math-1.5B`](https://huggingface.co/Qwen/Qwen2.5-Math-1.5B) from the Hugging Face Hub, then attach this LoRA adapter with PEFT. Prompts should follow the training format: an instruction, the problem, and a solution section.
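A minimal sketch is shown below. The adapter repo id is a placeholder for this repository's id, and the exact prompt wording is an assumption based on the training format described above:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

ADAPTER_REPO = "<this-repo-id>"  # placeholder: replace with this repository's id

base = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2.5-Math-1.5B", torch_dtype=torch.float16, device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(ADAPTER_REPO)
model = PeftModel.from_pretrained(base, ADAPTER_REPO)

# Prompt with instruction, problem, and solution sections (wording assumed).
prompt = (
    "Instruction: Solve the following math problem step by step.\n\n"
    "Problem: Natalia sold clips to 48 of her friends in April, and then she "
    "sold half as many clips in May. How many clips did Natalia sell "
    "altogether in April and May?\n\n"
    "Solution:"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(output[0][inputs["input_ids"].shape[1]:],
                       skip_special_tokens=True))
```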
|
|
|
|
|
--- |
|
|
|
|
|
## Why This Matters |
|
|
|
|
|
- It demonstrates that the AMD MI300X can train modern language models with Hugging Face and QLoRA.
- It shows that high-quality math reasoning is possible at 1.5B parameters using efficient fine-tuning.
- It provides a lightweight adapter instead of requiring users to download full fine-tuned model weights.
|
|
|
|
|
--- |
|
|
|
|
|
## Limitations |
|
|
|
|
|
- The model can make reasoning mistakes.
- It should not be used for exams, assignments, or professional decisions.
- Performance depends heavily on prompt formatting.
|
|
|
|
|
--- |
|
|
|
|
|
## Future Work |
|
|
|
|
|
- Train on 3,000 to 5,000 GSM8K samples.
- Add the SVAMP and ASDiv datasets.
- Improve decoding to reduce repetition.
- Experiment with multi-GPU scaling on MI300X.
- Add a Streamlit demo for interactive use.
|
|
|
|
|
--- |
|
|
|
|
|
## License |
|
|
|
|
|
This model is released under [CC BY-NC 4.0](https://creativecommons.org/licenses/by-nc/4.0/) (non-commercial use only).