---
license: apache-2.0
language:
- en
base_model:
- Qwen/Qwen2.5-Math-1.5B
pipeline_tag: text-generation
library_name: transformers
tags:
- math
- qwen
- gsm8k
- lora
---
# OpenMath
Fine-tuning a Small Language Model (SLM) for Step-by-Step Math Reasoning
## Overview
**OpenMath** is an open-source project focused on fine-tuning a **small language model (SLM)** to solve **math word problems** with clear, step-by-step reasoning.
The project uses **LoRA/QLoRA fine-tuning** on popular math reasoning datasets and provides a benchmarking pipeline to compare performance against other open-source SLMs/LLMs.
This project is designed to be reproducible on a **free Colab (T4) GPU**.
---
## What’s Included
- QLoRA fine-tuning code (4-bit); a minimal setup sketch follows this list
- GSM8K subset training (example: 1k samples)
- GSM8K evaluation script (accuracy)
- Saved LoRA adapter weights
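A minimal sketch of the 4-bit QLoRA setup, assuming the `transformers`, `peft`, and `bitsandbytes` libraries. The rank `r=16` matches the training setup below; `lora_alpha`, `target_modules`, and dropout are illustrative assumptions, not the project's exact values.

```python
# Minimal QLoRA setup sketch (values other than r=16 are assumptions).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                     # 4-bit base weights (QLoRA)
    bnb_4bit_quant_type="nf4",             # NormalFloat4 quantization
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.float16,  # T4-friendly compute dtype
)

model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2.5-Math-1.5B",
    quantization_config=bnb_config,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-Math-1.5B")

lora_config = LoraConfig(
    r=16,                                  # rank used in this project
    lora_alpha=32,                         # assumption
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumption
    lora_dropout=0.05,                     # assumption
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()         # only adapter params are trainable
```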
---
## Base Model
- **Qwen2.5-Math-1.5B**
---
## Dataset
- **GSM8K** (Grade School Math 8K)
- Training subset: **1,000 samples**
- Evaluation: GSM8K test split (loading sketch below)
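The subset can be loaded with the `datasets` library. This is a sketch: the shuffling seed used to draw the 1,000-sample subset is an assumption.

```python
# Sketch: load GSM8K and carve out a 1k training subset (seed is an assumption).
from datasets import load_dataset

gsm8k = load_dataset("openai/gsm8k", "main")
train_subset = gsm8k["train"].shuffle(seed=42).select(range(1000))
test_split = gsm8k["test"]  # 1,319 questions

example = train_subset[0]
print(example["question"])
print(example["answer"])  # worked solution ending in "#### <final answer>"
```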
---
## Results
### Training Setup (Current)
- Samples: 1000
- Epochs: 6
- Max length: 1024
- LoRA rank: 16
- Loss masking: prompt tokens are masked out of the loss so training focuses on the **solution portion**, improving reasoning (see the sketch below)
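The loss masking above can be implemented by setting prompt labels to `-100`, which the `transformers` cross-entropy loss ignores. This is a sketch of the idea; the prompt template and function name are illustrative, not the project's exact code.

```python
# Sketch of solution-only loss masking; the prompt format is illustrative.
def build_example(tokenizer, question, solution, max_length=1024):
    prompt = f"Question: {question}\nAnswer: "
    prompt_ids = tokenizer(prompt, add_special_tokens=False)["input_ids"]
    solution_ids = tokenizer(solution + tokenizer.eos_token,
                             add_special_tokens=False)["input_ids"]

    input_ids = (prompt_ids + solution_ids)[:max_length]
    # -100 labels are ignored by the loss, so gradients come from the solution
    labels = ([-100] * len(prompt_ids) + solution_ids)[:max_length]
    return {"input_ids": input_ids, "labels": labels}
```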
### Accuracy
- **GSM8K Accuracy (100-sample test subset): 41%**
> Note: The 41% score was measured on a **100-question subset** of the GSM8K test set for faster evaluation on Colab.
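GSM8K scoring typically extracts the value after `####` in the gold answer and compares it with the last number in the model's generation. The repository's evaluation script may differ in detail, so treat this as a sketch.

```python
# Sketch of GSM8K accuracy scoring via final-answer extraction.
import re

def extract_gold(answer):
    # gold solutions end with "#### <final answer>"
    return answer.split("####")[-1].strip().replace(",", "")

def extract_pred(generation):
    nums = re.findall(r"-?\d+(?:\.\d+)?", generation.replace(",", ""))
    return nums[-1] if nums else None

def gsm8k_accuracy(generations, gold_answers):
    correct = 0
    for gen, gold in zip(generations, gold_answers):
        pred = extract_pred(gen)
        try:
            correct += pred is not None and float(pred) == float(extract_gold(gold))
        except ValueError:
            pass  # non-numeric gold or prediction
    return correct / len(gold_answers)
```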
---
## GSM8K Leaderboard (Baselines)
| Model | Params | GSM8K Accuracy (%) |
|------|--------|---------------------|
| LLaMA 2 | 13B | 28.7 |
| Gemma 2 (PT) | 2B | 23.9 |
| Mistral (Base) | 7B | 36.5 |
| ERNIE 4.5 | 21B | 25.2 |
| Baichuan (Base) | 13B | 26.6 |
| Gemma | 7B | 46.4 |
| Zephyr-7b-gemma-v0.1 | 7B | 45.56 |
| LLaMA 3.2 Instruct (CoT) | 1B | 39.04 |
| Gemma 3 IT | 1B | 42.15 |
| Qwen 3 (Instruct mode) | 1.7B | 33.66 |
| **OpenMath (Qwen2.5-Math-1.5B + LoRA)** | **1.5B** | **41.0** |
<img width="1090" height="590" alt="image" src="https://github.com/user-attachments/assets/662ea336-8946-4542-b2f2-eb78712d5a2d" />
---
## Repository Files
### LoRA Adapter Folder
This project provides the fine-tuned adapter weights:
- `adapter_model.safetensors` → LoRA weights
- `adapter_config.json` → LoRA configuration
> Note: This is **not a full model**.
> You must load the **base model** and then attach the adapter.
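A sketch of that two-step load using PEFT; `<adapter-repo-or-path>` is a placeholder for this repository's Hub id or a local folder containing the adapter files.

```python
# Sketch: load the base model, then attach the LoRA adapter with PEFT.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2.5-Math-1.5B",
    torch_dtype=torch.float16,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-Math-1.5B")
model = PeftModel.from_pretrained(base, "<adapter-repo-or-path>")

prompt = "Question: A book costs $12 and a pen costs $3. How much do 2 books and 4 pens cost?\nAnswer: "
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```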
---
## Disclaimer
OpenMath is an educational/research project.
The fine-tuned model may produce incorrect, incomplete, or misleading answers.
Always verify solutions independently before using them for exams, assignments, or real-world decisions.
This project does **not** guarantee correctness and should not be used as a substitute for professional advice.
---
## Contributing
Contributions are welcome! 🎉
If you’d like to contribute:
1. Fork the repository
2. Create a new branch (`feature/your-feature-name`)
3. Commit your changes
4. Open a Pull Request
### Contribution Ideas
- Run full GSM8K test evaluation (1319 samples) and report results
- Train on larger GSM8K subsets (3k–5k samples)
- Add SVAMP / ASDiv datasets for better generalization
- Improve decoding to reduce repetition
- Add a Streamlit demo for interactive testing
- Benchmark against more open-source SLMs/LLMs
- Improve evaluation scripts and metrics
---
## Note
OpenMath is a **fun and practical side project** built to explore **efficient fine-tuning (QLoRA)** and **math reasoning evaluation** on limited compute.
The goal is to learn, experiment, and share reproducible results while keeping the code clean and open for community improvements.
---
## License
This project is licensed under the **Apache License 2.0**. |