---
license: apache-2.0
language:
- en
base_model:
- Qwen/Qwen2.5-Math-1.5B
pipeline_tag: text-generation
library_name: transformers
tags:
- math
- qwen
- gsm8k
- lora
---
|
|
# OpenMath |
|
|
Fine-tuning a Small Language Model (SLM) for Step-by-Step Math Reasoning |
|
|
|
|
|
## Overview |
|
|
**OpenMath** is an open-source project focused on fine-tuning a **small language model (SLM)** to solve **math word problems** with clear, step-by-step reasoning. |
|
|
The project uses **LoRA/QLoRA fine-tuning** on popular math reasoning datasets and provides a benchmarking pipeline to compare performance against other open-source SLMs/LLMs. |
|
|
|
|
|
This project is designed to be reproducible on a **free Colab (T4) GPU**.
|
|
|
|
|
--- |
|
|
|
|
|
## What’s Included |
|
|
- QLoRA fine-tuning code (4-bit); see the config sketch after this list
|
|
- GSM8K subset training (example: 1k samples) |
|
|
- GSM8K evaluation script (accuracy) |
|
|
- Saved LoRA adapter weights |
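
For reference, a minimal QLoRA setup in the spirit of this project might look like the sketch below. Note the assumptions: `lora_alpha`, `lora_dropout`, and `target_modules` are illustrative defaults, not necessarily those in the actual training script; the LoRA rank of 16 comes from the Results section.

```python
# QLoRA sketch: load the base model in 4-bit and attach LoRA adapters.
# Hyperparameters other than r=16 are illustrative assumptions.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

BASE_ID = "Qwen/Qwen2.5-Math-1.5B"

# 4-bit NF4 quantization keeps the 1.5B model within a Colab T4's memory.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)
model = AutoModelForCausalLM.from_pretrained(
    BASE_ID, quantization_config=bnb_config, device_map="auto"
)
model = prepare_model_for_kbit_training(model)

# LoRA rank 16, matching the training setup reported below.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only the adapter weights are trainable
```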
|
|
|
|
|
--- |
|
|
|
|
|
## Base Model |
|
|
- **Qwen2.5-Math-1.5B** |
|
|
|
|
|
--- |
|
|
|
|
|
## Dataset |
|
|
- **GSM8K** (Grade School Math 8K) |
|
|
- Training used: **1000 samples** |
|
|
- Evaluation: GSM8K test split |
|
|
|
|
|
--- |
|
|
|
|
|
## Results |
|
|
### Training Setup (Current) |
|
|
- Samples: 1000 |
|
|
- Epochs: 6 |
|
|
- Max length: 1024 |
|
|
- LoRA rank: 16 |
|
|
- Loss masking: trained mainly on the **solution portion** (prompt tokens masked from the loss) to improve reasoning; see the sketch after this list
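
A minimal sketch of the loss masking, assuming a simple `Question:`/`Answer:` prompt template (the actual template used in training may differ). Prompt tokens get the label `-100`, which Hugging Face's cross-entropy loss ignores, so gradients come only from the solution tokens:

```python
# Solution-only loss masking: label prompt tokens with -100 so they are
# excluded from the loss (standard Hugging Face convention).
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-Math-1.5B")

def build_example(question: str, solution: str, max_length: int = 1024):
    # The prompt template here is an illustrative assumption.
    prompt = f"Question: {question}\nAnswer: "
    prompt_ids = tokenizer(prompt, add_special_tokens=False)["input_ids"]
    solution_ids = tokenizer(
        solution + tokenizer.eos_token, add_special_tokens=False
    )["input_ids"]

    input_ids = (prompt_ids + solution_ids)[:max_length]
    # Mask the prompt: -100 for prompt positions, token ids for the solution.
    labels = ([-100] * len(prompt_ids) + solution_ids)[:max_length]
    return {"input_ids": input_ids, "labels": labels}
```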
|
|
|
|
|
### Accuracy |
|
|
- **GSM8K Accuracy (100-sample test subset): 41%** |
|
|
|
|
|
> Note: The 41% score was measured on a **100-question subset** of the GSM8K test set for faster evaluation on Colab. |
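
Scoring follows the usual GSM8K protocol: extract the final number from the model's completion and compare it with the number after `####` in the reference answer. A minimal sketch (the exact extraction logic in the evaluation script may differ):

```python
# GSM8K answer matching sketch: compare the last number in the completion
# against the gold answer that follows "####" in the reference.
import re

def extract_final_number(text: str):
    matches = re.findall(r"-?\d+(?:\.\d+)?", text.replace(",", ""))
    return matches[-1] if matches else None

def is_correct(model_output: str, reference_answer: str) -> bool:
    gold = reference_answer.split("####")[-1].strip().replace(",", "")
    pred = extract_final_number(model_output)
    return pred is not None and float(pred) == float(gold)
```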
|
|
|
|
|
--- |
|
|
|
|
|
## GSM8K Leaderboard (Baselines)
|
|
|
|
|
| Model | Params | GSM8K Accuracy (%) | |
|
|
|------|--------|---------------------| |
|
|
| LLaMA 2 | 13B | 28.7 | |
|
|
| Gemma 2 (PT) | 2B | 23.9 | |
|
|
| Mistral (Base) | 7B | 36.5 | |
|
|
| ERNIE 4.5 | 21B | 25.2 | |
|
|
| Baichuan (Base) | 13B | 26.6 | |
|
|
| Gemma | 7B | 46.4 | |
|
|
| Zephyr-7b-gemma-v0.1 | 7B | 45.56 | |
|
|
| LLaMA 3.2 Instruct (CoT) | 1B | 39.04 | |
|
|
| Gemma 3 IT | 1B | 42.15 | |
|
|
| Qwen 3 (Instruct mode) | 1.7B | 33.66 | |
|
|
| **OpenMath (Qwen2.5-Math-1.5B + LoRA)** | **1.5B** | **41.0** | |
|
|
|
|
<img width="1090" height="590" alt="GSM8K accuracy comparison chart" src="https://github.com/user-attachments/assets/662ea336-8946-4542-b2f2-eb78712d5a2d" />
|
|
|
--- |
|
|
|
|
|
## Repository Files |
|
|
### LoRA Adapter Folder |
|
|
This project provides the fine-tuned adapter weights: |
|
|
|
|
|
- `adapter_model.safetensors` → LoRA weights |
|
|
- `adapter_config.json` → LoRA configuration |
|
|
|
|
|
> Note: This is **not a full model**. |
|
|
> You must load the **base model** and then attach the adapter. |
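
A minimal loading and inference sketch with `transformers` and `peft` (`ADAPTER_PATH` is a placeholder: point it at this repository's adapter folder or a local copy; the prompt template is an illustrative assumption):

```python
# Load the base model, attach the LoRA adapter, and generate a solution.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

BASE_ID = "Qwen/Qwen2.5-Math-1.5B"
ADAPTER_PATH = "path/to/adapter_folder"  # placeholder, not a real repo id

tokenizer = AutoTokenizer.from_pretrained(BASE_ID)
model = AutoModelForCausalLM.from_pretrained(BASE_ID, device_map="auto")
model = PeftModel.from_pretrained(model, ADAPTER_PATH)  # attach LoRA weights

question = (
    "Natalia sold clips to 48 of her friends in April, and then she sold "
    "half as many clips in May. How many clips did Natalia sell altogether?"
)
prompt = f"Question: {question}\nAnswer: "  # illustrative template
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```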
|
|
|
|
|
--- |
|
|
|
|
|
## Disclaimer |
|
|
OpenMath is an educational/research project. |
|
|
|
|
|
The fine-tuned model may produce incorrect, incomplete, or misleading answers. |
|
|
|
|
|
Always verify solutions independently before using them for exams, assignments, or real-world decisions. |
|
|
|
|
|
This project does **not** guarantee correctness and should not be used as a substitute for professional advice. |
|
|
|
|
|
--- |
|
|
|
|
|
## Contributing |
|
|
Contributions are welcome! 🎉 |
|
|
|
|
|
If you’d like to contribute: |
|
|
1. Fork the repository |
|
|
2. Create a new branch (`feature/your-feature-name`) |
|
|
3. Commit your changes |
|
|
4. Open a Pull Request |
|
|
|
|
|
### Contribution Ideas |
|
|
- Run full GSM8K test evaluation (1319 samples) and report results |
|
|
- Train on larger GSM8K subsets (3k–5k samples) |
|
|
- Add SVAMP / ASDiv datasets for better generalization |
|
|
- Improve decoding to reduce repetition |
|
|
- Add a Streamlit demo for interactive testing |
|
|
- Benchmark against more open-source SLMs/LLMs |
|
|
- Improve evaluation scripts and metrics |
|
|
|
|
|
--- |
|
|
|
|
|
## Note |
|
|
OpenMath is a **fun and practical side project** built to explore **efficient fine-tuning (QLoRA)** and **math reasoning evaluation** on limited compute. |
|
|
|
|
|
The goal is to learn, experiment, and share reproducible results — while keeping the code clean and open for community improvements. |
|
|
|
|
|
--- |
|
|
|
|
|
## License |
|
|
This project is licensed under the **Apache License 2.0**. |