---
license: apache-2.0
language:
- en
base_model:
- Qwen/Qwen2.5-Math-1.5B
pipeline_tag: question-answering
library_name: transformers
tags:
- math
- qwen
- gsm8k
- lora
---
# OpenMath
Fine-tuning a Small Language Model (SLM) for Step-by-Step Math Reasoning

## Overview
**OpenMath** is an open-source project focused on fine-tuning a **small language model (SLM)** to solve **math word problems** with clear, step-by-step reasoning.
The project uses **LoRA/QLoRA fine-tuning** on popular math reasoning datasets and provides a benchmarking pipeline to compare performance against other open-source SLMs/LLMs.

This project is designed to be reproducible on a **free Colab (T4) GPU**.

---

## What’s Included
- QLoRA fine-tuning code (4-bit)
- GSM8K subset training (example: 1k samples)
- GSM8K evaluation script (accuracy)
- Saved LoRA adapter weights

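A 4-bit QLoRA setup of this kind can be sketched with `transformers` and `peft`. Only the LoRA rank (16) and the base model are stated in this README; the alpha, dropout, and target modules below are illustrative assumptions, not the project's actual values:

```python
# Sketch of a 4-bit QLoRA configuration. Rank 16 and the base model come from
# this README; lora_alpha, lora_dropout, and target_modules are assumed.
from transformers import BitsAndBytesConfig
from peft import LoraConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                  # quantize base weights to 4-bit (QLoRA)
    bnb_4bit_quant_type="nf4",          # NormalFloat4, standard for QLoRA
    bnb_4bit_compute_dtype="bfloat16",  # compute in bf16 for stability
)

lora_config = LoraConfig(
    r=16,                               # LoRA rank used in this project
    lora_alpha=32,                      # assumed scaling factor
    lora_dropout=0.05,                  # assumed
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumed
    task_type="CAUSAL_LM",
)
```

Pass `quantization_config=bnb_config` when loading the base model, then wrap it with the LoRA config before training.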
---

## Base Model
- **Qwen2.5-Math-1.5B**

---

## Dataset
- **GSM8K** (Grade School Math 8K)
- Training used: **1000 samples**
- Evaluation: GSM8K test split

---

## Results
### Training Setup (Current)
- Samples: 1000
- Epochs: 6
- Max length: 1024
- LoRA rank: 16
- Loss masking: trained mainly on the **solution portion** to improve reasoning

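The loss-masking step can be sketched as follows: label positions belonging to the prompt are set to `-100`, the index PyTorch's cross-entropy loss ignores, so gradients come only from the solution tokens. The helper name and the `prompt_len` split are illustrative:

```python
# Minimal sketch of loss masking: prompt positions get the label -100, which
# PyTorch's CrossEntropyLoss ignores, so the loss is computed only on the
# solution tokens.
IGNORE_INDEX = -100

def mask_prompt_labels(input_ids, prompt_len):
    """Copy input_ids into labels, masking the first prompt_len positions."""
    labels = list(input_ids)
    for i in range(min(prompt_len, len(labels))):
        labels[i] = IGNORE_INDEX
    return labels

# Example: 4 prompt tokens followed by 3 solution tokens.
print(mask_prompt_labels([101, 202, 303, 404, 11, 22, 33], prompt_len=4))
# [-100, -100, -100, -100, 11, 22, 33]
```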

### Accuracy
- **GSM8K Accuracy (100-sample test subset): 41%**

> Note: The 41% score was measured on a **100-question subset** of the GSM8K test set for faster evaluation on Colab.

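GSM8K accuracy is typically computed by comparing final numeric answers: GSM8K references put the answer after a `####` marker, and the model's last number is matched against it. A minimal sketch (the regex and helper name are my own, not this project's evaluation script):

```python
import re

def extract_final_answer(text):
    """Return the final number in a completion. GSM8K reference answers put
    it after '####'; commas are stripped ('1,234' -> '1234')."""
    if "####" in text:
        text = text.split("####")[-1]
    numbers = re.findall(r"-?\d[\d,]*(?:\.\d+)?", text)
    return numbers[-1].replace(",", "") if numbers else None

gold = "Janet sells 16 - 3 - 4 = 9 duck eggs a day. #### 18"
pred = "Step by step, she earns 9 * 2 = 18 dollars, so the answer is 18."
print(extract_final_answer(pred) == extract_final_answer(gold))  # True
```

Accuracy is then the fraction of test questions where the two extracted answers match.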

---

## GSM8K Leaderboard (Baseline)

| Model | Params | GSM8K Accuracy (%) |
|-------|--------|--------------------|
| LLaMA 2 | 13B | 28.7 |
| Gemma 2 (PT) | 2B | 23.9 |
| Mistral (Base) | 7B | 36.5 |
| ERNIE 4.5 | 21B | 25.2 |
| Baichuan (Base) | 13B | 26.6 |
| Gemma | 7B | 46.4 |
| Zephyr-7b-gemma-v0.1 | 7B | 45.56 |
| LLaMA 3.2 Instruct (CoT) | 1B | 39.04 |
| Gemma 3 IT | 1B | 42.15 |
| Qwen 3 (Instruct mode) | 1.7B | 33.66 |
| **OpenMath (Qwen2.5-Math-1.5B + LoRA)** | **1.5B** | **41.0** |

<img width="1090" height="590" alt="GSM8K accuracy comparison chart" src="https://github.com/user-attachments/assets/662ea336-8946-4542-b2f2-eb78712d5a2d" />

---

## Repository Files
### LoRA Adapter Folder
This project provides the fine-tuned adapter weights:

- `adapter_model.safetensors` → LoRA weights
- `adapter_config.json` → LoRA configuration
- Tokenizer + template files for correct formatting

> Note: This is **not a full model**.
> You must load the **base model** and then attach the adapter.

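Loading the base model and attaching the adapter can be sketched with `peft`. The adapter repo id below is a placeholder, not this project's actual Hub path, and the prompt is an invented example:

```python
# Sketch: load the base model, then attach the LoRA adapter with peft.
# "your-username/OpenMath" is a placeholder; use this repo's actual Hub id.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2.5-Math-1.5B", device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-Math-1.5B")
model = PeftModel.from_pretrained(base, "your-username/OpenMath")

prompt = "Q: A notebook costs 3 dollars. How much do 5 notebooks cost? A:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```

Generation is run through the `PeftModel` wrapper so the adapter weights are applied on top of the frozen base model.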
---

## Disclaimer
OpenMath is an educational/research project.
The fine-tuned model may produce incorrect, incomplete, or misleading answers.
Always verify solutions independently before using them for exams, assignments, or real-world decisions.

This project does **not** guarantee correctness and should not be used as a substitute for professional advice.

---

## Contributing
Contributions are welcome! 🎉

If you’d like to contribute:
1. Fork the repository
2. Create a new branch (`feature/your-feature-name`)
3. Commit your changes
4. Open a Pull Request

### Contribution Ideas
- Run the full GSM8K test evaluation (1319 samples) and report results
- Train on larger GSM8K subsets (3k–5k samples)
- Add SVAMP / ASDiv datasets for better generalization
- Improve decoding to reduce repetition
- Add a Streamlit demo for interactive testing
- Benchmark against more open-source SLMs/LLMs
- Improve evaluation scripts and metrics

---

## Note
OpenMath is a **fun and practical side project** built to explore **efficient fine-tuning (QLoRA)** and **math reasoning evaluation** on limited compute.
The goal is to learn, experiment, and share reproducible results, while keeping the code clean and open for community improvements.

---

## License
This project is licensed under the **Apache License 2.0**.
See the [LICENSE](LICENSE) file for details.