---
library_name: transformers
license: apache-2.0
datasets:
- Ashed00/combined_math_problems
- openai/gsm8k
- deepmind/aqua_rat
base_model:
- HuggingFaceTB/SmolLM2-135M
---
# SmolMath-135M
SmolMath is a fully fine-tuned version of the 135M-parameter SmolLM2 model, trained to maximize math accuracy with the least possible drop on other text benchmarks.
**Important**: All training code is available on [GitHub](https://github.com/Ashu-00/SmolMath/).
**Important**: Please refer to the [blog post](https://hackmd.io/@ashu-00/SmolMath) for the methodology and training details.
## Usage
```python
from transformers import pipeline

model_path = "Ashed00/SmolMath-135M"  # Hugging Face Hub model ID (or a local path to the fine-tuned model)
pipe = pipeline("text-generation", model=model_path)

question = "What is 2+2?"
prompt = "Question: " + question + "\nAnswer:"

output = pipe(
    prompt,
    max_length=100,
    do_sample=False,  # disable sampling for greedy decoding
)[0]["generated_text"]
print(output)
```
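Since the model continues the `Question: ... \nAnswer:` template, the generated text contains the prompt followed by the answer. A minimal sketch for keeping only the answer span (this post-processing is an assumption for illustration, not part of the released code):
```python
# Keep only the text generated after the "Answer:" marker
# (assumed post-processing step, not part of the released pipeline).
answer = output.split("Answer:", 1)[-1].strip()
print(answer)
```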
## Evaluation and Performance
### Comparison with the Base Model
| **Metrics** | **SmolLM2-135M-8k** | **SmolMath-135M** | **Δ (Change)** |
|-------------------|---------------------|--------------------|----------------|
| HellaSwag | 42.1 | 41.15 | −0.95 |
| PIQA | 68.4 | 63.55 | −4.85 |
| CommonsenseQA | 33.9 | 33.42 | −0.48 |
| TriviaQA | 4.1 | 0.0 | −4.10 |
| Winogrande | 51.3 | 51.78 | +0.48 |
| OpenBookQA | 34.6 | 30.80 | −3.80 |
| GSM8K (0-shot)* | 0.0 | 6.9 | +6.90 |
*Evaluated with the lighteval script favoured by the SmolLM2 authors in their evaluations; its prompt format differs from the SmolMath prompt structure.
### Math Benchmarks
| Model | AddSub* (%) | MAWPS** (%) | GSM8K* (%) |
|-------------------------------------|-------------|-------------|------------|
| apple/OpenELM-270M-Instruct | 2.14 | 2.83 | 2.05 |
| HuggingFaceTB/SmolLM2-135M-Instruct | 1.52 | 4.04 | 0.45 |
| SmolMath without GRPO (ours) | 9.64 | 7.47 | 6.22 |
| SmolMath (ours) | **12.05** | **8.31** | **7.51** |
*Evaluated on the test split only; this split was not included in training.
**Evaluated on the complete dataset; this dataset was not included in training.
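For reference, a minimal sketch of how exact-match scoring on these benchmarks could be computed, assuming each example is a `{"question": ..., "answer": ...}` pair and taking the last number in the generation as the prediction. This is an illustrative reimplementation under those assumptions, not the actual evaluation script (see the GitHub repository for that):
```python
import re
from transformers import pipeline

pipe = pipeline("text-generation", model="Ashed00/SmolMath-135M")

def last_number(text: str):
    """Return the last number found in a string, or None if there is none."""
    matches = re.findall(r"-?\d+(?:\.\d+)?", text.replace(",", ""))
    return float(matches[-1]) if matches else None

def accuracy(examples):
    """examples: list of {'question': str, 'answer': str} dicts (assumed schema)."""
    correct = 0
    for ex in examples:
        prompt = "Question: " + ex["question"] + "\nAnswer:"
        generation = pipe(prompt, max_new_tokens=128, do_sample=False)[0]["generated_text"]
        pred = last_number(generation[len(prompt):])  # score only the continuation
        gold = last_number(ex["answer"])
        correct += int(pred is not None and gold is not None and abs(pred - gold) < 1e-4)
    return correct / len(examples)
```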
## Citation
If you use this model in your work, you can cite us:
```bibtex
@misc{SmolMath,
title = {Building SmolMath: A Math Reasoning SLM Under 150M Parameters},
url = {https://hackmd.io/@ashu-00/SmolMath},
author = {ashu-00},
month = {July},
year = {2025}
}
```