---
license: apache-2.0
---

<p align="center">
  <img src="https://cdn-uploads.huggingface.co/production/uploads/66a056d0229269a861ac1245/9xXXms4ub9dzbcsN1IGqq.png" alt="ReasonLite" width="200">
</p>

<p align="center">
  <a href="https://github.com/AMD-AIG-AIMA/ReasonLite"><b>GitHub</b></a> |
  <a href="https://huggingface.co/datasets/amd/ReasonLite-Dataset"><b>Dataset</b></a> |
  <b>Blog</b>
</p>

**ReasonLite** is an ultra-lightweight math reasoning model. With only 0.6B parameters, it leverages high-quality data distillation to achieve performance comparable to models over 10× its size, such as Qwen3-8B, **reaching 75.2 on AIME24 and extending the scaling law of small models.**

* 🔥 **Best-performing 0.6B reasoning model**
* 🌐 Fully open-source – weights, scripts, datasets, synthesis pipeline
* ⚙️ Distilled in two stages for both **efficiency** and **high performance**

<p align="center">
  <img src="https://cdn-uploads.huggingface.co/production/uploads/66a056d0229269a861ac1245/2VZPy7mlgpq9vFvwDc00Q.png" alt="ReasonLite" height="500">
</p>

---

# 🚀 Model

The model is trained in **two progressive distillation stages**.
First, short-CoT data is used to distill **Qwen3-0.6B** into **ReasonLite-0.6B-Turbo**, improving **AIME24 accuracy from 11.0 → 57.1**.
Then, long-CoT data is used to obtain **ReasonLite-0.6B**, further boosting accuracy to **75.2**.

| Model | Description | AIME24 | Link |
| ----- | ----------- | ------ | ---- |
| **amd/ReasonLite-0.6B-Turbo** | Short CoT balancing performance and efficiency | 57.1 | [🤗 HuggingFace](https://huggingface.co/amd/ReasonLite-0.6B-Turbo) |
| **amd/ReasonLite-0.6B** | Long CoT for high performance | 75.2 | [🤗 HuggingFace](https://huggingface.co/amd/ReasonLite-0.6B) |

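Both checkpoints load like any causal LM on the Hub. Below is a minimal inference sketch, assuming the model inherits a standard chat template from its Qwen3-0.6B base; the prompt and sampling settings (temperature 0.6, top-p 0.95, 4096-token budget) are illustrative assumptions, not an official recommendation.

```python
# Minimal inference sketch with transformers (assumptions noted above).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "amd/ReasonLite-0.6B"  # or "amd/ReasonLite-0.6B-Turbo" for shorter traces
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto", device_map="auto")

messages = [{"role": "user", "content": "What is the sum of the first 100 positive integers?"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# Long-CoT models think at length before answering, so leave a generous token budget.
output = model.generate(input_ids, max_new_tokens=4096, do_sample=True, temperature=0.6, top_p=0.95)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```
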
---

# 📊 Evaluation Results

**Metrics**

* **avg@16** – average accuracy over 16 sampled answers per problem
* **pass@8** – probability that at least one correct answer appears among 8 samples (one standard estimator is sketched below the results table)

| Model | Parameters | AMC23 avg@16 | AMC23 pass@8 | AIME25 avg@16 | AIME25 pass@8 | AIME24 avg@16 | AIME24 pass@8 |
| ----- | ---------- | ------------ | ------------ | ------------- | ------------- | ------------- | ------------- |
| Qwen2.5-14B | 14B | 58.3 | 82.3 | 12.3 | 32.3 | 12.7 | 32.4 |
| Deepseek-qwen-14B | 14B | 93.9 | 98.7 | 50.2 | 71.0 | 65.0 | 83.0 |
| Qwen3-0.6B | 0.6B | 52.7 | 85.0 | 16.0 | 33.0 | 11.0 | 31.5 |
| Qwen3-1.7B | 1.7B | 83.4 | 96.3 | 36.0 | 55.1 | 47.3 | 73.9 |
| Qwen3-4B | 4B | 96.1 | 100 | 63.5 | 85.4 | 72.7 | 85.1 |
| Qwen3-8B | 8B | 94.8 | 100 | 68.3 | 84.2 | 74.6 | 85.0 |
| Qwen3-14B | 14B | 98.6 | 98.7 | 71.5 | 84.1 | 78.3 | 88.4 |
| DeepscaleR-1.5B | 1.5B | 83.8 | 95.0 | 29.0 | 48.9 | 40.4 | 69.0 |
| POLARIS-1.7B-Preview | 1.7B | 92.2 | 97.4 | 52.3 | 80.2 | 65.0 | 76.7 |
| OpenMath-Nemotron-1.5B | 1.5B | 88.8 | 96.7 | 39.8 | 65.8 | 61.5 | 81.3 |
| ReasonLite-0.6B-Turbo | 0.6B | 81.6 | 99.3 | 42.7 | 69.2 | 57.1 | 79.6 |
| **ReasonLite-0.6B** | **0.6B** | **95.2** | **100** | **62.9** | **84.1** | **75.2** | **90.2** |

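For concreteness, here is one standard way to compute these metrics from per-sample correctness flags. The unbiased pass@k estimator follows Chen et al. (2021); whether this exact estimator was used for the table above is an assumption.

```python
# Sketch of avg@n and pass@k from n graded samples per problem.
# `flags` is a hypothetical list of booleans (True = answer was correct).
from math import comb

def avg_at_n(flags: list[bool]) -> float:
    """avg@n: mean accuracy over all n sampled answers (n = 16 above)."""
    return sum(flags) / len(flags)

def pass_at_k(flags: list[bool], k: int) -> float:
    """Unbiased pass@k (Chen et al., 2021): chance that at least one of
    k answers drawn from the n graded samples is correct."""
    n, c = len(flags), sum(flags)
    return 1.0 - comb(n - c, k) / comb(n, k)

flags = [True, False, False, True] + [False] * 12   # 2 correct out of 16 samples
print(avg_at_n(flags))       # 0.125 -> contributes to avg@16
print(pass_at_k(flags, 8))   # ~0.77 -> contributes to pass@8
```
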
---

# 📚 Dataset

| Dataset | Description | Size | Link |
| ------- | ----------- | ---- | ---- |
| **amd/ReasonLite-Dataset** | Short CoT | 4.3M | [🤗 HuggingFace](https://huggingface.co/datasets/amd/ReasonLite-Dataset/viewer/default/medium) |
| **amd/ReasonLite-Dataset** | Long CoT | 1.8M | [🤗 HuggingFace](https://huggingface.co/datasets/amd/ReasonLite-Dataset/viewer/default/high) |

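Both subsets live in the same dataset repo. Here is a loading sketch with the `datasets` library; the `medium` (short-CoT) and `high` (long-CoT) split names are inferred from the viewer URLs above, so check the dataset card if they differ. Streaming avoids downloading millions of rows up front.

```python
# Sketch: stream the ReasonLite distillation data (split names are an
# assumption inferred from the viewer URLs, not a documented API).
from datasets import load_dataset

short_cot = load_dataset("amd/ReasonLite-Dataset", split="medium", streaming=True)
long_cot = load_dataset("amd/ReasonLite-Dataset", split="high", streaming=True)

print(next(iter(short_cot)))  # peek at one short-CoT training example
```
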
---

# 📝 Citation

```bibtex
@misc{reasonlite2025,
  title={ReasonLite: Ultra-Lightweight Math Reasoning Model},
  author={AMD AI Lab},
  year={2025},
  url={https://huggingface.co/amd/ReasonLite-0.6B}
}
```