yashmarathe
/

MathMind

Model card Files Files and versions

MathMind / README.md

yashmarathe's picture

Create README.md

99fea5b verified about 1 month ago

|

history blame contribute delete

1.14 kB

	# MathMind

	Evaluations:

	\| Model \| AIME 2024 \| MATH 500 \| AMC 2023 \| Minerva Math \| OlympiadBench \| Avg. \|
	\|-------\|-----------\|-----------\|-----------\|--------------\|---------------\|------\|
	\| Qwen2.5-Math-7B-Instruct \| 13.3 \| 79.8 \| 50.6 \| 34.6 \| 40.7 \| 43.8 \|
	\| rStar-Math-7B \| 26.7 \| 78.4 \| 47.5 \| - \| 47.1 \| - \|
	\| Eurus-2-7B-PRIME \| 26.7 \| 79.2 \| 57.8 \| 38.6 \| 42.1 \| 48.9 \|
	\| Qwen2.5-7B-SimpleRL \| 26.7 \| 82.4 \| 62.5 \| <strong>39.7</strong> \| 43.3 \| 50.9 \|
	\| DeepSeek-R1-Distill-Qwen-1.5B \| 28.8 \| 82.8 \| 62.9 \| 26.5 \| 43.3 \| 48.9 \|
	\| Still-1.5B \| 32.5 \| 84.4 \| 66.7 \| 29.0 \| 45.4 \| 51.6 \|
	\| DeepScaleR-1.5B-Preview \| <strong>43.1</strong> \| <strong>87.8</strong> \| <strong>73.6</strong> \| 30.2 \| 50.0\| 57.0 \|
	\| <strong>MathMind</strong> \| 40.8 \| 85.2 \| 70.9 \| 30.2 \| <strong>52.0</strong> \| <strong>56.1</strong> \|
	\| O1-Preview \| 40.0 \| 81.4 \| - \| - \| - \| - \|

	## Acknowledgements

	- The training experiments are powered by [verl](https://github.com/volcengine/verl), an open-source RLHF library.
	- The model is trained on top of [`DeepSeek-R1-Distill-Qwen-1.5B`](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B).