MathSmith: Towards Extremely Hard Mathematical Reasoning by Forging Synthetic Problems with a Reinforced Policy
Paper
•
2508.05592
•
Published
•
6
MathSmith: Towards Extremely Hard Mathematical Reasoning by Forging Synthetic Problems with a Reinforced Policy
This model is a fine-tuned version of Qwen3/Qwen3-8B on the MathSmith-HC shortCoT setting.
Check Details at https://github.com/Jasaxion/MathSmith
If you find this work useful, please cite:
@article{zhan2025mathsmith,
title={MathSmith: Towards Extremely Hard Mathematical Reasoning by Forging Synthetic Problems with a Reinforced Policy},
author={Zhan, Shaoxiong and Lai, Yanlin and Lu, Ziyu and Lin, Dahua and Yang, Ziqing and Tan, Fei},
journal={arXiv preprint arXiv:2508.05592},
year={2025}
}