---
license: mit
base_model:
- Qwen/Qwen2.5-3B
---
This model is trained for mathematical reasoning on the GSM8K and MATH training sets using [DERL](https://arxiv.org/abs/2512.13399).