---
license: mit
base_model:
- Qwen/Qwen2.5-3B
---

This model is trained for mathematical reasoning on the GSM8K and MATH training sets using [DERL](https://arxiv.org/abs/2512.13399).
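
A minimal usage sketch, assuming the checkpoint loads as a standard causal language model through the Hugging Face `transformers` library. The repository ID `your-namespace/your-model-id` is a hypothetical placeholder, and the prompt format in `build_prompt` is an assumption for illustration; this card does not specify either.

```python
# Hypothetical placeholder: replace with this repository's actual Hub ID.
MODEL_ID = "your-namespace/your-model-id"


def build_prompt(question: str) -> str:
    """Wrap a math question in a plain instruction.

    The format is an assumption for illustration, not taken from this card.
    """
    return f"Question: {question.strip()}\nLet's think step by step.\nAnswer:"


def generate_answer(question: str, model_id: str = MODEL_ID,
                    max_new_tokens: int = 512) -> str:
    """Greedy-decode a solution for one question."""
    # Imported lazily so build_prompt works without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)
    inputs = tokenizer(build_prompt(question), return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens,
                                do_sample=False)
    # Drop the prompt tokens so only the newly generated text is returned.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

Calling `generate_answer("What is 12 * 7?")` downloads the weights on first use. Greedy decoding is chosen here only to make outputs repeatable; sampling settings may suit the model better.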