---
license: mit
---

<h1 align="center">
On the Interplay of Pre-Training, Mid-Training, and RL on Reasoning Language Models
</h1>
<div align="center">

<a href="https://chenlong-clock.github.io">Charlie Zhang</a>, <a href="https://www.phontron.com">Graham Neubig</a>, <a href="https://xiangyue9607.github.io">Xiang Yue</a>

Carnegie Mellon University, Language Technologies Institute

</div>
<div align="center">

[arXiv](https://arxiv.org/abs/2512.07783)
[License: MIT](LICENSE)

</div>
This repository contains post-training checkpoints for the extrapolation tasks.
## 📚 Citation

If you find this work or code useful, please consider citing:
```bibtex
@misc{zhang2025interplaypretrainingmidtrainingrl,
  title={On the Interplay of Pre-Training, Mid-Training, and RL on Reasoning Language Models},
  author={Charlie Zhang and Graham Neubig and Xiang Yue},
  year={2025},
  eprint={2512.07783},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  url={https://arxiv.org/abs/2512.07783},
}
```