---
license: other
library_name: transformers
tags:
- reasoning
- mid-training
- extrapolation
- synthetic-data
- transformers
---

# Interplay-LM Extrapolation Mid-Train Models

This repository contains the `op11-14` CPT checkpoints and the corresponding local RL outputs used by `scripts/composition/op-difficulty-10B/script_cpt_rl/id2-10_0.2easy_0.3medium_0.5hard_cpt11-14`.

For pretraining, only `cpt0.2-uniform_0.8-11-14_plus` is included. For RL, only the final `actor/huggingface` checkpoints that were found locally are uploaded.

## CPT Checkpoints

| Path | Checkpoint | Used by nominal step / CPT epoch |
| --- | --- | --- |
| `id2-10_0.2easy_0.3medium_0.5hard/midtrain/cpt0.2-uniform_0.8-11-14_plus/checkpoint-387` | checkpoint-387 | 50step/0.2 |
| `id2-10_0.2easy_0.3medium_0.5hard/midtrain/cpt0.2-uniform_0.8-11-14_plus/checkpoint-774` | checkpoint-774 | 100step/0.2 |
| `id2-10_0.2easy_0.3medium_0.5hard/midtrain/cpt0.2-uniform_0.8-11-14_plus/checkpoint-1548` | checkpoint-1548 | 200step/0.2 |
| `id2-10_0.2easy_0.3medium_0.5hard/midtrain/cpt0.2-uniform_0.8-11-14_plus/checkpoint-1935` | checkpoint-1935 | 100step/0.5 |
| `id2-10_0.2easy_0.3medium_0.5hard/midtrain/cpt0.2-uniform_0.8-11-14_plus/checkpoint-3096` | checkpoint-3096 | 100step/0.8, 400step/0.2 |
| `id2-10_0.2easy_0.3medium_0.5hard/midtrain/cpt0.2-uniform_0.8-11-14_plus/checkpoint-3870` | checkpoint-3870 | 500step/0.2 |
| `id2-10_0.2easy_0.3medium_0.5hard/midtrain/cpt0.2-uniform_0.8-11-14_plus/checkpoint-4644` | checkpoint-4644 | 600step/0.2 |
| `id2-10_0.2easy_0.3medium_0.5hard/midtrain/cpt0.2-uniform_0.8-11-14_plus/checkpoint-6579` | checkpoint-6579 | 800step/0.2 |
| `id2-10_0.2easy_0.3medium_0.5hard/midtrain/cpt0.2-uniform_0.8-11-14_plus/checkpoint-7740` | checkpoint-7740 | 954step/0.2 |
| `id2-10_0.2easy_0.3medium_0.5hard/midtrain/cpt0.2-uniform_0.8-11-14_plus/checkpoint-8127` | checkpoint-8127 | 400step/0.5 |
| `id2-10_0.2easy_0.3medium_0.5hard/midtrain/cpt0.2-uniform_0.8-11-14_plus/checkpoint-10062` | checkpoint-10062 | 500step/0.5 |
| `id2-10_0.2easy_0.3medium_0.5hard/midtrain/cpt0.2-uniform_0.8-11-14_plus/checkpoint-11997` | checkpoint-11997 | 600step/0.5 |
| `id2-10_0.2easy_0.3medium_0.5hard/midtrain/cpt0.2-uniform_0.8-11-14_plus/checkpoint-12771` | checkpoint-12771 | 400step/0.8 |
| `id2-10_0.2easy_0.3medium_0.5hard/midtrain/cpt0.2-uniform_0.8-11-14_plus/checkpoint-15867` | checkpoint-15867 | 800step/0.5 |
| `id2-10_0.2easy_0.3medium_0.5hard/midtrain/cpt0.2-uniform_0.8-11-14_plus/checkpoint-16254` | checkpoint-16254 | 500step/0.8 |
| `id2-10_0.2easy_0.3medium_0.5hard/midtrain/cpt0.2-uniform_0.8-11-14_plus/checkpoint-18963` | checkpoint-18963 | 954step/0.5 |
| `id2-10_0.2easy_0.3medium_0.5hard/midtrain/cpt0.2-uniform_0.8-11-14_plus/checkpoint-19350` | checkpoint-19350 | 600step/0.8 |
| `id2-10_0.2easy_0.3medium_0.5hard/midtrain/cpt0.2-uniform_0.8-11-14_plus/checkpoint-25542` | checkpoint-25542 | 800step/0.8 |
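
The CPT table can be transcribed into a small lookup for scripts that need the subfolder for a given pairing. This is a minimal sketch: `CPT_ROOT`, `CPT_CHECKPOINTS`, and `cpt_subfolder` are hypothetical names that are not part of the repository, and the mapping covers only the pairings listed in the table.

```python
# Hypothetical helper: resolves a (nominal step, CPT epoch) pairing from the
# CPT table to the matching subfolder in this repository.
CPT_ROOT = "id2-10_0.2easy_0.3medium_0.5hard/midtrain/cpt0.2-uniform_0.8-11-14_plus"

# (nominal step, CPT epoch) -> checkpoint number, copied from the table.
CPT_CHECKPOINTS = {
    (50, 0.2): 387,
    (100, 0.2): 774,
    (200, 0.2): 1548,
    (100, 0.5): 1935,
    (100, 0.8): 3096,
    (400, 0.2): 3096,
    (500, 0.2): 3870,
    (600, 0.2): 4644,
    (800, 0.2): 6579,
    (954, 0.2): 7740,
    (400, 0.5): 8127,
    (500, 0.5): 10062,
    (600, 0.5): 11997,
    (400, 0.8): 12771,
    (800, 0.5): 15867,
    (500, 0.8): 16254,
    (954, 0.5): 18963,
    (600, 0.8): 19350,
    (800, 0.8): 25542,
}


def cpt_subfolder(nominal_step: int, cpt_epoch: float) -> str:
    """Return the repo subfolder for a pairing listed in the CPT table."""
    key = (nominal_step, cpt_epoch)
    if key not in CPT_CHECKPOINTS:
        # Pairings absent from the table are not guessed or interpolated.
        raise KeyError(f"no CPT checkpoint listed for {nominal_step}step/{cpt_epoch}")
    return f"{CPT_ROOT}/checkpoint-{CPT_CHECKPOINTS[key]}"
```

The returned string can be passed as the `subfolder` argument when loading with Transformers. Note that `100step/0.8` and `400step/0.2` resolve to the same checkpoint, as in the table.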

## RL Checkpoints

| Path | Nominal step | CPT epoch | Source CPT checkpoint | Uploaded checkpoint |
| --- | --- | --- | --- | --- |
| `id2-10_0.2easy_0.3medium_0.5hard/rl/cpt0.2-rl-op11-14_uniform-50step-0.8RL` | 50 | 0.2 | checkpoint-387 | `global_step_40` |
| `id2-10_0.2easy_0.3medium_0.5hard/rl/cpt0.8-rl-op11-14_uniform-100step-0.2RL` | 100 | 0.8 | checkpoint-3096 | `global_step_19` |
| `id2-10_0.2easy_0.3medium_0.5hard/rl/cpt0.5-rl-op11-14_uniform-100step-0.5RL` | 100 | 0.5 | checkpoint-1935 | `global_step_50` |
| `id2-10_0.2easy_0.3medium_0.5hard/rl/cpt0.2-rl-op11-14_uniform-100step-0.8RL` | 100 | 0.2 | checkpoint-774 | `global_step_80` |
| `id2-10_0.2easy_0.3medium_0.5hard/rl/cpt0.2-rl-op11-14_uniform-200step-0.8RL` | 200 | 0.2 | checkpoint-1548 | `global_step_160` |
| `id2-10_0.2easy_0.3medium_0.5hard/rl/cpt0.8-rl-op11-14_uniform-400step-0.2RL` | 400 | 0.8 | checkpoint-12771 | not found locally |
| `id2-10_0.2easy_0.3medium_0.5hard/rl/cpt0.5-rl-op11-14_uniform-400step-0.5RL` | 400 | 0.5 | checkpoint-8127 | not found locally |
| `id2-10_0.2easy_0.3medium_0.5hard/rl/cpt0.2-rl-op11-14_uniform-400step-0.8RL` | 400 | 0.2 | checkpoint-3096 | not found locally |
| `id2-10_0.2easy_0.3medium_0.5hard/rl/cpt0.8-rl-op11-14_uniform-500step-0.2RL` | 500 | 0.8 | checkpoint-16254 | not found locally |
| `id2-10_0.2easy_0.3medium_0.5hard/rl/cpt0.5-rl-op11-14_uniform-500step-0.5RL` | 500 | 0.5 | checkpoint-10062 | not found locally |
| `id2-10_0.2easy_0.3medium_0.5hard/rl/cpt0.2-rl-op11-14_uniform-500step-0.8RL` | 500 | 0.2 | checkpoint-3870 | not found locally |
| `id2-10_0.2easy_0.3medium_0.5hard/rl/cpt0.8-rl-op11-14_uniform-600step-0.2RL` | 600 | 0.8 | checkpoint-19350 | not found locally |
| `id2-10_0.2easy_0.3medium_0.5hard/rl/cpt0.5-rl-op11-14_uniform-600step-0.5RL` | 600 | 0.5 | checkpoint-11997 | not found locally |
| `id2-10_0.2easy_0.3medium_0.5hard/rl/cpt0.2-rl-op11-14_uniform-600step-0.8RL` | 600 | 0.2 | checkpoint-4644 | not found locally |
| `id2-10_0.2easy_0.3medium_0.5hard/rl/cpt0.8-rl-op11-14_uniform-800step-0.2RL` | 800 | 0.8 | checkpoint-25542 | not found locally |
| `id2-10_0.2easy_0.3medium_0.5hard/rl/cpt0.5-rl-op11-14_uniform-800step-0.5RL` | 800 | 0.5 | checkpoint-15867 | not found locally |
| `id2-10_0.2easy_0.3medium_0.5hard/rl/cpt0.2-rl-op11-14_uniform-800step-0.8RL` | 800 | 0.2 | checkpoint-6579 | not found locally |
| `id2-10_0.2easy_0.3medium_0.5hard/rl/cpt0.5-rl-op11-14_uniform-954step-0.5RL` | 954 | 0.5 | checkpoint-18963 | not found locally |
| `id2-10_0.2easy_0.3medium_0.5hard/rl/cpt0.2-rl-op11-14_uniform-954step-0.8RL` | 954 | 0.2 | checkpoint-7740 | not found locally |
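
For the RL runs that have an uploaded checkpoint, a subfolder can be assembled from the table columns. The sketch below assumes the files live at `<run path>/<uploaded checkpoint>/actor/huggingface`. That layout is inferred from the `actor/huggingface` note earlier in this card and is not confirmed here, so check the repository file listing before relying on it. `RL_ROOT`, `UPLOADED_RL`, and `rl_subfolder` are hypothetical names.

```python
RL_ROOT = "id2-10_0.2easy_0.3medium_0.5hard/rl"

# Run directory name -> uploaded checkpoint, copied from the RL table.
# Rows marked "not found locally" are intentionally left out.
UPLOADED_RL = {
    "cpt0.2-rl-op11-14_uniform-50step-0.8RL": "global_step_40",
    "cpt0.8-rl-op11-14_uniform-100step-0.2RL": "global_step_19",
    "cpt0.5-rl-op11-14_uniform-100step-0.5RL": "global_step_50",
    "cpt0.2-rl-op11-14_uniform-100step-0.8RL": "global_step_80",
    "cpt0.2-rl-op11-14_uniform-200step-0.8RL": "global_step_160",
}


def rl_subfolder(run_name: str) -> str:
    """Build the assumed subfolder for an RL run with an uploaded checkpoint."""
    if run_name not in UPLOADED_RL:
        raise KeyError(f"no uploaded RL checkpoint listed for {run_name!r}")
    # ASSUMPTION: checkpoints are stored under <step>/actor/huggingface.
    return f"{RL_ROOT}/{run_name}/{UPLOADED_RL[run_name]}/actor/huggingface"
```

If the assumed layout holds, the result can be used as `subfolder=` in the same way as a CPT path.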

## Load

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "Interplay-LM-Reasoning/extrapolation_midtrain"
# Any path from the CPT table can be used here.
subdir = "id2-10_0.2easy_0.3medium_0.5hard/midtrain/cpt0.2-uniform_0.8-11-14_plus/checkpoint-25542"
# `subfolder` selects the checkpoint directory inside the repository.
tokenizer = AutoTokenizer.from_pretrained(repo_id, subfolder=subdir)
model = AutoModelForCausalLM.from_pretrained(repo_id, subfolder=subdir)
```
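
Because not every run listed in this card was uploaded, it can help to confirm that a subfolder exists before loading it. A minimal sketch, assuming the `huggingface_hub` package is installed and that each loadable checkpoint directory contains a `config.json` (the usual Transformers layout, not something this card states). `has_checkpoint`, `find_checkpoints`, and `list_remote_files` are hypothetical helpers.

```python
from typing import Iterable, List


def has_checkpoint(repo_files: Iterable[str], subdir: str) -> bool:
    """Return True if the file listing contains <subdir>/config.json."""
    target = f"{subdir.strip('/')}/config.json"
    return target in set(repo_files)


def find_checkpoints(repo_files: Iterable[str]) -> List[str]:
    """Return every directory in the listing that holds a config.json."""
    return sorted(
        path.rsplit("/", 1)[0]
        for path in repo_files
        if path.endswith("/config.json")
    )


def list_remote_files(repo_id: str) -> List[str]:
    """Fetch the repository file listing from the Hub (needs network access)."""
    # Imported lazily so the pure helpers above work without the dependency.
    from huggingface_hub import list_repo_files

    return list_repo_files(repo_id)
```

A typical use is to call `list_remote_files(repo_id)` once, then pass the result to `has_checkpoint` together with the `subdir` you intend to load, or to `find_checkpoints` to see which directories are available.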

## Citation

```bibtex
@misc{zhang2025interplaypretrainingmidtrainingrl,
  title={On the Interplay of Pre-Training, Mid-Training, and RL on Reasoning Language Models},
  author={Charlie Zhang and Graham Neubig and Xiang Yue},
  year={2025},
  eprint={2512.07783},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  url={https://arxiv.org/abs/2512.07783},
}
```