---
license: other
library_name: transformers
tags:
- reasoning
- context-learning
- pretraining
- synthetic-data
- transformers
---

# Interplay-LM Context Pretrain 2

This repository contains the context B pretraining checkpoints and the corresponding final RL checkpoints. In this setting, the teacher component uses only op2 during pretraining.

Only inference-relevant Hugging Face files are included.

Within each setting:

- `base/` stores the final op2-only pretraining checkpoint.
- `rl/` stores the final RL checkpoints for each experiment variant.

## Included settings

- `0.9zoo_op2-20+0.1teacher_op2`
- `0.99zoo_op2-20+0.01teacher_op2`
- `0.999zoo_op2-20+0.001teacher_op2`

## Load

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "Interplay-LM-Reasoning/context_pretrain_2"
subdir = "0.99zoo_op2-20+0.01teacher_op2/rl/contextzoo_0.99zoo_0.01teacher_process_strict"

tokenizer = AutoTokenizer.from_pretrained(repo_id, subfolder=subdir)
model = AutoModelForCausalLM.from_pretrained(repo_id, subfolder=subdir)
```

## Citation

```bibtex
@misc{zhang2025interplaypretrainingmidtrainingrl,
  title={On the Interplay of Pre-Training, Mid-Training, and RL on Reasoning Language Models},
  author={Charlie Zhang and Graham Neubig and Xiang Yue},
  year={2025},
  eprint={2512.07783},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  url={https://arxiv.org/abs/2512.07783},
}
```
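
The `base/` and `rl/` layout described above can be wrapped in a small helper that builds the `subfolder` argument for a given setting. This is a minimal sketch, assuming that the checkpoint files sit directly inside `<setting>/base` and inside `<setting>/rl/<variant>`, as the RL path in the Load example suggests. The names `checkpoint_subfolder` and `load_checkpoint` are hypothetical and are not part of this repository or of `transformers`.

```python
from typing import Optional

REPO_ID = "Interplay-LM-Reasoning/context_pretrain_2"

# Settings listed in this model card.
SETTINGS = (
    "0.9zoo_op2-20+0.1teacher_op2",
    "0.99zoo_op2-20+0.01teacher_op2",
    "0.999zoo_op2-20+0.001teacher_op2",
)


def checkpoint_subfolder(setting: str, variant: Optional[str] = None) -> str:
    """Return the repo subfolder for a setting (hypothetical helper).

    With no variant, point at the pretraining checkpoint in `base/`.
    With a variant name, point at that RL checkpoint under `rl/`.
    Assumption: files live directly in these folders.
    """
    if setting not in SETTINGS:
        raise ValueError(f"unknown setting: {setting!r}")
    if variant is None:
        return f"{setting}/base"
    return f"{setting}/rl/{variant}"


def load_checkpoint(setting: str, variant: Optional[str] = None):
    """Load tokenizer and model for the chosen checkpoint (downloads weights)."""
    # Imported here so the path helper stays usable without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    subdir = checkpoint_subfolder(setting, variant)
    tokenizer = AutoTokenizer.from_pretrained(REPO_ID, subfolder=subdir)
    model = AutoModelForCausalLM.from_pretrained(REPO_ID, subfolder=subdir)
    return tokenizer, model
```

For example, `load_checkpoint("0.99zoo_op2-20+0.01teacher_op2")` would target the pretraining checkpoint under that assumption, while passing `variant="contextzoo_0.99zoo_0.01teacher_process_strict"` reproduces the path used in the Load example.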