---
license: other
library_name: transformers
tags:
- reasoning
- context-learning
- pretraining
- synthetic-data
- transformers
---

# Interplay-LM Context Pretrain 2
|
|
This repository contains the context B pretraining checkpoints and the corresponding final RL checkpoints. In this setting, the teacher component uses only op2 during pretraining.
|
|
Only inference-relevant Hugging Face files are included.
|
|
Within each setting:
|
|
- `base/` stores the final op2-only pretraining checkpoint.
- `rl/` stores the final RL checkpoints for each experiment variant.
|
|
## Included settings
|
|
- `0.9zoo_op2-20+0.1teacher_op2`
- `0.99zoo_op2-20+0.01teacher_op2`
- `0.999zoo_op2-20+0.001teacher_op2`
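
The leading coefficients in these names appear to encode the zoo/teacher data-mixing ratio (e.g. 0.99 zoo vs. 0.01 teacher). A small sketch that recovers the two weights from a setting name, under that assumption (the parser and its regex are ours, not an official naming contract):

```python
# Sketch: parse the assumed "<w>zoo_op2-20+<w>teacher_op2" name format
# used by the settings listed above.
import re

SETTING_RE = re.compile(r"^(?P<zoo>0\.\d+)zoo_op2-20\+(?P<teacher>0\.\d+)teacher_op2$")


def mixing_weights(setting: str) -> tuple[float, float]:
    """Return the (zoo, teacher) weights encoded in a setting name."""
    m = SETTING_RE.match(setting)
    if m is None:
        raise ValueError(f"unrecognized setting name: {setting!r}")
    return float(m["zoo"]), float(m["teacher"])


print(mixing_weights("0.999zoo_op2-20+0.001teacher_op2"))
# (0.999, 0.001)
```

This is only a convenience for bookkeeping when iterating over the three settings; the directory names themselves remain the source of truth.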
|
|
## Load
|
|
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "Interplay-LM-Reasoning/context_pretrain_2"
subdir = "0.99zoo_op2-20+0.01teacher_op2/rl/contextzoo_0.99zoo_0.01teacher_process_strict"

tokenizer = AutoTokenizer.from_pretrained(repo_id, subfolder=subdir)
model = AutoModelForCausalLM.from_pretrained(repo_id, subfolder=subdir)
```
|
|
## Citation
|
|
```bibtex
@misc{zhang2025interplaypretrainingmidtrainingrl,
  title={On the Interplay of Pre-Training, Mid-Training, and RL on Reasoning Language Models},
  author={Charlie Zhang and Graham Neubig and Xiang Yue},
  year={2025},
  eprint={2512.07783},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  url={https://arxiv.org/abs/2512.07783},
}
```
|
|