---
license: other
library_name: transformers
tags:
- reasoning
- context-learning
- pretraining
- synthetic-data
- transformers
---

# Interplay-LM Context Pretrain 2

This repository contains the context B pretraining checkpoints and the corresponding final RL checkpoints. In this setting, the teacher component uses only op2 during pretraining.

Only inference-relevant Hugging Face files are included.

Within each setting:

- `base/` stores the final op2-only pretraining checkpoint.
- `rl/` stores the final RL checkpoints for each experiment variant.

## Included settings

- `0.9zoo_op2-20+0.1teacher_op2`
- `0.99zoo_op2-20+0.01teacher_op2`
- `0.999zoo_op2-20+0.001teacher_op2`

## Load

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "Interplay-LM-Reasoning/context_pretrain_2"
subdir = "0.99zoo_op2-20+0.01teacher_op2/rl/contextzoo_0.99zoo_0.01teacher_process_strict"

tokenizer = AutoTokenizer.from_pretrained(repo_id, subfolder=subdir)
model = AutoModelForCausalLM.from_pretrained(repo_id, subfolder=subdir)
```
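
The layout described above implies one `base/` checkpoint per setting. The helper below is a hypothetical convenience (not part of the repository) that assembles the matching `subfolder` argument, assuming `base/` sits directly under each setting directory:

```python
# Settings listed in this repository; each holds a `base/` pretraining
# checkpoint and per-variant `rl/` checkpoints.
SETTINGS = [
    "0.9zoo_op2-20+0.1teacher_op2",
    "0.99zoo_op2-20+0.01teacher_op2",
    "0.999zoo_op2-20+0.001teacher_op2",
]

def base_subfolder(setting: str) -> str:
    """Return the `subfolder` argument for a setting's final pretraining checkpoint."""
    if setting not in SETTINGS:
        raise ValueError(f"unknown setting: {setting!r}")
    return f"{setting}/base"

# Example (assumed path layout): load one base checkpoint.
# model = AutoModelForCausalLM.from_pretrained(
#     "Interplay-LM-Reasoning/context_pretrain_2",
#     subfolder=base_subfolder("0.99zoo_op2-20+0.01teacher_op2"),
# )
```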

## Citation

```bibtex
@misc{zhang2025interplaypretrainingmidtrainingrl,
      title={On the Interplay of Pre-Training, Mid-Training, and RL on Reasoning Language Models},
      author={Charlie Zhang and Graham Neubig and Xiang Yue},
      year={2025},
      eprint={2512.07783},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2512.07783},
}
```