Clockz committed on
Commit b2a1a4d · verified · 1 Parent(s): f0f239f

Add files using upload-large-folder tool

0.999zoo_op2-20+0.001teacher_op2/README.md ADDED
@@ -0,0 +1,30 @@
+ ---
+ license: other
+ library_name: transformers
+ tags:
+ - reasoning
+ - context-learning
+ - pretraining
+ - synthetic-data
+ - transformers
+ ---
+
+ # 0.999zoo_op2-20+0.001teacher_op2
+
+ 99.9% zoo op2-20, 0.1% teacher op2 pretraining mixture. This directory contains the final op2-only pretraining checkpoint and corresponding final RL checkpoints.
+
+ This is a context B pretraining checkpoint where the teacher component uses only op2.
+
+ ## Citation
+
+ ```bibtex
+ @misc{zhang2025interplaypretrainingmidtrainingrl,
+ title={On the Interplay of Pre-Training, Mid-Training, and RL on Reasoning Language Models},
+ author={Charlie Zhang and Graham Neubig and Xiang Yue},
+ year={2025},
+ eprint={2512.07783},
+ archivePrefix={arXiv},
+ primaryClass={cs.CL},
+ url={https://arxiv.org/abs/2512.07783},
+ }
+ ```
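
The directory name in this commit appears to encode the mixture the README describes (99.9% zoo op2-20, 0.1% teacher op2) as `weight` + `component` pairs joined by `+`. The sketch below recovers those weights from the name. It assumes that naming convention holds, which is inferred from this one README rather than documented; `parse_mixture_name` is a hypothetical helper and not part of any library.

```python
import re


def parse_mixture_name(name: str) -> dict[str, float]:
    """Split a name like '0.999zoo_op2-20+0.001teacher_op2' into {component: weight}.

    Assumption: each '+'-separated part is a decimal weight immediately
    followed by a component name that starts with a letter.
    """
    weights: dict[str, float] = {}
    for part in name.split("+"):
        match = re.fullmatch(r"(\d*\.?\d+)([A-Za-z].*)", part)
        if match is None:
            raise ValueError(f"cannot parse mixture component: {part!r}")
        weights[match.group(2)] = float(match.group(1))
    return weights


mixture = parse_mixture_name("0.999zoo_op2-20+0.001teacher_op2")
for component, weight in mixture.items():
    # Print each component with its share of the pretraining mixture.
    print(f"{component}: {weight:.1%}")
```

Checking that the parsed weights sum to roughly 1 is a cheap way to catch a mistyped directory name before using it to select a checkpoint.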