Holistic Data Scheduler for LLM Pre-training via Multi-Objective Reinforcement Learning Paper • 2606.24133 • Published 11 days ago • 11
CausalMix: Data Mixture as Causal Inference for Language Model Training Paper • 2607.01104 • Published 3 days ago • 16