EMO: Pretraining Mixture of Experts for Emergent Modularity Paper • 2605.06663 • Published May 7 • 12
Article: EMO: Pretraining mixture of experts for emergent modularity • allenai • May 8 • 38
Teaching Models to Understand (but not Generate) High-risk Data Paper • 2505.03052 • Published May 5, 2025 • 6
Promote, Suppress, Iterate: How Language Models Answer One-to-Many Factual Queries Paper • 2502.20475 • Published Feb 27, 2025 • 3