From Trainee to Trainer: LLM-Designed Training Environment for RL with Multi-Agent Reasoning • Paper • 2606.17682 • Published 12 days ago • 26
Demystifying Hidden-State Recurrence: Switchable Latent Reasoning with On-Policy Reinforcement Learning • Paper • 2606.13106 • Published 17 days ago • 21
ACE: Attribution-Controlled Knowledge Editing for Multi-hop Factual Recall • Paper • 2510.07896 • Published Oct 9, 2025 • 11
EnvFactory: Scaling Tool-Use Agents via Executable Environments Synthesis and Robust RL • Paper • 2605.18703 • Published May 18 • 50
Linear Ensembles Wash Away Watermarks: On the Fragility of Distributional Perturbations in LLMs • Paper • 2605.30501 • Published about 1 month ago • 29