Maestro: Reinforcement Learning to Orchestrate Hierarchical Model-Skill Ensembles Paper • 2605.22177 • Published 3 days ago • 18
More Context, Larger Models, or Moral Knowledge? A Systematic Study of Schwartz Value Detection in Political Texts Paper • 2605.22641 • Published 3 days ago • 2
π-Bench: Evaluating Proactive Personal Assistant Agents in Long-Horizon Workflows Paper • 2605.14678 • Published 5 days ago • 89
Achieving Gold-Medal-Level Olympiad Reasoning via Simple and Unified Scaling Paper • 2605.13301 • Published 11 days ago • 154
TGPO: Temporal Grounded Policy Optimization for Signal Temporal Logic Tasks Paper • 2510.00225 • Published Sep 30, 2025 • 3
X-CoT: Explainable Text-to-Video Retrieval via LLM-based Chain-of-Thought Reasoning Paper • 2509.21559 • Published Sep 25, 2025 • 3
Lavida-O: Elastic Large Masked Diffusion Models for Unified Multimodal Understanding and Generation Paper • 2509.19244 • Published Sep 23, 2025 • 12
R&B: Domain Regrouping and Data Mixture Balancing for Efficient Foundation Model Training Paper • 2505.00358 • Published May 1, 2025 • 26
MM-PRM: Enhancing Multimodal Mathematical Reasoning with Scalable Step-Level Supervision Paper • 2505.13427 • Published May 19, 2025 • 26
The CoT Encyclopedia: Analyzing, Predicting, and Controlling how a Reasoning Model will Think Paper • 2505.10185 • Published May 15, 2025 • 26
AttentionInfluence: Adopting Attention Head Influence for Weak-to-Strong Pretraining Data Selection Paper • 2505.07293 • Published May 12, 2025 • 28
Scaling Reasoning, Losing Control: Evaluating Instruction Following in Large Reasoning Models Paper • 2505.14810 • Published May 20, 2025 • 62