SlopCodeBench: Benchmarking How Coding Agents Degrade Over Long-Horizon Iterative Tasks Paper • 2603.24755 • Published 19 days ago • 28
RubiCap: Rubric-Guided Reinforcement Learning for Dense Image Captioning Paper • 2603.09160 • Published Mar 10 • 15
LETS Forecast: Learning Embedology for Time Series Forecasting Paper • 2506.06454 • Published Jun 6, 2025 • 4
R&B: Domain Regrouping and Data Mixture Balancing for Efficient Foundation Model Training Paper • 2505.00358 • Published May 1, 2025 • 26