RLVE: Scaling Up Reinforcement Learning for Language Models with Adaptive Verifiable Environments Paper • 2511.07317 • Published Nov 10, 2025 • 17
MMMG: a Comprehensive and Reliable Evaluation Suite for Multitask Multimodal Generation Paper • 2505.17613 • Published May 23, 2025 • 8
Reinforcement Learning for Reasoning in Large Language Models with One Training Example Paper • 2504.20571 • Published Apr 29, 2025 • 98
ReasonIR: Training Retrievers for Reasoning Tasks Paper • 2504.20595 • Published Apr 29, 2025 • 54
EvalTree: Profiling Language Model Weaknesses via Hierarchical Capability Trees Paper • 2503.08893 • Published Mar 11, 2025 • 6