VeriEvol: Scaling Multimodal Mathematical Reasoning via Verifiable Evol-Instruct • Paper • 2606.23543 • Published 4 days ago • 6
OffSeeker: Online Reinforcement Learning Is Not All You Need for Deep Research Agents • Paper • 2601.18467 • Published Jan 26 • 1
RubricBench: Aligning Model-Generated Rubrics with Human Standards • Paper • 2603.01562 • Published Mar 2 • 64