RewardHarness: Self-Evolving Agentic Post-Training
Paper • 2605.08703 • Published • 2
Natural Language Processing, Image Generation
Beyond Semantic Similarity: Rethinking Retrieval for Agentic Search via Direct Corpus Interaction
RationalRewards: Reasoning Rewards Scale Visual Generation Both Training and Test Time