Chaining the Evidence: Robust Reinforcement Learning for Deep Search Agents with Citation-Aware Rubric Rewards
Paper
•
2601.06021
•
Published
•
43
None defined yet.
DeepPrune: Parallel Scaling without Inter-trace Redundancy
SIRI: Scaling Iterative Reinforcement Learning with Interleaved Compression