CaRR & C-GRPO Collection
Data and models for the paper "Chaining the Evidence: Robust Reinforcement Learning for Deep Search Agents with Citation-Aware Rubric Rewards".
5 items • Updated about 6 hours ago