LongTraceRL: Learning Long-Context Reasoning from Search Agent Trajectories with Rubric Rewards
Paper • 2605.31584 • Published 4 days ago