license: apache-2.0
pipeline_tag: reinforcement-learning
library_name: transformers
tags:
  - agent
  - reward-model
  - reasoning
  - RL

Agent Reasoning Reward Model (Agent-RRM)

This is the official repository for Agent-RRM, introduced in the paper Exploring Reasoning Reward Model for Agents.

Paper: Exploring Reasoning Reward Model for Agents
Code: GitHub - kxfan2002/Reagent

Introduction

Agentic Reinforcement Learning (Agentic RL) has achieved notable success in enabling agents to perform complex reasoning and tool use. However, most methods still rely on sparse, outcome-based rewards for training. Because such feedback judges only the final outcome, it cannot differentiate the quality of intermediate reasoning steps, which leads to suboptimal training results.

In this paper, we introduce Agent Reasoning Reward Model (Agent-RRM), a multi-faceted reward model that produces structured feedback for agentic trajectories, including:

An explicit reasoning trace: Step-by-step reasoning analysis.
A focused critique: Refinement guidance highlighting reasoning flaws.
An overall score: Process performance evaluation.
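As a rough illustration of how these three outputs might be consumed downstream, the sketch below parses a raw model response into a small data structure. The tag names (`<think>`, `<critique>`, `<score>`) and the `AgentRRMFeedback` and `parse_feedback` names are assumptions made for this example, not the documented output format of Agent-RRM; check the code repository for the actual format.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class AgentRRMFeedback:
    """Container for the three feedback parts (hypothetical structure)."""
    reasoning: str           # step-by-step analysis of the trajectory
    critique: str            # refinement guidance on reasoning flaws
    score: Optional[float]   # overall process score, None if not parseable


def _extract(tag: str, text: str) -> str:
    # Tag names are assumed for illustration; adapt them to the real output.
    match = re.search(rf"<{tag}>(.*?)</{tag}>", text, flags=re.DOTALL)
    return match.group(1).strip() if match else ""


def parse_feedback(raw: str) -> AgentRRMFeedback:
    """Split a raw reward-model response into reasoning, critique, and score."""
    try:
        score = float(_extract("score", raw))
    except ValueError:
        # Missing or malformed score: leave it unset instead of guessing.
        score = None
    return AgentRRMFeedback(
        reasoning=_extract("think", raw),
        critique=_extract("critique", raw),
        score=score,
    )
```

Returning `None` for a missing score lets a training loop decide explicitly how to handle malformed feedback, for example by falling back to the outcome reward alone.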

Leveraging these signals, we systematically investigate three integration strategies: Reagent-C (text-augmented refinement), Reagent-R (reward-augmented guidance), and Reagent-U (unified feedback integration). Extensive evaluations across 12 diverse benchmarks demonstrate that Reagent-U yields substantial performance gains, achieving 43.7% on GAIA and 46.2% on WebWalkerQA.
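To make the difference between the strategies more concrete, here is a minimal sketch of two ways the feedback could be used. It is not the paper's implementation: the additive blending rule, the `weight` parameter, the assumed [0, 1] score range, and the prompt wording are all assumptions chosen for illustration.

```python
def combined_reward(outcome_reward: float, rrm_score: float, weight: float = 0.5) -> float:
    """Reward-augmented idea: add a dense reasoning score to a sparse outcome reward.

    The additive form and the default weight are hypothetical choices.
    """
    if not 0.0 <= rrm_score <= 1.0:
        raise ValueError("rrm_score is assumed to lie in [0, 1]")
    return outcome_reward + weight * rrm_score


def refinement_prompt(task: str, first_attempt: str, critique: str) -> str:
    """Text-augmented idea: feed the critique back so the agent can retry.

    The prompt template is illustrative only.
    """
    return (
        f"Task:\n{task}\n\n"
        f"Previous attempt:\n{first_attempt}\n\n"
        f"Critique of the previous attempt:\n{critique}\n\n"
        "Revise your reasoning and answer again."
    )
```

Under these assumptions, a unified variant would use both pieces together: the critique guides a refined attempt, and the blended reward scores the resulting trajectories during training.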

Citation

If you find this work helpful, please consider citing:

@article{fan2025exploring,
  title={Exploring Reasoning Reward Model for Agents},
  author={Kaixuan Fan and Kaituo Feng and Manyuan Zhang and Tianshuo Peng and Zhixun Li and Yilei Jiang and Shuang Chen and Peng Pei and Xunliang Cai and Xiangyu Yue},
  journal={arXiv preprint arXiv:2601.22154},
  year={2025}
}