Add metadata and improve model card

#1
by nielsr HF Staff - opened
Files changed (1)
  1. README.md +28 -17
README.md CHANGED
@@ -1,22 +1,33 @@
- # Official Repo of Reagent Agent Reasoning Reward Model (Agent-RRM).
- Paper: https://arxiv.org/abs/2601.22154
-
- ## Abstract:
- Agentic Reinforcement Learning (Agentic RL) has achieved notable success in enabling agents to perform complex reasoning and tool use.
- However, most methods still relies on sparse outcome-based reward for training.
- Such feedback fails to differentiate intermediate reasoning quality, leading to suboptimal training results.
- In this paper, we introduce \textbf{Agent Reasoning Reward Model (Agent-RRM)}, a multi-faceted reward model that produces structured feedback for agentic trajectories, including (1) an explicit reasoning trace, (2) a focused critique that provides refinement guidance by highlighting reasoning flaws, and (3) an overall score that evaluates process performance.
- Leveraging these signals, we systematically investigate three integration strategies: \textbf{Reagent-C} (text-augmented refinement), \textbf{Reagent-R} (reward-augmented guidance), and \textbf{Reagent-U} (unified feedback integration).
- Extensive evaluations across 12 diverse benchmarks demonstrate that Reagent-U yields substantial performance leaps, achieving 43.7\% on GAIA and 46.2\% on WebWalkerQA, validating the effectiveness of our reasoning reward model and training schemes.
-
- ## GitHub Repository
- The official codebase, including training and evaluation scripts for Reagent, can be found on the project's GitHub repository: https://github.com/kxfan2002/Reagent
-
- ## Citation
-
- ```bash
-
- ```
- ---
- license: apache-2.0
- ---
+ ---
+ license: apache-2.0
+ library_name: transformers
+ pipeline_tag: reinforcement-learning
+ ---
+
+ # Agent Reasoning Reward Model (Agent-RRM)
+
+ This repository contains the weights for **Agent-RRM**, introduced in the paper [Exploring Reasoning Reward Model for Agents](https://huggingface.co/papers/2601.22154).
+
+ ## Introduction
+ Agent Reasoning Reward Model (Agent-RRM) is a multi-faceted reward model designed for Agentic Reinforcement Learning. Unlike traditional sparse outcome-based rewards, Agent-RRM provides structured feedback for agentic trajectories, including:
+ 1. **Explicit reasoning trace**: Step-by-step reasoning analysis.
+ 2. **Focused critique**: Refinement guidance highlighting reasoning flaws.
+ 3. **Overall score**: Evaluation of process performance.
+
+ Leveraging these signals, the paper investigates three integration strategies: **Reagent-C** (text-augmented refinement), **Reagent-R** (reward-augmented guidance), and **Reagent-U** (unified feedback integration); Reagent-U delivers substantial gains, reaching 43.7% on GAIA and 46.2% on WebWalkerQA across 12 diverse benchmarks.
+
+ ## Resources
+ - **Paper:** [Exploring Reasoning Reward Model for Agents](https://huggingface.co/papers/2601.22154)
+ - **GitHub Repository:** [kxfan2002/Reagent](https://github.com/kxfan2002/Reagent)
+
+ ## Citation
+ If you find this work helpful, please consider citing:
+
+ ```bibtex
+ @article{fan2025exploring,
+   title={Exploring Reasoning Reward Model for Agents},
+   author={Kaixuan Fan and Kaituo Feng and Manyuan Zhang and Tianshuo Peng and Zhixun Li and Yilei Jiang and Shuang Chen and Peng Pei and Xunliang Cai and Xiangyu Yue},
+   journal={arXiv preprint arXiv:2601.22154},
+   year={2025}
+ }
+ ```