Add metadata and improve model card

#1
Opened by nielsr (HF Staff)
Files changed (1)
  1. README.md +34 -17
README.md CHANGED
@@ -1,24 +1,41 @@
- # Official Repo of Reagent.
- Paper: https://arxiv.org/abs/2601.22154
- Code: https://github.com/kxfan2002/Reagent
- ## Abstract:
- Agentic Reinforcement Learning (Agentic RL) has achieved notable success in enabling agents to perform complex reasoning and tool use.
- However, most methods still relies on sparse outcome-based reward for training.
- Such feedback fails to differentiate intermediate reasoning quality, leading to suboptimal training results.
- In this paper, we introduce \textbf{Agent Reasoning Reward Model (Agent-RRM)}, a multi-faceted reward model that produces structured feedback for agentic trajectories, including (1) an explicit reasoning trace, (2) a focused critique that provides refinement guidance by highlighting reasoning flaws, and (3) an overall score that evaluates process performance.
- Leveraging these signals, we systematically investigate three integration strategies: \textbf{Reagent-C} (text-augmented refinement), \textbf{Reagent-R} (reward-augmented guidance), and \textbf{Reagent-U} (unified feedback integration).
- Extensive evaluations across 12 diverse benchmarks demonstrate that Reagent-U yields substantial performance leaps, achieving 43.7\% on GAIA and 46.2\% on WebWalkerQA, validating the effectiveness of our reasoning reward model and training schemes.
- ## GitHub Repository
- The official codebase, including training and evaluation scripts for Reagent, can be found on the project's GitHub repository: https://github.com/kxfan2002/Reagent
+ ---
+ license: apache-2.0
+ pipeline_tag: reinforcement-learning
+ library_name: transformers
+ tags:
+ - agent
+ - reward-model
+ - reasoning
+ - RL
+ ---
+
+ # Agent Reasoning Reward Model (Agent-RRM)
+
+ This is the official repository for **Agent-RRM**, introduced in the paper [Exploring Reasoning Reward Model for Agents](https://arxiv.org/abs/2601.22154).
+
+ - **Paper:** [Exploring Reasoning Reward Model for Agents](https://arxiv.org/abs/2601.22154)
+ - **Code:** [GitHub - kxfan2002/Reagent](https://github.com/kxfan2002/Reagent)
+
+ ## Introduction
+
+ Agentic Reinforcement Learning (Agentic RL) has achieved notable success in enabling agents to perform complex reasoning and tool use. However, most methods still rely on sparse outcome-based rewards for training. Such feedback fails to differentiate intermediate reasoning quality, leading to suboptimal training results.
+
+ In this paper, we introduce **Agent Reasoning Reward Model (Agent-RRM)**, a multi-faceted reward model that produces structured feedback for agentic trajectories, including:
+ 1. **An explicit reasoning trace**: Step-by-step reasoning analysis.
+ 2. **A focused critique**: Refinement guidance highlighting reasoning flaws.
+ 3. **An overall score**: Process performance evaluation.
+
+ Leveraging these signals, we systematically investigate three integration strategies: **Reagent-C** (text-augmented refinement), **Reagent-R** (reward-augmented guidance), and **Reagent-U** (unified feedback integration). Extensive evaluations across 12 diverse benchmarks demonstrate that Reagent-U yields substantial performance leaps, achieving 43.7% on GAIA and 46.2% on WebWalkerQA.
+
  ## Citation
- ```bash
- ```
- ---
- license: apache-2.0
- ---
+
+ If you find this work helpful, please consider citing:
+
+ ```bibtex
+ @article{fan2025exploring,
+   title={Exploring Reasoning Reward Model for Agents},
+   author={Kaixuan Fan and Kaituo Feng and Manyuan Zhang and Tianshuo Peng and Zhixun Li and Yilei Jiang and Shuang Chen and Peng Pei and Xunliang Cai and Xiangyu Yue},
+   journal={arXiv preprint arXiv:2601.22154},
+   year={2025}
+ }
+ ```
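The card describes Agent-RRM as emitting three structured feedback components per trajectory: a reasoning trace, a focused critique, and an overall score. As a rough illustration of how a training loop might consume such output, here is a minimal parser sketch. The `<trace>`/`<critique>`/`<score>` tag names and the 0–1 score scale are assumptions for illustration only; the actual output schema is defined in the Reagent codebase.

```python
import re

def parse_rrm_feedback(text: str) -> dict:
    """Split a hypothetical Agent-RRM output string into its three parts.

    The tag-based format here is an illustrative assumption, not the
    repository's actual serialization.
    """
    trace = re.search(r"<trace>(.*?)</trace>", text, re.S)
    critique = re.search(r"<critique>(.*?)</critique>", text, re.S)
    score = re.search(r"<score>\s*([0-9.]+)\s*</score>", text)
    return {
        "trace": trace.group(1).strip() if trace else "",
        "critique": critique.group(1).strip() if critique else "",
        "score": float(score.group(1)) if score else None,
    }

# Example: a made-up feedback string for a single agent trajectory.
feedback = parse_rrm_feedback(
    "<trace>Step 1: search the query.</trace>"
    "<critique>The agent skipped result verification.</critique>"
    "<score>0.8</score>"
)
```

Under this sketch, the `score` field would feed a reward-augmented strategy like Reagent-R, while the `critique` text would feed a text-augmented strategy like Reagent-C.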