nielsr HF Staff committed on
Commit
252a182
·
verified ·
1 Parent(s): 0a71d5a

Add metadata and improve model card


Hi! I'm Niels from the Hugging Face community science team.

This PR improves your model card by adding metadata to the YAML header (pipeline tag, library name, and license). It also links the model to its associated paper and GitHub repository, making it easier for users to find resources and understand how to use the model.

Files changed (1)
  1. README.md +28 -17
README.md CHANGED
@@ -1,22 +1,33 @@
- # Official Repo of Reagent Agent Reasoning Reward Model (Agent-RRM).
- Paper: https://arxiv.org/abs/2601.22154
-
- ## Abstract:
- Agentic Reinforcement Learning (Agentic RL) has achieved notable success in enabling agents to perform complex reasoning and tool use.
- However, most methods still relies on sparse outcome-based reward for training.
- Such feedback fails to differentiate intermediate reasoning quality, leading to suboptimal training results.
- In this paper, we introduce \textbf{Agent Reasoning Reward Model (Agent-RRM)}, a multi-faceted reward model that produces structured feedback for agentic trajectories, including (1) an explicit reasoning trace, (2) a focused critique that provides refinement guidance by highlighting reasoning flaws, and (3) an overall score that evaluates process performance.
- Leveraging these signals, we systematically investigate three integration strategies: \textbf{Reagent-C} (text-augmented refinement), \textbf{Reagent-R} (reward-augmented guidance), and \textbf{Reagent-U} (unified feedback integration).
- Extensive evaluations across 12 diverse benchmarks demonstrate that Reagent-U yields substantial performance leaps, achieving 43.7\% on GAIA and 46.2\% on WebWalkerQA, validating the effectiveness of our reasoning reward model and training schemes.
-
- ## GitHub Repository
- The official codebase, including training and evaluation scripts for Reagent, can be found on the project's GitHub repository: https://github.com/kxfan2002/Reagent
-
- ## Citation
- ```bash
- ```
- ---
- license: apache-2.0
- ---
+ ---
+ license: apache-2.0
+ library_name: transformers
+ pipeline_tag: reinforcement-learning
+ ---
+
+ # Agent Reasoning Reward Model (Agent-RRM)
+
+ This repository contains the weights for **Agent-RRM**, introduced in the paper [Exploring Reasoning Reward Model for Agents](https://huggingface.co/papers/2601.22154).
+
+ ## Introduction
+ Agent Reasoning Reward Model (Agent-RRM) is a multi-faceted reward model designed for Agentic Reinforcement Learning. Unlike traditional sparse outcome-based rewards, Agent-RRM provides structured feedback for agentic trajectories, including:
+ 1. **Explicit reasoning trace**: Step-by-step reasoning analysis.
+ 2. **Focused critique**: Refinement guidance highlighting reasoning flaws.
+ 3. **Overall score**: Evaluation of process performance.
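The model card does not specify the serialization of these three components; the authoritative format lives in the Reagent codebase. As an illustration only, here is a minimal sketch of splitting a reward-model response into the three fields, assuming a hypothetical tagged format (the tag names `<reasoning>`, `<critique>`, `<score>` are my assumption, not the model's documented output):

```python
import re

def parse_rrm_feedback(text: str) -> dict:
    """Split a reward-model response into the three Agent-RRM components.

    Assumes a hypothetical tagged layout: <reasoning>...</reasoning>,
    <critique>...</critique>, <score>...</score>. The real output format
    is defined in the Reagent repository, not here.
    """
    out = {}
    for field in ("reasoning", "critique", "score"):
        m = re.search(rf"<{field}>(.*?)</{field}>", text, re.DOTALL)
        out[field] = m.group(1).strip() if m else None
    # The overall score is treated as a float for downstream reward use.
    if out["score"] is not None:
        out["score"] = float(out["score"])
    return out

response = (
    "<reasoning>The agent verified the date before answering.</reasoning>"
    "<critique>Step 2 re-queried the same page unnecessarily.</critique>"
    "<score>0.8</score>"
)
feedback = parse_rrm_feedback(response)
print(feedback["score"])  # 0.8
```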
+
+ These signals enable training strategies like **Reagent-U**, which has demonstrated significant performance leaps on benchmarks such as GAIA and WebWalkerQA.
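To make the idea of unified feedback integration concrete: a process-level score lets a trajectory that fails on the final answer still receive a graded training signal. The linear blend and the `alpha` weight below are illustrative assumptions of mine; Reagent-U's actual integration scheme is specified in the paper and repository.

```python
def unified_reward(outcome: float, process_score: float, alpha: float = 0.5) -> float:
    """Blend a sparse outcome reward with an Agent-RRM process score.

    The linear combination and alpha=0.5 are illustrative only; they are
    not claimed to be Reagent-U's exact formulation.
    """
    return (1.0 - alpha) * outcome + alpha * process_score

# A failed trajectory (outcome 0.0) with sound intermediate reasoning
# (process score 0.9) still yields a nonzero training signal.
print(unified_reward(0.0, 0.9))  # 0.45
```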
+
+ ## Resources
+ - **Paper:** [Exploring Reasoning Reward Model for Agents](https://huggingface.co/papers/2601.22154)
+ - **GitHub Repository:** [kxfan2002/Reagent](https://github.com/kxfan2002/Reagent)
+
+ ## Citation
+ If you find this work helpful, please consider citing:
+
+ ```bibtex
+ @article{fan2025exploring,
+   title={Exploring Reasoning Reward Model for Agents},
+   author={Kaixuan Fan and Kaituo Feng and Manyuan Zhang and Tianshuo Peng and Zhixun Li and Yilei Jiang and Shuang Chen and Peng Pei and Xunliang Cai and Xiangyu Yue},
+   journal={arXiv preprint arXiv:2601.22154},
+   year={2025}
+ }
+ ```