Add library_name and pipeline_tag metadata

#1
by nielsr HF Staff - opened
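A quick way to sanity-check the front matter this PR proposes is sketched below. It is a minimal illustration and not part of the PR: `read_front_matter` is a hypothetical helper that handles only the flat `key: value` and `- item` shapes seen in this README (it is not a general YAML parser), and `README_HEAD` is a hand-copied excerpt of the new file header.

```python
# Hypothetical helper, not part of the PR: a tiny front-matter reader that
# handles only the flat "key: value" and "- item" shapes used in this README.
README_HEAD = """---
license: apache-2.0
library_name: transformers
pipeline_tag: text-generation
tags:
- agent
- text-generation-inference
---

# AgentCPM-Report: Gemini-2.5-pro-DeepResearch Level Local DeepResearch
"""


def read_front_matter(text):
    """Return the front matter between the first two '---' lines as a dict."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}  # no front matter at the top of the file
    meta, current_list_key = {}, None
    for line in lines[1:]:
        stripped = line.strip()
        if stripped == "---":  # closing delimiter ends the block
            break
        if stripped.startswith("- ") and current_list_key is not None:
            # list item belonging to the most recent bare "key:" line
            meta[current_list_key].append(stripped[2:].strip())
        elif ":" in stripped:
            key, _, value = stripped.partition(":")
            key, value = key.strip(), value.strip()
            if value:
                meta[key] = value
                current_list_key = None
            else:
                meta[key] = []  # a bare "key:" starts a list
                current_list_key = key
    return meta


metadata = read_front_matter(README_HEAD)
missing = [k for k in ("library_name", "pipeline_tag") if k not in metadata]
print(metadata)
print("missing:", missing)
```

For a real check, a full YAML parser would be the safer choice; the hand-rolled reader here only keeps the example dependency-free.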
Files changed (1)
  1. README.md +9 -262
README.md CHANGED
@@ -1,9 +1,12 @@
1
  ---
2
  license: apache-2.0
3
  tags:
4
- - text-generation-inference
5
  - agent
 
6
  ---
 
7
  # AgentCPM-Report: Gemini-2.5-pro-DeepResearch Level Local DeepResearch
8
 
9
  <p align="center">
@@ -14,6 +17,10 @@ tags:
14
  <a href='https://arxiv.org/abs/2602.06540'><img src='https://img.shields.io/badge/arXiv-2602.06540-red'>
15
  </p>
16
 
17
  ## Links & Resources
18
  ### 📊 AgentCPM-Report Models
19
  - **[AgentCPM-Report](https://huggingface.co/openbmb/AgentCPM-Report)** The Gemini-2.5-pro-DeepResearch Level Local DeepResearch Model
@@ -74,271 +81,11 @@ You can read more tutorials about AgentCPM-Report in the [documentation](https:/
74
 
75
 
76
  ## Evaluation
77
- <table align="center">
78
- <thead>
79
- <tr>
80
- <th align="center">DeepResearch Bench</th>
81
- <th align="center">Overall</th>
82
- <th align="center">Comprehensiveness</th>
83
- <th align="center">Insight</th>
84
- <th align="center">Instruction Following</th>
85
- <th align="center">Readability</th>
86
- </tr>
87
- </thead>
88
- <tbody>
89
- <tr>
90
- <td align="center">Doubao-research</td>
91
- <td align="center">44.34</td>
92
- <td align="center">44.84</td>
93
- <td align="center">40.56</td>
94
- <td align="center">47.95</td>
95
- <td align="center">44.69</td>
96
- </tr>
97
- <tr>
98
- <td align="center">Claude-research</td>
99
- <td align="center">45.00</td>
100
- <td align="center">45.34</td>
101
- <td align="center">42.79</td>
102
- <td align="center">47.58</td>
103
- <td align="center">44.66</td>
104
- </tr>
105
- <tr>
106
- <td align="center">OpenAI-deepresearch</td>
107
- <td align="center">46.45</td>
108
- <td align="center">46.46</td>
109
- <td align="center">43.73</td>
110
- <td align="center">49.39</td>
111
- <td align="center">47.22</td>
112
- </tr>
113
- <tr>
114
- <td align="center">Gemini-2.5-Pro-deepresearch</td>
115
- <td align="center">49.71</td>
116
- <td align="center">49.51</td>
117
- <td align="center">49.45</td>
118
- <td align="center">50.12</td>
119
- <td align="center">50.00</td>
120
- </tr>
121
- <tr>
122
- <td align="center">WebWeaver(Qwen3-30B-A3B)</td>
123
- <td align="center">46.77</td>
124
- <td align="center">45.15</td>
125
- <td align="center">45.78</td>
126
- <td align="center">49.21</td>
127
- <td align="center">47.34</td>
128
- </tr>
129
- <tr>
130
- <td align="center">WebWeaver(Claude-Sonnet-4)</td>
131
- <td align="center">50.58</td>
132
- <td align="center">51.45</td>
133
- <td align="center">50.02</td>
134
- <td align="center">50.81</td>
135
- <td align="center">49.79</td>
136
- </tr>
137
- <tr>
138
- <td align="center">Enterprise-DR(Gemini-2.5-Pro)</td>
139
- <td align="center">49.86</td>
140
- <td align="center">49.01</td>
141
- <td align="center">50.28</td>
142
- <td align="center">50.03</td>
143
- <td align="center">49.98</td>
144
- </tr>
145
- <tr>
146
- <td align="center">RhinoInsigh(Gemini-2.5-Pro)</td>
147
- <td align="center">50.92</td>
148
- <td align="center">50.51</td>
149
- <td align="center">51.45</td>
150
- <td align="center">51.72</td>
151
- <td align="center">50.00</td>
152
- </tr>
153
- <tr>
154
- <td align="center">AgentCPM-Report</td>
155
- <td align="center">50.11</td>
156
- <td align="center">50.54</td>
157
- <td align="center">52.64</td>
158
- <td align="center">48.87</td>
159
- <td align="center">44.17</td>
160
- </tr>
161
- </tbody>
162
- </table>
163
-
164
-
165
-
166
- <table align="center">
167
- <thead>
168
- <tr>
169
- <th align="center">DeepResearch Gym</th>
170
- <th align="center">Avg.</th>
171
- <th align="center">Clarity</th>
172
- <th align="center">Depth</th>
173
- <th align="center">Balance</th>
174
- <th align="center">Breadth</th>
175
- <th align="center">Support</th>
176
- <th align="center">Insightfulness</th>
177
- </tr>
178
- </thead>
179
- <tbody>
180
- <tr>
181
- <td align="center">Doubao-research</td>
182
- <td align="center">84.46</td>
183
- <td align="center">68.85</td>
184
- <td align="center">93.12</td>
185
- <td align="center">83.96</td>
186
- <td align="center">93.33</td>
187
- <td align="center">84.38</td>
188
- <td align="center">83.12</td>
189
- </tr>
190
- <tr>
191
- <td align="center">Claude-research</td>
192
- <td align="center">80.25</td>
193
- <td align="center">86.67</td>
194
- <td align="center">96.88</td>
195
- <td align="center">84.41</td>
196
- <td align="center">96.56</td>
197
- <td align="center">26.77</td>
198
- <td align="center">90.22</td>
199
- </tr>
200
- <tr>
201
- <td align="center">OpenAI-deepresearch</td>
202
- <td align="center">91.27</td>
203
- <td align="center">84.90</td>
204
- <td align="center">98.10</td>
205
- <td align="center">89.80</td>
206
- <td align="center">97.40</td>
207
- <td align="center">88.40</td>
208
- <td align="center">89.00</td>
209
- </tr>
210
- <tr>
211
- <td align="center">Gemini-2.5-pro-deepresearch</td>
212
- <td align="center">96.02</td>
213
- <td align="center">90.71</td>
214
- <td align="center">99.90</td>
215
- <td align="center">93.37</td>
216
- <td align="center">99.69</td>
217
- <td align="center">95.00</td>
218
- <td align="center">97.45</td>
219
- </tr>
220
- <tr>
221
- <td align="center">WebWeaver (Qwen3-30b-a3b)</td>
222
- <td align="center">77.27</td>
223
- <td align="center">71.88</td>
224
- <td align="center">85.51</td>
225
- <td align="center">75.80</td>
226
- <td align="center">84.78</td>
227
- <td align="center">63.77</td>
228
- <td align="center">81.88</td>
229
- </tr>
230
- <tr>
231
- <td align="center">WebWeaver (Claude-sonnet-4)</td>
232
- <td align="center">96.77</td>
233
- <td align="center">90.50</td>
234
- <td align="center">99.87</td>
235
- <td align="center">94.30</td>
236
- <td align="center">100.00</td>
237
- <td align="center">98.73</td>
238
- <td align="center">97.22</td>
239
- </tr>
240
- <tr>
241
- <td align="center">AgentCPM-Report</td>
242
- <td align="center">98.48</td>
243
- <td align="center">95.10</td>
244
- <td align="center">100.00</td>
245
- <td align="center">98.50</td>
246
- <td align="center">100.00</td>
247
- <td align="center">97.30</td>
248
- <td align="center">100.00</td>
249
- </tr>
250
- </tbody>
251
- </table>
252
-
253
- <table align="center">
254
- <thead>
255
- <tr>
256
- <th align="center">DeepConsult</th>
257
- <th align="center">Avg.</th>
258
- <th align="center">Win</th>
259
- <th align="center">Tie</th>
260
- <th align="center">Lose</th>
261
- </tr>
262
- </thead>
263
- <tbody>
264
- <tr>
265
- <td align="center">Doubao-research</td>
266
- <td align="center">5.42</td>
267
- <td align="center">29.95</td>
268
- <td align="center">40.35</td>
269
- <td align="center">29.70</td>
270
- </tr>
271
- <tr>
272
- <td align="center">Claude-research</td>
273
- <td align="center">4.60</td>
274
- <td align="center">25.00</td>
275
- <td align="center">38.89</td>
276
- <td align="center">36.11</td>
277
- </tr>
278
- <tr>
279
- <td align="center">OpenAI-deepresearch</td>
280
- <td align="center">5.00</td>
281
- <td align="center">0.00</td>
282
- <td align="center">100.00</td>
283
- <td align="center">0.00</td>
284
- </tr>
285
- <tr>
286
- <td align="center">Gemini-2.5-Pro-deepresearch</td>
287
- <td align="center">6.70</td>
288
- <td align="center">61.27</td>
289
- <td align="center">31.13</td>
290
- <td align="center">7.60</td>
291
- </tr>
292
- <tr>
293
- <td align="center">WebWeaver(Qwen3-30B-A3B)</td>
294
- <td align="center">4.57</td>
295
- <td align="center">28.65</td>
296
- <td align="center">34.90</td>
297
- <td align="center">36.46</td>
298
- </tr>
299
- <tr>
300
- <td align="center">WebWeaver(Claude-Sonnet-4)</td>
301
- <td align="center">6.96</td>
302
- <td align="center">66.86</td>
303
- <td align="center">10.47</td>
304
- <td align="center">22.67</td>
305
- </tr>
306
- <tr>
307
- <td align="center">Enterprise-DR(Gemini-2.5-Pro)</td>
308
- <td align="center">6.82</td>
309
- <td align="center">71.57</td>
310
- <td align="center">19.12</td>
311
- <td align="center">9.31</td>
312
- </tr>
313
- <tr>
314
- <td align="center">RhinoInsigh(Gemini-2.5-Pro)</td>
315
- <td align="center">6.82</td>
316
- <td align="center">68.51</td>
317
- <td align="center">11.02</td>
318
- <td align="center">20.47</td>
319
- </tr>
320
- <tr>
321
- <td align="center">AgentCPM-Report</td>
322
- <td align="center">6.60</td>
323
- <td align="center">57.60</td>
324
- <td align="center">13.73</td>
325
- <td align="center">28.68</td>
326
- </tr>
327
- </tbody>
328
- </table>
329
-
330
- Our evaluation datasets include DeepResearch Bench, DeepConsult, and DeepResearch Gym. The writing-time knowledge base includes about 2.7 million [Arxiv papers](https://www.kaggle.com/api/v1/datasets/download/Cornell-University/arxiv) and about 200,000 internal webpage summaries.
331
 
332
  ## Acknowledgements
333
  This project would not be possible without the support and contributions of the open-source community. During development, we referred to and used multiple excellent open-source frameworks, models, and data resources, including [verl](https://github.com/volcengine/verl), [UltraRAG](https://github.com/OpenBMB/UltraRAG), [MiniCPM4.1](https://github.com/OpenBMB/MiniCPM), and [SurveyGo](https://surveygo.modelbest.cn/).
334
 
335
- ## Contributions
336
- Project leads: Yishan Li, Wentong Chen
337
-
338
- Contributors: Yishan Li, Wentong Chen, Yukun Yan, Mingwei Li, Sen Mei, Xiaorong Wang, Kunpeng Liu, Xin Cong, Shuo Wang, Zhong Zhang, Yaxi Lu, Zhenghao Liu, Yankai Lin, Zhiyuan Liu, Maosong Sun
339
-
340
- Advisors: Yukun Yan, Yankai Lin, Zhiyuan Liu, Maosong Sun
341
-
342
  ## Citation
343
 
344
  If **AgentCPM-Report** is helpful for your research, please cite it as follows:
 
1
  ---
2
  license: apache-2.0
3
+ library_name: transformers
4
+ pipeline_tag: text-generation
5
  tags:
 
6
  - agent
7
+ - text-generation-inference
8
  ---
9
+
10
  # AgentCPM-Report: Gemini-2.5-pro-DeepResearch Level Local DeepResearch
11
 
12
  <p align="center">
 
17
  <a href='https://arxiv.org/abs/2602.06540'><img src='https://img.shields.io/badge/arXiv-2602.06540-red'>
18
  </p>
19
 
20
+ This repository contains **AgentCPM-Report**, an 8B-parameter deep research agent introduced in the paper [AgentCPM-Report: Interleaving Drafting and Deepening for Open-Ended Deep Research](https://arxiv.org/abs/2602.06540).
21
+
22
+ AgentCPM-Report uses a **Writing As Reasoning Policy (WARP)** to dynamically revise outlines during report generation, alternating between evidence-based drafting and reasoning-driven deepening to produce high-quality, long-form research reports.
23
+
24
  ## Links & Resources
25
  ### 📊 AgentCPM-Report Models
26
  - **[AgentCPM-Report](https://huggingface.co/openbmb/AgentCPM-Report)** The Gemini-2.5-pro-DeepResearch Level Local DeepResearch Model
 
81
 
82
 
83
  ## Evaluation
84
+ Experiments on DeepResearch Bench, DeepConsult, and DeepResearch Gym demonstrate that AgentCPM-Report outperforms leading closed-source systems, with substantial gains in Insight. Detailed benchmark results can be found in the associated research paper.
85
 
86
  ## Acknowledgements
87
  This project would not be possible without the support and contributions of the open-source community. During development, we referred to and used multiple excellent open-source frameworks, models, and data resources, including [verl](https://github.com/volcengine/verl), [UltraRAG](https://github.com/OpenBMB/UltraRAG), [MiniCPM4.1](https://github.com/OpenBMB/MiniCPM), and [SurveyGo](https://surveygo.modelbest.cn/).
88
 
89
  ## Citation
90
 
91
  If **AgentCPM-Report** is helpful for your research, please cite it as follows: