nielsr HF Staff committed on
Commit 7268ec8 · verified · 1 Parent(s): 89be68f

Improve model card: Add pipeline tag, detailed description, and code link

This PR significantly enhances the model card for WebAggregator-32B by:

- Adding the `pipeline_tag: image-text-to-text` to improve discoverability for multimodal agent models.
- Expanding the model description with key information and features from the paper abstract and GitHub README.
- Including a direct link to the official GitHub repository for easy access to the codebase.
- Adding the BibTeX citation for proper attribution.

These updates provide more comprehensive information about the model and its usage, benefiting researchers and users.
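
The metadata changes listed above live in the YAML front matter at the top of `README.md`. As a minimal sketch, assuming only the Python standard library, the hypothetical helper `parse_front_matter` below reads the flat keys and one-level lists from such a block so that the presence of `pipeline_tag` and `base_model` can be checked. It is a simplified reader for this specific layout, not a general YAML parser, and it is not part of the WebAggregator codebase.

```python
def parse_front_matter(text):
    """Read flat `key: value` pairs and one-level lists from front matter.

    Hypothetical helper for illustration only: it covers the simple layout
    used in this model card and is not a general YAML parser.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}  # no front matter block at the top of the file
    meta = {}
    list_key = None  # key whose value is currently being collected as a list
    for line in lines[1:]:
        stripped = line.strip()
        if stripped == "---":  # closing delimiter ends the block
            break
        if not stripped:
            continue
        if stripped.startswith("- ") and list_key is not None:
            meta[list_key].append(stripped[2:].strip())
            continue
        # Split on the first colon only, so URLs in values stay intact.
        key, _, value = stripped.partition(":")
        key, value = key.strip(), value.strip()
        if value:
            meta[key] = value
            list_key = None
        else:
            meta[key] = []  # an empty value opens a list
            list_key = key
    return meta


# Front matter as it appears on the new side of this diff.
CARD = """---
base_model:
- Qwen/Qwen3-32B
license: other
license_name: webaggregator
license_link: https://huggingface.co/CognitiveKernel/WebAggregator-32B/blob/main/LICENSE
pipeline_tag: image-text-to-text
---

# WebAggregator-32B
"""

meta = parse_front_matter(CARD)
print(meta["pipeline_tag"])
print(meta["base_model"])
```

For real model cards with nested or quoted YAML, a full YAML parser would be the safer choice; the sketch only illustrates which keys this PR touches.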

Files changed (1)
  1. README.md +40 -3
README.md CHANGED
@@ -1,9 +1,46 @@
  ---
+ base_model:
+ - Qwen/Qwen3-32B
  license: other
  license_name: webaggregator
  license_link: https://huggingface.co/CognitiveKernel/WebAggregator-32B/blob/main/LICENSE
- base_model:
- - Qwen/Qwen3-32B
+ pipeline_tag: image-text-to-text
  ---
 
- This model was the WebAggregator-32B model mentioned in the paper [Explore to Evolve: Scaling Evolved Aggregation Logic via Proactive Online Exploration for Deep Research Agents](https://arxiv.org/abs/2510.14438).
+ # WebAggregator-32B
+
+ This model is **WebAggregator-32B**, a deep research web agent presented in the paper [Explore to Evolve: Scaling Evolved Aggregation Logic via Proactive Online Exploration for Deep Research Agents](https://arxiv.org/abs/2510.14438).
+
+ WebAggregator models are designed to enhance the information aggregation capabilities of web agents, going beyond mere information seeking. They rigorously analyze and aggregate knowledge from diverse sources, including web environments, files, and multimodal inputs, to support in-depth research. The framework employs an "Explore to Evolve" paradigm to scalably construct verifiable training data for web agents.
+
+ The 32B variant of WebAggregator demonstrates strong performance, surpassing GPT-4.1 by more than 10% on GAIA-text and closely approaching Claude-3.7-sonnet, particularly on challenging information aggregation benchmarks where other agents struggle.
+
+ ## ✨ Features
+
+ - 🤖 **Fully Automated and Verifiable QA Construction**: Enables scalable generation of high-quality training data for web agents.
+ - 😄 **Open Source**: Provides a complete codebase including the QA construction engine, queries, trajectories, and models.
+ - 👍 **Highly Customizable**: Allows users to collect data tailored to their specific needs with minimal human effort, and easily customize their own agents.
+
+ ## 🔗 Code Repository
+
+ The official code for the WebAggregator project can be found on GitHub: [https://github.com/Tencent/WebAggregator](https://github.com/Tencent/WebAggregator)
+
+ ## 🚀 Getting Started
+
+ To get started with the WebAggregator project, please refer to the comprehensive instructions in the [official GitHub repository's Quick Start and Usage sections](https://github.com/Tencent/WebAggregator#quick-start). The repository provides details on cloning, installing dependencies, configuring, and running evaluation, QA construction, and trajectory sampling scripts.
+
+ ## 📚 Citation
+
+ If you find this work helpful, please cite the original paper:
+
+ ```bibtex
+ @misc{wang2025exploreevolvescalingevolved,
+ title={Explore to Evolve: Scaling Evolved Aggregation Logic via Proactive Online Exploration for Deep Research Agents},
+ author={Rui Wang and Ce Zhang and Jun-Yu Ma and Jianshu Zhang and Hongru Wang and Yi Chen and Boyang Xue and Tianqing Fang and Zhisong Zhang and Hongming Zhang and Haitao Mi and Dong Yu and Kam-Fai Wong},
+ year={2025},
+ eprint={2510.14438},
+ archivePrefix={arXiv},
+ primaryClass={cs.CL},
+ url={https://arxiv.org/abs/2510.14438},
+ }
+ ```