# ASearcher-Web-QwQ-V2
## Overview
**ASearcher-Web-QwQ-V2** is a 32B-scale search agent trained using large-scale reinforcement learning. This model represents an improved version of the ASearcher framework, achieving cutting-edge performance on challenging web search benchmarks through advanced agentic RL training techniques.
## Key Features
- **Cutting-Edge Performance**: Achieves Avg@4 scores of 58.7, 51.1, and 74.5 on the GAIA, xBench-DeepSearch, and Frames benchmarks, respectively
- **Fully Asynchronous RL Training**: Enables efficient long-horizon search, with tool calls exceeding 100 rounds
- **Advanced Data Synthesis**: Trained on autonomously generated QA pairs with rigorous multi-stage validation
- **Real Web Search Capabilities**: Designed to interact with live web search tools for up-to-date information retrieval
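
To make "tool calls exceeding 100 rounds" concrete, the following is a minimal sketch of a turn-budgeted search loop. It is an illustration only, not the ASearcher implementation: `run_episode`, `toy_policy`, `toy_search`, the `(kind, text)` action format, and the default budget of 128 turns are all hypothetical names and values chosen for this example.

```python
from typing import Callable, List, Tuple

# An action is (kind, text): kind is "search" for a tool call
# or "answer" for the final answer. This format is an assumption.
Action = Tuple[str, str]


def run_episode(
    question: str,
    policy: Callable[[List[str]], Action],
    search: Callable[[str], str],
    max_turns: int = 128,  # hypothetical budget, not taken from the model card
) -> Tuple[str, int]:
    """Alternate between the policy and the search tool.

    Returns (answer, tool_calls). The answer is "" if the turn budget
    runs out before the policy produces one.
    """
    history = [f"question: {question}"]
    tool_calls = 0
    for _ in range(max_turns):
        kind, text = policy(history)
        if kind == "answer":
            return text, tool_calls
        # Record the query and the tool output so the next turn can see them.
        history.append(f"query: {text}")
        history.append(f"result: {search(text)}")
        tool_calls += 1
    return "", tool_calls


def toy_policy(history: List[str]) -> Action:
    """Stand-in for the model: search three times, then answer."""
    n_results = sum(1 for h in history if h.startswith("result: "))
    if n_results < 3:
        return ("search", f"lookup {n_results + 1}")
    return ("answer", "done")


def toy_search(query: str) -> str:
    """Stand-in for a live web search tool."""
    return f"stub page for '{query}'"


answer, calls = run_episode("example question", toy_policy, toy_search)
print(answer, calls)
```

In a real deployment the policy would be the language model and the search callable would hit a live search backend; the explicit budget is what bounds a long-horizon episode.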
## Performance Highlights
| Benchmark | Avg@4 Score | Pass@4 Score |
|-----------|-------------|--------------|
| GAIA | 58.7 | 74.7 |
| xBench-DeepSearch | 51.1 | 75.0 |
| Frames | 74.5 | 85.5 |

**Substantial RL Improvements**: Reinforcement learning training brings significant gains:
- +15.0 improvement on GAIA
- +22.4 improvement on xBench-DeepSearch
- +14.6 improvement on Frames
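
The model card does not define Avg@4 and Pass@4. A common reading, assumed here, is that each question is attempted in four independent runs: Avg@4 is the mean accuracy across those runs, and Pass@4 is the share of questions solved in at least one run. The sketch below computes both under that assumption; the function names and the sample outcomes are hypothetical and are not benchmark data.

```python
from typing import Dict, List


def avg_at_k(results: Dict[str, List[bool]]) -> float:
    """Mean per-question accuracy over all runs, as a percentage."""
    per_question = [sum(runs) / len(runs) for runs in results.values()]
    return 100.0 * sum(per_question) / len(per_question)


def pass_at_k(results: Dict[str, List[bool]]) -> float:
    """Share of questions solved in at least one run, as a percentage."""
    solved = [any(runs) for runs in results.values()]
    return 100.0 * sum(solved) / len(solved)


# Made-up outcomes for three questions, four runs each.
example = {
    "q1": [True, True, False, True],
    "q2": [False, False, False, False],
    "q3": [False, True, False, False],
}

print(f"Avg@4  = {avg_at_k(example):.1f}")
print(f"Pass@4 = {pass_at_k(example):.1f}")
```

Under this reading Pass@4 can never fall below Avg@4, which matches the ordering of the two columns in the table above.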
## Quick Start
### Evaluation
To reproduce the benchmark results:
```bash
cd evaluation/
python search_eval_async.py \
    --model_name_or_path inclusionAI/ASearcher-Web-QwQ-V2 \
    --data_names GAIA,xbench-deepsearch,Frames \
    --agent-type asearcher-reasoning \
    --search-client-type async-web-search-access
```
## Training Details
This model was trained using:
- **Architecture**: QwQ-32B
- **Training Method**: Fully asynchronous reinforcement learning
- **Data**: Synthesized QA pairs with multi-stage validation
- **Framework**: AReaL
## Applications
- Complex web search and information retrieval
- Multi-step problem solving with tool usage
- Real-time information gathering and synthesis
- Long-horizon reasoning tasks
## Citation
If you use this model, please cite:
```bibtex
@misc{gao2025turnsunlockinglonghorizonagentic,
  title={Beyond Ten Turns: Unlocking Long-Horizon Agentic Search with Large-Scale Asynchronous RL},
  author={Jiaxuan Gao and Wei Fu and Minyang Xie and Shusheng Xu and Chuyi He and Zhiyu Mei and Banghua Zhu and Yi Wu},
  year={2025},
  eprint={2508.07976},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  url={https://arxiv.org/abs/2508.07976},
}
```
## License
This model is released under the Apache-2.0 license.
## Contact
For questions and support, please refer to the [ASearcher GitHub repository](https://github.com/inclusionAI/ASearcher) or open an issue on the project page.