File size: 6,987 Bytes
aeef444 e6373c2 aeef444 |
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 |
---
language:
- zh
- en
license: apache-2.0
tags:
- agent
- deep-research
- tool-calling
- long-context
- reasoning
pipeline_tag: text-generation
---
# S1-DeepResearch-8B-Preview
<div align="center">
[](https://www.apache.org/licenses/LICENSE-2.0)
[](https://github.com/ScienceOne-AI/S1-DeepResearch)
**面向长程深度研究任务的高效能智能体 | High-Performance Deep Research Agent**
[🤗 Model](https://huggingface.co/ScienceOne-AI/S1-DeepResearch-8B-Preview) | [🤖 ModelScope](https://modelscope.cn/models/ScienceOne-AI/S1-DeepResearch-8B-Preview) | [📖 GitHub](https://github.com/ScienceOne-AI/S1-DeepResearch) | [📄 Technical Report (Coming Soon)]()
</div>
---
## 📝 模型简介 | Model Overview
**S1-DeepResearch-8B-Preview** 是由 ScienceOne 团队开发的专门为 **长程深度研究(Deep Research)** 任务设计的开源智能体模型。它可以处理非常复杂的、需要多步推理的信息搜寻任务,并具备强大的连续多轮工具调用能力,能够像人类研究员一样,在海量信息中抽丝剥茧,进行深度的信息检索、阅读、理解与整合。
**S1-DeepResearch is an open-source agent model specifically developed for long-horizon deep research tasks.** it can handle highly complex, multi-step reasoning information retrieval tasks with powerful continuous multi-turn tool calling capabilities, working like a human researcher to perform deep information retrieval, reading, comprehension, and integration.
---
## ✨ 核心特性 | Key Features
- **⚡ 小而不凡 | Small Yet Powerful**
- 仅 **8B** 参数量级,保持高推理速度与低部署成本
- 在 Agent 任务中表现出优异的主动规划、信息整合能力
- Maintains high inference speed and low deployment cost at 8B parameter scale
- Exhibits excellent proactive planning and information integration capabilities
- **🏆 性能卓越 | Outstanding Performance**
- 在 GAIA、DeepSearch、Browsecomp 等主流深度研究类基准上表现突出
- 达到同尺寸模型的 SOTA 水平,并超越一些更大参数量的模型
- Achieves SOTA level among same-sized models on mainstream benchmarks
- Surpasses some larger parameter models
- **🔄 长链执行 | Long-Chain Execution**
- 支持 **128k** 上下文窗口
- 能够稳定执行 **100+ 轮**的连续工具调用
- 在长链路深度研究推理中保持高度的推理韧性
- 128k context length
- Capable of stably executing 100+ rounds of continuous tool calls
- **⚙️ 高质量合成数据 | High-Quality Synthetic Data**
- 创新的全自动数据合成流水线
- 采用搜索-浏览-模糊的合成流程
- 知识图谱随机游走策略,自动生成高复杂度、强搜索依赖的问答对数据
- Innovative fully automated data synthesis pipeline
---
## 📊 性能评估 | Evaluation
我们在多个权威智能体能力基准上对 S1-DeepResearch 进行了评估。结果表明,S1-DeepResearch 在同参数规模模型中达到领先水平,并在多项任务上凭借高效的推理与规划机制,超越了部分参数规模更大的开源模型。进一步地,在与闭源顶尖模型及专业 Deep Research 系统的对比中,S1-DeepResearch 依然展现出稳定且具竞争力的性能表现。整体结果表明,在合理的智能体架构与推理策略支持下,轻量化但具备高能力密度的模型同样能够有效应对复杂深度研究任务,为深度研究智能体的实际应用提供了一条可行路径。
We evaluate **S1-DeepResearch** across multiple authoritative agent benchmarks. The results show that S1-DeepResearch achieves state-of-the-art performance among models of comparable size, and in several tasks surpasses larger open-source models due to its efficient reasoning and planning mechanisms. Moreover, when compared with leading closed-source models and specialized Deep Research systems, S1-DeepResearch demonstrates consistently competitive performance. These results indicate that **lightweight models with high capability density**, when equipped with well-designed agent architectures and reasoning strategies, can effectively handle complex deep research tasks, offering a practical and scalable solution for real-world research-oriented agent systems.
<div align="center">
<img src="https://github.com/ScienceOne-AI/S1-DeepResearch/raw/main/assets/benchmark.png" width="80%" alt="Benchmark Results">
</div>
### 推理时扩展 | Test-Time Scaling
我们进一步分析了模型在推理时扩展下的性能表现。评测结果显示,S1-DeepResearch 在 Pass@1 到 Pass@3 设置下取得了稳定且显著的性能提升,表明其推理与规划过程在单次推理条件下尚未饱和。该结果说明,通过推理时扩展,模型能够探索更丰富的推理与规划路径,从而有效提升复杂任务的整体成功率。
We further analyze the model’s test-time scaling behavior. S1-DeepResearch shows consistent and significant performance gains from Pass@1 to Pass@3, indicating that its reasoning and planning processes are not saturated under single-sample test. This suggests that test-time expansion enables the model to explore diverse reasoning and planning trajectories, thereby substantially improving task success rates.
<div align="center">
<img src="https://github.com/ScienceOne-AI/S1-DeepResearch/raw/main/assets/pass1to3.png" width="80%" alt="Pass@1 to Pass@3 Results">
</div>
---
## 🛠️ 使用方式 | Usage
详细的使用说明和示例代码请参考我们的 GitHub 仓库
**For detailed usage instructions and example code, please refer to our GitHub repository:**
👉 **[https://github.com/ScienceOne-AI/S1-DeepResearch](https://github.com/ScienceOne-AI/S1-DeepResearch)**
---
---
## 📜 协议与引用 | License & Citation
### License
本模型采用 **Apache License 2.0** 开源协议。
This model is licensed under **Apache License 2.0**.
### 📑 Citation
```bibtex
@software{s1agent2025,
title={S1-DeepResearch: High-Performance Deep Research Agent},
author={ScienceOne Team},
year={2025},
url={https://github.com/ScienceOne-AI/S1-DeepResearch},
}
```
---
## 🤝 联系我们 | Contact
- **GitHub Issues**: [https://github.com/ScienceOne-AI/S1-DeepResearch/issues](https://github.com/ScienceOne-AI/S1-DeepResearch/issues)
---
## ⚠️ 免责声明 | Disclaimer
本模型为研究预览版本,可能存在局限性。在生产环境使用前请充分测试。模型输出结果仅供参考,不构成专业建议。
**This model is a research preview and may have limitations. Please test thoroughly before production use. Model outputs are for reference only and do not constitute professional advice.**
---
<div align="center">
**Made with ❤️ by ScienceOne Team**
</div>
|