# openPangu-Embedded-7B-DeepDiver
[中文](README.md) | English
📑[Technical Report](https://ai.gitcode.com/ascend-tribe/openPangu-Embedded-7B-DeepDiver/blob/main/docs/openpangu-deepdiver-v2-tech-report.pdf)
## 1. Introduction
DeepDiver is an agentic solution in the openPangu series for deep information seeking and processing. It natively supports a Multi-Agent System (MAS) and is designed for complex question answering and long-form report writing.
### Features
- 🔍 Supports QA Mode: answers complex knowledge-based questions that require 100+ steps.
- ✍️ Supports Long-form Writing Mode: generates articles and reports of 30,000+ words.
- 🔄 Supports Adaptive Mode: automatically selects between QA Mode and Long-form Writing Mode based on the user query.
## 2. Results
| Benchmark | Metric | openPangu-7B-DeepDiver|
| :------------: | :-----------------: | :--------: |
| **BrowseComp-zh** | Acc | 18.3 |
| **BrowseComp-en** | Acc | 8.3 |
| **XBench-DeepSearch** | Acc | 39.0 |
Note: The table above only shows complex QA results. For the evaluation of long-form report writing, please refer to the [technical report](https://ai.gitcode.com/ascend-tribe/openPangu-Embedded-7B-DeepDiver/blob/main/docs/openpangu-deepdiver-v2-tech-report.pdf).
## 3. Quick Start
### 3.1 Setup
```bash
# Clone and install
git clone <repository-url>
cd deepdiver_v2
pip install -r requirements.txt
```
### 3.2 Deployment of the Inference Service
#### Pull Images
```bash
docker pull quay.io/ascend/vllm-ascend:v0.9.2rc1
```
Alternatively, follow the [official documentation](https://vllm-ascend.readthedocs.io/en/stable/installation.html) to build the Docker container manually.
#### Run Docker Container
```bash
# /home/work is mounted below as the working directory; adjust it to your environment
docker run -itd --name vllm-deepdiver \
--network host \
--device /dev/davinci0 \
--device /dev/davinci1 \
--device /dev/davinci2 \
--device /dev/davinci3 \
--device /dev/davinci4 \
--device /dev/davinci5 \
--device /dev/davinci6 \
--device /dev/davinci7 \
-u root \
--device /dev/davinci_manager \
--device /dev/devmm_svm \
--device /dev/hisi_hdc \
-v /usr/local/dcmi:/usr/local/dcmi:ro \
-v /usr/local/Ascend/driver/tools/hccn_tool:/usr/local/Ascend/driver/tools/hccn_tool:ro \
-v /usr/local/bin/npu-smi:/usr/local/bin/npu-smi:ro \
-v /usr/local/Ascend/driver/lib64/:/usr/local/Ascend/driver/lib64/:ro \
-v /usr/local/Ascend/driver/version.info:/usr/local/Ascend/driver/version.info:ro \
-v /etc/ascend_install.info:/etc/ascend_install.info:ro \
-v /usr/local/Ascend/firmware:/usr/local/Ascend/firmware:ro \
-v /data:/data:ro \
-v /home/work:/home/work \
quay.io/ascend/vllm-ascend:v0.9.2rc1
```
#### Enter the Container
```bash
docker exec -itu root vllm-deepdiver bash
```
Note that `-itu root` is necessary.
#### Copy Pangu's Modeling Files
`open_pangu.py` and `__init__.py` can be found [here](https://ai.gitcode.com/ascend-tribe/openpangu-embedded-7b-model/tree/main/inference/vllm_ascend/models).
```bash
cp ./vllm_ascend/open_pangu.py /vllm-workspace/vllm-ascend/vllm_ascend/models/
cp ./vllm_ascend/__init__.py /vllm-workspace/vllm-ascend/vllm_ascend/models/
```
#### Start Deployment
```bash
PRECHECKPOINT_PATH="path/to/deepdiver_model"
export VLLM_USE_V1=1
export VLLM_WORKER_MULTIPROC_METHOD=fork
# export ASCEND_RT_VISIBLE_DEVICES=0,1,2,3,4,5,6,7
vllm serve $PRECHECKPOINT_PATH \
--served-model-name ${SERVED_MODEL_NAME:=pangu_auto} \
--tensor-parallel-size ${tensor_parallel_size:=8} \
--trust-remote-code \
--host 127.0.0.1 \
--port 8888 \
--max-num-seqs 256 \
--max-model-len ${MAX_MODEL_LEN:=131072} \
--max-num-batched-tokens ${MAX_NUM_BATCHED_TOKENS:=4096} \
--tokenizer-mode "slow" \
--dtype bfloat16 \
--distributed-executor-backend mp \
--gpu-memory-utilization 0.93
```
#### Test Deployment
```bash
curl -X POST http://127.0.0.1:8888/v1/completions -H "Content-Type: application/json" -d '{
"model": "pangu_auto",
"prompt": ["Tell me who you are?"],
"max_tokens": 50
}'
```
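If you prefer Python, the same smoke test can be run with `requests` (a minimal sketch; it assumes the service is reachable at the host, port, and served model name used above):
```python
import requests

# Query the OpenAI-compatible completions endpoint started in the previous step.
resp = requests.post(
    "http://127.0.0.1:8888/v1/completions",
    json={"model": "pangu_auto", "prompt": ["Tell me who you are?"], "max_tokens": 50},
    timeout=60,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["text"])
```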
### 3.3 Implement Required Tools
Before starting the tool server, you must implement custom logic for the web search and URL crawling tools.
#### Web Search (`_generic_search`)
**Location**: `src/tools/mcp_tools.py` - `_generic_search` method
Replace the `NotImplementedError` with your search API integration:
```python
def _generic_search(self, query: str, max_results: int, config: Dict[str, Any]) -> MCPToolResult:
"""Your custom search implementation - based on the commented code example"""
try:
# Example implementation for search API:
url = config.get('base_url', 'https://api.search-provider.com/search')
payload = json.dumps({"q": query, "num": max_results})
api_keys = config.get('api_keys', [])
headers = {
'X-API-KEY': random.choice(api_keys),
'Content-Type': 'application/json'
}
response = requests.post(url, data=payload, headers=headers)
response.raise_for_status()
# Transform your API response to required format
search_results = {
"organic": [
{
"title": result["title"],
"link": result["link"],
"snippet": result["snippet"],
"date": result.get("date", "unknown")
}
for result in response.json().get("organic", [])
]
}
return MCPToolResult(success=True, data=search_results)
except Exception as e:
return MCPToolResult(success=False, error=f"Generic search failed: {e}")
```
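Once your implementation is in place, a quick way to sanity-check it is to call the method directly with a hand-built config dict. This is a rough sketch: constructing `MCPTools()` with no arguments and the config keys shown are assumptions based on the example above, not guaranteed behavior of the repository.
```python
# Hypothetical smoke test for a customized _generic_search (names and keys assumed).
tools = MCPTools()  # may require additional setup in practice
config = {
    "base_url": "https://api.search-provider.com/search",  # placeholder provider
    "api_keys": ["YOUR_API_KEY"],
}
result = tools._generic_search("openPangu DeepDiver", max_results=5, config=config)
if result.success:
    for item in result.data["organic"]:
        print(item["title"], "-", item["link"])
else:
    print("search failed:", result.error)
```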
#### URL Crawler (`url_crawler` and `_content_extractor`)
**Location**: `src/tools/mcp_tools.py` - `_content_extractor`
Replace the `NotImplementedError` section with your crawler API integration:
```python
# Example implementation for content extractor:
crawler_url = f"{crawler_config.get('base_url', 'https://api.content-extractor.com')}/{url}"
response = requests.get(crawler_url, headers=headers, timeout=crawler_config.get('timeout', 30))
response.raise_for_status()
content = response.text
# Truncate if needed
if max_tokens and len(content.split()) > max_tokens:
words = content.split()[:max_tokens]
content = ' '.join(words) + '...'
return MCPToolResult(success=True, data=content)
```
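If you do not have a dedicated content-extraction API, a crude fallback is to fetch the page yourself and strip the HTML. The sketch below uses only `requests` and the standard-library `html.parser`; the function name is illustrative and the extraction quality will be well below a purpose-built service:
```python
from html.parser import HTMLParser

import requests


class _TextExtractor(HTMLParser):
    """Collects visible text while skipping script/style blocks."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self.chunks.append(data.strip())


def naive_extract(url: str, timeout: int = 30) -> str:
    """Fetch a URL and return its visible text (very rough, illustration only)."""
    response = requests.get(url, timeout=timeout)
    response.raise_for_status()
    parser = _TextExtractor()
    parser.feed(response.text)
    return "\n".join(parser.chunks)
```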
#### ⚠️ **Third-Party Service Notice**
**Important**: The search and crawler tools rely on external APIs of your choosing. We are not responsible for:
- Privacy/security issues with third-party services
- Legal compliance with search/crawling activities
- Content accuracy or copyright issues
- API downtime or changes
Use these services at your own risk, and review their terms of service and privacy policies.
### 3.4 Mandatory Configuration
#### Configure the `.env` File
Copy `env.template` to `config/.env` and configure these options:
```bash
# LLM Service
MODEL_REQUEST_URL=http://localhost:8888/v1/chat/completions # Your LLM endpoint
# Agent Limits
PLANNER_MODE=auto # Planner mode: auto, writing, or qa
# External APIs (implement functions first)
SEARCH_ENGINE_BASE_URL= # Search API endpoint
SEARCH_ENGINE_API_KEYS= # Search API keys
URL_CRAWLER_BASE_URL= # Crawler API endpoint
URL_CRAWLER_API_KEYS= # Crawler API keys
```
**⚠️ Important:**
- Set `MODEL_REQUEST_URL` to the URL of the inference service deployed in the previous step.
- Set the mode with `PLANNER_MODE`. The `auto` mode automatically decides whether to answer a complex question or to generate a long-form report. To prioritize long-form writing, set `PLANNER_MODE` to `writing`; to focus solely on solving highly complex questions, set it to `qa`.
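To verify the configuration is actually picked up, you can load `config/.env` manually and echo the values (a minimal sketch using `python-dotenv`; the project may load this file through its own configuration code, so treat this purely as a check):
```python
import os

from dotenv import load_dotenv

# Load config/.env and print the settings the agents will rely on.
load_dotenv("config/.env")
for key in ("MODEL_REQUEST_URL", "PLANNER_MODE", "SEARCH_ENGINE_BASE_URL", "URL_CRAWLER_BASE_URL"):
    print(f"{key} = {os.getenv(key)}")
```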
### 3.5 Start the Tool Server
```bash
python src/tools/mcp_server_standard.py
```
### 3.6 Run the Demo
```bash
# Interactive mode
python cli/demo.py
# Single query
python cli/demo.py -q "$your_query"
```
With the steps above, you can get DeepDiver up and running quickly. For further development, refer to [Section 4](#4-customized-tool-development-guide) and [Section 5](#5-customized-configuration).
## 4. Customized Tool Development Guide
Tools fall into two categories: Built-in Tools and External MCP Tools. Built-in Tools mainly cover task assignment, thinking/reflection, and similar functions. External MCP Tools are extensions that enhance LLM capabilities, such as web search, URL crawling, file download, and file read/write.
### 4.1 Implemented Tool Categories
#### A. External MCP Tools
Web Search and Data Collection:
- `batch_web_search`: Multi-query web search
- `url_crawler`: Extract content from URLs
- `download_files`: Download files from URLs
File Operations:
- `file_read`, `file_write`: Basic file I/O
- `list_workspace`: Directory listing
Document Processing and Content Creation:
- `document_qa`: Question-answering on documents
- `document_extract`: Extract text from various formats
- `section_writer`: Structured content generation
#### B. Built-in Tools
- `think`, `reflect`: Reasoning and planning
- `task_done`: Task completion reporting
- `assign_task_xxx`: Assign tasks and create sub-agents
### 4.2 Develop and Integrate New External MCP Tools
#### A. Implementing a New MCP Tool
Location: `src/tools/mcp_tools.py` - Add a method to the `MCPTools` class
```python
def your_new_tool(self, param1: str, param2: int) -> MCPToolResult:
"""
Description of what your tool does.
Args:
param1: Description of parameter 1
param2: Description of parameter 2
Returns:
MCPToolResult: Standardized result format
"""
try:
# Your tool implementation here
result_data = {
"output": "Tool result",
"processed_items": param2
}
return MCPToolResult(
success=True,
data=result_data,
metadata={"tool_name": "your_new_tool"}
)
except Exception as e:
logger.error(f"Tool execution failed: {e}")
return MCPToolResult(
success=False,
error=f"Tool failed: {str(e)}"
)
```
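Before wiring the tool into the server, it can help to exercise it directly and inspect the result shape (a sketch; it assumes `MCPTools` can be instantiated on its own, which may require extra setup in practice):
```python
# Direct call, bypassing the MCP server, to check the MCPToolResult fields.
tools = MCPTools()
result = tools.your_new_tool(param1="example input", param2=3)
print(result.success, result.data, result.error)
```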
#### B. Registering the Tool on the Server
##### Adding Tool Schema
Location: `src/tools/mcp_tools.py` - Add to the `MCP_TOOL_SCHEMAS` dictionary
```python
MCP_TOOL_SCHEMAS = {
# ... existing tools ...
"your_new_tool": {
"name": "your_new_tool",
"description": "Brief description of what your tool does",
"inputSchema": {
"type": "object",
"properties": {
"param1": {
"type": "string",
"description": "Description of parameter 1"
},
"param2": {
"type": "integer",
"default": 10,
"description": "Description of parameter 2"
}
},
"required": ["param1"]
}
}
}
```
##### Registering the Tool Function
Location: `src/tools/mcp_server_standard.py` - Add to `get_tool_function()`
```python
def get_tool_function(tool_name: str):
"""Get the actual function for a tool"""
tool_map = {
# ... existing tools ...
"your_new_tool": lambda tools, **kwargs: tools.your_new_tool(**kwargs),
}
return tool_map.get(tool_name)
```
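A small consistency check catches the common mistake of adding a schema without registering the dispatch function, or vice versa (a sketch; run it inside `mcp_server_standard.py` or with the corresponding imports in place):
```python
# Every schema entry should have a matching dispatch function.
for name in MCP_TOOL_SCHEMAS:
    if get_tool_function(name) is None:
        print(f"warning: '{name}' has a schema but no registered function")
```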
#### C. Making the Tool Accessible to Specific Agents
The visibility of tools to each agent is controlled by the predefined tool sets in the MCP client.
Location: `src/tools/mcp_client.py` - Modify the tool sets for each agent
```python
# Define which MCP server tools each agent can access
PLANNER_AGENT_TOOLS = [
"download_files",
"document_qa",
"file_read",
"file_write",
"str_replace_based_edit_tool",
"list_workspace",
"file_find_by_name",
"your_new_tool", # Add your new tool here
]
INFORMATION_SEEKER_TOOLS = [
"batch_web_search",
"url_crawler",
"document_extract",
"document_qa",
"download_files",
"file_read",
"file_write",
"str_replace_based_edit_tool",
"list_workspace",
"file_find_by_name",
"your_new_tool", # Add your new tool here if needed
]
WRITER_AGENT_TOOLS = [
"file_read",
"list_workspace",
"file_find_by_name",
"search_result_classifier",
"section_writer",
"concat_section_files",
# Add your tool if the writer agent needs it
]
```
### 4.3 Adding Built-in Agent Tools/Functions
#### A. Tools/Functions with Actual Return Values
Agents in DeepDiver (e.g., the Planner) integrate built-in functions as tools, such as `assign_subjective_task_to_writer` and `assign_multi_objective_tasks_to_info_seeker`. In addition to their specific implementations, these functions require adding **agent-specific tool schemas** using `_build_agent_specific_tool_schemas()`.
Location: `src/agents/your_agent.py`
```python
def _build_agent_specific_tool_schemas(self) -> List[Dict[str, Any]]:
"""Add built-in agent functions (not MCP server tools)"""
# Get base schemas from MCP server via client
schemas = super()._build_agent_specific_tool_schemas()
# Add agent-specific built-in functions like task assignment, completion reporting
builtin_functions = [
{
"type": "function",
"function": {
"name": "agent_specific_task_done",
"description": "Report task completion for this agent",
"parameters": {
"type": "object",
"properties": {
"result": {"type": "string", "description": "Task result"},
"status": {"type": "string", "description": "Completion status"}
},
"required": ["result", "status"]
}
}
}
]
schemas.extend(builtin_functions)
return schemas
```
#### B. Built-in Tools with Pseudo Return Values
Cognitive tools in DeepDiver (e.g., `think` and `reflect`) have no concrete implementation. When an agent calls one of these tools, the invocation is considered complete as soon as the agent has generated the tool's input parameters. You can return a canned result at that point so the model can continue with subsequent tasks (see the implementation of `_execute_react_loop()` in `planner_agent.py`):
```python
if tool_call["name"] in ["think", "reflect"]:
tool_result = {"tool_results": "You can proceed to invoke other tools if needed. "}
```
Similarly, such built-in tools still require their own tool schemas, added via `_build_agent_specific_tool_schemas()`.
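For illustration, a schema for `think` added via `_build_agent_specific_tool_schemas()` might look like the following (a sketch; the exact description text and parameter names in the repository may differ):
```python
think_schema = {
    "type": "function",
    "function": {
        "name": "think",
        "description": "Write down intermediate reasoning before deciding the next action",
        "parameters": {
            "type": "object",
            "properties": {
                "thought": {"type": "string", "description": "The agent's reasoning content"}
            },
            "required": ["thought"]
        }
    }
}
schemas.append(think_schema)
```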
## 5. Customized Configuration
### 5.1 Client Configuration
Copy `env.template` to `config/.env` and configure these options:
```bash
# LLM Service
MODEL_REQUEST_URL=http://localhost:8000 # Your LLM endpoint
MODEL_REQUEST_TOKEN=your-token # Auth token
MODEL_NAME=pangu_auto # Model name
MODEL_TEMPERATURE=0.3 # Response randomness (0.0-1.0)
MODEL_MAX_TOKENS=8192 # Max response length
MODEL_REQUEST_TIMEOUT=60 # Request timeout (seconds)
# Agent Limits
PLANNER_MAX_ITERATION=40 # Planner maximum ReAct steps
INFORMATION_SEEKER_MAX_ITERATION=30 # Info seeker maximum ReAct steps
WRITER_MAX_ITERATION=40 # Writer maximum ReAct steps
PLANNER_MODE=auto # Planner mode: auto, writing (prioritize long-form writing), or qa (prioritize complex QA)
# MCP Server
MCP_SERVER_URL=http://localhost:6274/mcp # MCP server endpoint
MCP_USE_STDIO=false # Use stdio vs HTTP
# External APIs (implement functions first)
SEARCH_ENGINE_BASE_URL= # Search API endpoint
SEARCH_ENGINE_API_KEYS= # Search API keys
URL_CRAWLER_BASE_URL= # Crawler API endpoint
URL_CRAWLER_API_KEYS= # Crawler API keys
URL_CRAWLER_MAX_TOKENS=100000 # Max crawled content length
# Storage Paths
TRAJECTORY_STORAGE_PATH=./workspace # Agent work directory
REPORT_OUTPUT_PATH=./report # Report output directory
DOCUMENT_ANALYSIS_PATH=./doc_analysis # Document analysis directory
# System
DEBUG_MODE=false # Enable debug logging
MAX_RETRIES=3 # API retry attempts
TIMEOUT=30 # General timeout (seconds)
```
### 5.2 Server Configuration (server_config.yaml)
The `server_config.yaml` file controls server behavior, tool rate limiting, and operational settings:
#### Core Server Settings
```yaml
server:
host: "127.0.0.1" # Server bind address
port: 6274 # Server port
debug_mode: false # Enable debug logging
session_ttl_seconds: 21600 # Session timeout (6 hours)
max_sessions: 1000 # Max concurrent sessions
```
#### Tool Rate Limiting
Controls external API usage across all sessions:
```yaml
tool_rate_limits:
batch_web_search:
requests_per_minute: 9000 # Per-minute limit
burst_limit: 35 # Short-term burst allowance
url_crawler:
requests_per_minute: 9000
burst_limit: 60
```
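To make the two knobs concrete: `burst_limit` caps how many calls can fire back-to-back, while `requests_per_minute` sets the sustained refill rate. The snippet below is an illustrative token-bucket model of that behavior, not the server's actual implementation:
```python
import time


class TokenBucket:
    """Illustrative rate limiter: at most burst_limit tokens, refilled at requests_per_minute."""

    def __init__(self, requests_per_minute: float, burst_limit: float):
        self.rate = requests_per_minute / 60.0  # tokens added per second
        self.capacity = burst_limit
        self.tokens = burst_limit
        self.last = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


# With the batch_web_search limits above, ~35 calls pass immediately,
# then roughly 150 more per second (9000 / 60) as tokens refill.
bucket = TokenBucket(requests_per_minute=9000, burst_limit=35)
print(sum(bucket.allow() for _ in range(100)))
```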
#### Session Management
```yaml
server:
  cleanup_interval_seconds: 600 # Clean up expired sessions every 10 minutes
enable_session_keepalive: true # Keep sessions alive during long operations
keepalive_touch_interval: 300 # Touch session every N seconds
```
#### Security & Performance
```yaml
server:
request_timeout_seconds: 1800 # Request timeout
max_request_size_mb: 1000 # Maximum request size
rate_limit_requests_per_minute: 300000 # Requests per IP
```
The configuration file includes detailed comments explaining each setting. Modify values based on your deployment requirements and external API limits.
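When tuning these values, it can help to load the file and print the effective settings before restarting the server (a minimal sketch using PyYAML; adjust the path if `server_config.yaml` lives in a subdirectory such as `config/`):
```python
import yaml

# Print the effective server and rate-limit settings.
with open("server_config.yaml", "r", encoding="utf-8") as f:
    cfg = yaml.safe_load(f)

print("server:", cfg.get("server", {}))
print("tool_rate_limits:", cfg.get("tool_rate_limits", {}))
```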
## 6. Model License
Unless otherwise noted, the openPangu-Embedded-7B-DeepDiver model is licensed under the terms and conditions of the OPENPANGU MODEL LICENSE AGREEMENT VERSION 1.0, which is intended to be permissive and to enable the further development of artificial intelligence technologies. Please refer to the [LICENSE](LICENSE) file located in the root directory of the model repository for details.
## 7. Security Notice and Disclaimer
Due to the inherent technical limitations of the technologies relied upon by the openPangu-Embedded-7B-DeepDiver model and its framework, as well as the fact that AI-generated content is automatically produced by Pangu, Huawei cannot make any warranties regarding the following matters:
- The output of this Model is automatically generated by AI algorithms; it cannot be ruled out that some of the information may be flawed, unreasonable, or cause discomfort, and the generated content does not represent Huawei's attitude or standpoint;
- There is no guarantee that this Model is 100% accurate, reliable, functional, timely, secure and safe, error-free, uninterrupted, continuously stable, or free of any faults;
- The output of this Model does not constitute any advice or decision for you, and it does not guarantee that the generated content is authentic, complete, accurate, timely, legal, functional, or practical. The generated content cannot replace professionals in medicine, law, or other fields in answering your questions. It is for your reference only and does not represent any attitude, standpoint, or position of Huawei. You need to make independent judgments based on your actual situation, and Huawei does not assume any responsibility;
- The inter-component communication of the DeepDiver MAS system does not include built-in data encryption or authentication mechanisms (e.g., tokens, signatures). You shall independently assess your security requirements and implement corresponding protective measures (such as deploying the system in an encrypted network, integrating SSL/TLS protocols, and enforcing component identity verification);
- Any security incidents (including but not limited to data leakage, unauthorized access, and business losses) arising from the lack of encryption/authentication mechanisms shall be borne by the user of the system. Huawei shall bear no responsibility therefor.
## 8. Contact Us
If you have any comments or suggestions, please submit an issue or contact openPangu@huawei.com.
---