Update README.md
# Overview

ToolMind-Web-3B is a specialized lightweight agent built on top of the [**Nanbeige4-3B-Thinking-2511**](https://huggingface.co/Nanbeige/Nanbeige4-3B-Thinking-2511) foundation model.
Following extensive **SFT (Supervised Fine-Tuning)** and **RL (Reinforcement Learning)** focused on search behaviors,
the model attains leading performance among small-scale models on multiple long-horizon leaderboards such as **Xbench-Deepsearch, HLE, and GAIA**, enabling reliable execution of up to hundreds of consecutive tool invocations.
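The long-horizon behavior described above can be pictured as a bounded tool-invocation loop. The sketch below is purely illustrative: the message format, `fake_model` stand-in, and `search` tool are invented placeholders, not the actual ToolMind-Web-3B interface.

```python
# Minimal sketch of a multi-turn tool loop in the spirit of the agent's
# long-horizon search behavior. All names here (fake_model, TOOLS, the
# message schema) are hypothetical placeholders for illustration only.

def fake_model(messages):
    """Stand-in for a model call: first requests a search, then answers."""
    if not any(m["role"] == "tool" for m in messages):
        return {"tool_call": {"name": "search",
                              "arguments": {"query": "capital of France"}}}
    return {"content": "Paris"}

TOOLS = {
    "search": lambda query: f"Top result for {query!r}: Paris is the capital of France.",
}

def run_agent(question, max_turns=100):
    """Loop until the model emits a final answer or the turn budget runs out."""
    messages = [{"role": "user", "content": question}]
    for _ in range(max_turns):            # bounded long-horizon loop
        reply = fake_model(messages)
        call = reply.get("tool_call")
        if call is None:                  # no tool requested: final answer
            return reply["content"]
        result = TOOLS[call["name"]](**call["arguments"])
        messages.append({"role": "tool", "content": result})
    raise RuntimeError("turn budget exhausted")

print(run_agent("What is the capital of France?"))  # → Paris
```

A real deployment would replace `fake_model` with a call to the served model and `TOOLS` with actual search/browse tools; the loop structure itself is the part the README's "hundreds of consecutive tool invocations" claim refers to.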
1. **Strong Performance at Compact Scale**

ToolMind-Web-3B delivers high-quality long-horizon reasoning and tool-augmented search capabilities while maintaining a lightweight 3B-parameter footprint. Despite its compact size, it achieves competitive performance across multiple benchmarks such as **Xbench-Deepsearch, GAIA, and HLE**. The model is evaluated under the **[MiroThinker workflow](https://github.com/MiroMindAI/MiroThinker)**, ensuring standardized and reproducible assessment.
2. **An Open-Source Complex QA Dataset Synthesized from Wikipedia Entity–Relation Knowledge Graphs**
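One way to read "synthesized from entity–relation knowledge graphs" is that multi-hop questions are composed by chaining relations between entities. The toy sketch below shows that idea on invented triples and templates; it is not the dataset's actual construction pipeline.

```python
# Toy sketch of composing a multi-hop question from entity-relation triples,
# in the spirit of a Wikipedia-KG-based QA synthesis. The triples, relation
# templates, and compose_question helper are all invented for illustration.

TRIPLES = [
    ("Marie Curie", "born_in", "Warsaw"),
    ("Warsaw", "capital_of", "Poland"),
]

TEMPLATES = {
    "born_in": "the city where {} was born",
    "capital_of": "the country {} is the capital of",
}

def compose_question(start, hops):
    """Walk `hops` relations from `start`, nesting one template per hop;
    return the composed question and the gold answer entity."""
    phrase, entity = start, start
    for rel_wanted in hops:
        for head, rel, tail in TRIPLES:
            if head == entity and rel == rel_wanted:
                phrase = TEMPLATES[rel].format(phrase)
                entity = tail
                break
    return f"What is {phrase}?", entity

q, a = compose_question("Marie Curie", ["born_in", "capital_of"])
print(q)  # → What is the country the city where Marie Curie was born is the capital of?
print(a)  # → Poland
```

Chaining two relations already yields a question whose answer requires two retrieval steps; longer hop chains over a large graph like Wikipedia's give progressively harder long-horizon queries.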
| MiroThinker 8B | 0.664 | 0.311 | 0.402 | 0.215 | 0.404 | 0.606 | | / |
| AgentCPM-Explore 4B | 0.639 | 0.25 | 0.29 | 0.191 | 0.4 | 0.7 | / | / |
| **Ours** | | | | | | | | |
| **ToolMind-Web-3B (w/ Synthetic QA only)** | 0.583 | 0.144 | 0.301 | 0.224 | 0.36 | 0.76 | 0.3 | 0.308 |
| **ToolMind-Web-3B** | 0.670 | 0.174 | 0.308 | 0.248 | 0.477 | 0.751 | 0.37 | 0.458 |