Update README.md
README.md
CHANGED
@@ -39,7 +39,7 @@ We provide a rich, structured QA dataset derived from Wikipedia knowledge graphs
 In the supervised fine-tuning (SFT) stage, a **turn-level judge** identifies which interaction turns should be used for training.
 In the reinforcement learning (RL) stage, a **turn-level reward** provides feedback to refine the model's multi-turn search and tool invocation behavior.
 
-> **Note:** *
+<!-- > **Note:** *report, and evaluation details will be released soon.* -->
 
 # Benchmark Results
 
@@ -54,8 +54,6 @@ In the reinforcement learning (RL) stage, a **turn-level reward** provides feedb
 | **ToolMind-Web-3B(w Synthetic QA only)** | 0.583 | 0.144 | 0.301 | 0.224 | 0.36 | 0.76 | 0.3 | 0.308 |
 | **ToolMind-Web-3B** | 0.670 | 0.174 | 0.308 | 0.248 | 0.477 | 0.751 | 0.37 | 0.458 |
 
-
-
 # <span id="Limitations">Limitations</span>
 
 While we place great emphasis on the safety of the model during the training process, striving to ensure that its outputs align with ethical and legal requirements, it may not completely avoid generating unexpected outputs due to the model's size and probabilistic nature. These outputs may include harmful content such as bias or discrimination. Please don't propagate such content. We do not assume any responsibility for the consequences resulting from the dissemination of inappropriate information.