Nanbeige/ToolMind-Web-3B

Text Generation


flust committed on Feb 12

Commit 9f5aa81 · verified · 1 Parent(s): 16133d4

Update README.md

Files changed (1)

README.md +1 -3


@@ -39,7 +39,7 @@ We provide a rich, structured QA dataset derived from Wikipedia knowledge graphs
 In the supervised fine-tuning (SFT) stage, a **turn-level judge** identifies which interaction turns should be used for training.
 In the reinforcement learning (RL) stage, a **turn-level reward** provides feedback to refine the model's multi-turn search and tool invocation behavior.
-> **Note:** *Training data, full report, and evaluation details will be released soon.*
+<!-- > **Note:** *report, and evaluation details will be released soon.* -->
 # Benchmark Results
@@ -54,8 +54,6 @@ In the reinforcement learning (RL) stage, a **turn-level reward** provides feedb
 | **ToolMind-Web-3B(w Synthetic QA only)** | 0.583 | 0.144 | 0.301 | 0.224 | 0.36 | 0.76 | 0.3 | 0.308 |
 | **ToolMind-Web-3B** | 0.670 | 0.174 | 0.308 | 0.248 | 0.477 | 0.751 | 0.37 | 0.458 |
 # <span id="Limitations">Limitations</span>
 While we place great emphasis on the safety of the model during the training process, striving to ensure that its outputs align with ethical and legal requirements, it may not completely avoid generating unexpected outputs due to the model's size and probabilistic nature. These outputs may include harmful content such as bias or discrimination. Please don't propagate such content. We do not assume any responsibility for the consequences resulting from the dissemination of inappropriate information.