Nanbeige/ToolMind-Web-3B


flust committed on Jan 23

Commit 7e9a8ea · verified · 1 Parent(s): 79ce992

Update README.md

Files changed (1)

README.md +1 -1


@@ -39,7 +39,7 @@ We provide a rich, structured QA dataset derived from Wikipedia knowledge graphs
 In the supervised fine-tuning (SFT) stage, a **turn-level judge** identifies which interaction turns should be used for training.
 In the reinforcement learning (RL) stage, a **turn-level reward** provides feedback to refine the model's multi-turn search and tool invocation behavior.
-*Training data, full report, and evaluation details will be released soon.*
+> **Note:** *Training data, full report, and evaluation details will be released soon.*
 # Benchmark Results