Add model card for SynPlanResearch-R1-8B

#1
by nielsr HF Staff - opened
Files changed (1) hide show
  1. README.md +25 -0
README.md ADDED
@@ -0,0 +1,25 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ ---
2
+ pipeline_tag: text-generation
3
+ library_name: transformers
4
+ ---
5
+
6
+ # SynPlanResearch-R1-8B
7
+
8
+ SynPlanResearch-R1 is a framework designed to improve the exploration behaviors of research agents. It synthesizes tool-use trajectories that encourage deeper investigation during cold-start supervised fine-tuning (SFT), providing a robust initialization for subsequent reinforcement learning. This specific checkpoint is based on the Qwen3-8B backbone.
9
+
10
+ ## Resources
11
+ - **Paper**: [SynPlanResearch-R1: Encouraging Tool Exploration for Deep Research with Synthetic Plans](https://huggingface.co/papers/2603.07853)
12
+ - **Repository**: [https://github.com/HansiZeng/syn-plan-research](https://github.com/HansiZeng/syn-plan-research)
13
+
14
+ ## Description
15
+ Research Agents gather information from the web using tools to answer user queries, requiring them to dynamically interleave internal reasoning with tool use. While such capabilities can be learned via reinforcement learning with verifiable rewards (RLVR), agents often exhibit poor exploration behaviors, including premature termination and biased tool usage.
16
+
17
+ SynPlanResearch-R1 addresses these challenges by synthesizing trajectories that shape the model's behavior toward more comprehensive exploration. Across seven multi-hop and open-web benchmarks, this framework improves performance by up to 6.0% on Qwen3-8B and 5.8% on Qwen3-4B backbones compared to state-of-the-art baselines.
18
+
19
+ ## Training and Evaluation
20
+ The repository provides comprehensive scripts for:
21
+ - **Supervised Fine-Tuning (SFT)**: `bash examples/syn_plan_research/sft_syn_plan_research.sh`
22
+ - **Reinforcement Learning (RL)**: `bash examples/syn_plan_research/rl_syn_plan_research.sh`
23
+ - **Evaluation**: `bash examples/syn_plan_research/eval_syn_plan_research_all.sh`
24
+
25
+ For detailed environment setup and data configuration, please refer to the [official GitHub repository](https://github.com/HansiZeng/syn-plan-research).