These are the training checkpoint results for OpenRLHF-Agent/Search-R1.
Model tree for jiulaikankan/Qwen3-4B-Thinking-Search-R1-baseline
Base model
Qwen/Qwen3-4B-Thinking-2507