fangwu97/DeepSearch-1.5B

Text Generation

reinforcement-learning

Eval Results (legacy)

text-generation-inference


fangwu97 committed on Oct 20, 2025

Commit 161c1ad (verified) · 1 Parent(s): e77dab2

Update README.md

Files changed (1)

README.md +1 -0


@@ -56,6 +56,7 @@ This model achieves **state-of-the-art accuracy among 1.5B reasoning models** wh
 - **Developed by**: Fang Wu\*, Weihao Xuan\*, Heli Qi\*, Ximing Lu, Aaron Tu, Li Erran Li, Yejin Choi
 - **Institutional affiliations**: Stanford University, University of Tokyo, RIKEN AIP, University of Washington, UC Berkeley, Amazon AWS, Columbia University
 - **Paper**: [DeepSearch: Overcome the Bottleneck of Reinforcement Learning with Verifiable Rewards via Monte Carlo Tree Search](https://huggingface.co/papers/2509.25454)
+- **Code**: [Github](https://github.com/smiles724/DeepSearch)
 - **Base Model**: Nemotron-Research-Reasoning-Qwen-1.5B v2
 - **Parameters**: 1.5B
 - **Framework**: veRL