Commit 161c1ad (verified), committed by fangwu97
1 Parent(s): e77dab2

Update README.md

Files changed (1)
  1. README.md +1 -0
README.md CHANGED
@@ -56,6 +56,7 @@ This model achieves **state-of-the-art accuracy among 1.5B reasoning models** wh
  - **Developed by**: Fang Wu\*, Weihao Xuan\*, Heli Qi\*, Ximing Lu, Aaron Tu, Li Erran Li, Yejin Choi
  - **Institutional affiliations**: Stanford University, University of Tokyo, RIKEN AIP, University of Washington, UC Berkeley, Amazon AWS, Columbia University
  - **Paper**: [DeepSearch: Overcome the Bottleneck of Reinforcement Learning with Verifiable Rewards via Monte Carlo Tree Search](https://huggingface.co/papers/2509.25454)
+ - **Code**: [Github](https://github.com/smiles724/DeepSearch)
  - **Base Model**: Nemotron-Research-Reasoning-Qwen-1.5B v2
  - **Parameters**: 1.5B
  - **Framework**: veRL