fangwu97 committed
Commit d6e578b · verified · 1 Parent(s): 83983bd

Update README.md

Files changed (1): README.md (+7 −7)
README.md CHANGED
@@ -22,7 +22,7 @@ model-index:
       name: AIME 2024
       type: text
     metrics:
-    - type: pass@1
+    - type: avg@32
       value: 53.65
   - task:
       name: Mathematical Reasoning
@@ -31,7 +31,7 @@ model-index:
       name: AIME 2025
       type: text
     metrics:
-    - type: pass@1
+    - type: avg@32
       value: 35.42
   - task:
       name: Mathematical Reasoning
@@ -40,7 +40,7 @@ model-index:
       name: AMC 2023
       type: text
     metrics:
-    - type: pass@1
+    - type: avg@32
       value: 90.39
   - task:
       name: Mathematical Reasoning
@@ -49,7 +49,7 @@ model-index:
       name: MATH500
       type: text
     metrics:
-    - type: pass@1
+    - type: avg@32
       value: 92.53
   - task:
       name: Mathematical Reasoning
@@ -58,7 +58,7 @@ model-index:
       name: Minerva
       type: text
     metrics:
-    - type: pass@1
+    - type: avg@32
       value: 40.00
   - task:
       name: Mathematical Reasoning
@@ -67,7 +67,7 @@ model-index:
       name: Olympiad
       type: text
     metrics:
-    - type: pass@1
+    - type: avg@32
       value: 65.72
 ---
 <div align="center">
@@ -77,7 +77,7 @@ model-index:
 **DeepSearch-1.5B🌟** is a 1.5B parameter reasoning model trained with **Reinforcement Learning with Verifiable Rewards (RLVR)**, enhanced by **Monte Carlo Tree Search (MCTS)**.
 Unlike prior approaches that restrict structured search to inference, DeepSearch integrates MCTS *into training*, enabling systematic exploration, fine-grained credit assignment, and efficient replay buffering.
 
-This model achieves **state-of-the-art accuracy among 1.5B reasoning models** while being **72× more compute-efficient** than extended RL training baselines.
+This model achieves **state-of-the-art accuracy among 1.5B reasoning models** while being **5.7× more compute-efficient** than extended RL training baselines.
 
 ![Illstration of DeepSearch algorithm](./deepsearch.png)
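
The substance of this diff is a metric relabeling from pass@1 to avg@32. Assuming avg@32 here follows the common convention of per-problem accuracy averaged over 32 independently sampled generations (an assumption; the commit itself does not define it), a minimal sketch of how the two conventions differ:

```python
# Hypothetical sketch (names are illustrative, not from the repo):
# pass@1 scores one sample per problem; avg@k averages correctness over
# k samples per problem, giving a lower-variance accuracy estimate.
from typing import List


def pass_at_1(correct: List[bool]) -> float:
    """Fraction of problems whose single sampled answer is correct."""
    return sum(correct) / len(correct)


def avg_at_k(correct_per_problem: List[List[bool]], k: int = 32) -> float:
    """Mean over problems of (correct samples / k) for each problem."""
    assert all(len(c) == k for c in correct_per_problem)
    per_problem = [sum(c) / k for c in correct_per_problem]
    return sum(per_problem) / len(per_problem)


# Toy example: 2 problems, k = 4 samples each for brevity.
samples = [[True, True, False, True], [False, False, True, False]]
print(avg_at_k(samples, k=4))  # (0.75 + 0.25) / 2 = 0.5
```

Reporting avg@32 rather than pass@1 reduces the noise from any single greedy or sampled decode, which matters for small benchmarks like AIME (30 problems per year).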