Update README.md
README.md
@@ -22,7 +22,7 @@ model-index:
       name: AIME 2024
       type: text
     metrics:
-    - type:
+    - type: avg@32
       value: 53.65
   - task:
       name: Mathematical Reasoning
@@ -31,7 +31,7 @@ model-index:
       name: AIME 2025
       type: text
     metrics:
-    - type:
+    - type: avg@32
       value: 35.42
   - task:
       name: Mathematical Reasoning
@@ -40,7 +40,7 @@ model-index:
       name: AMC 2023
       type: text
     metrics:
-    - type:
+    - type: avg@32
       value: 90.39
   - task:
       name: Mathematical Reasoning
@@ -49,7 +49,7 @@ model-index:
       name: MATH500
       type: text
     metrics:
-    - type:
+    - type: avg@32
       value: 92.53
   - task:
       name: Mathematical Reasoning
@@ -58,7 +58,7 @@ model-index:
       name: Minerva
       type: text
     metrics:
-    - type:
+    - type: avg@32
       value: 40.00
   - task:
       name: Mathematical Reasoning
@@ -67,7 +67,7 @@ model-index:
       name: Olympiad
       type: text
     metrics:
-    - type:
+    - type: avg@32
       value: 65.72
 ---
 <div align="center">
@@ -77,7 +77,7 @@ model-index:
 **DeepSearch-1.5B🌟** is a 1.5B parameter reasoning model trained with **Reinforcement Learning with Verifiable Rewards (RLVR)**, enhanced by **Monte Carlo Tree Search (MCTS)**.
 Unlike prior approaches that restrict structured search to inference, DeepSearch integrates MCTS *into training*, enabling systematic exploration, fine-grained credit assignment, and efficient replay buffering.
 
-This model achieves **state-of-the-art accuracy among 1.5B reasoning models** while being **
+This model achieves **state-of-the-art accuracy among 1.5B reasoning models** while being **5.7× more compute-efficient** than extended RL training baselines.
 
 
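With the diff applied, each benchmark entry in the README's YAML front matter carries a named metric. A sketch of one resulting entry, using the AIME 2024 hunk from this commit (the `task`/`dataset` nesting and the top-level `name` field are assumptions based on the usual Hugging Face `model-index` layout, not shown in the diff):

```yaml
model-index:
- name: DeepSearch-1.5B   # assumed; the model name field is outside this diff
  results:
  - task:
      name: Mathematical Reasoning
    dataset:              # nesting assumed from the standard model-index schema
      name: AIME 2024
      type: text
    metrics:
    - type: avg@32        # metric id added by this commit
      value: 53.65
```

The `avg@32` identifier indicates the score is averaged over 32 samples per problem, which is why it was worth recording alongside each `value`.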