Update README.md
README.md
CHANGED
@@ -55,7 +55,6 @@ The **`Skywork-OR1`** (Open Reasoner 1) model series consists of powerful math a
 <img src="./assets/7b_perf.png" width="75%"/>
 </div>
 </div>
-<br>
 
 We evaluate our models on AIME24, AIME25, and LiveCodeBench. Instead of using Pass@1, which is common in prior work, we introduce Avg@K as the primary metric. This metric robustly measures a model's average performance across K independent attempts, reducing the impact of randomness and enhancing the reliability of the results. We believe that Avg@K provides a better reflection of a model's stability and reasoning consistency.
 
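
Note on the metric in the context paragraph above: the README text describes Avg@K only in words, so the following is a minimal sketch of one plausible reading, not the authors' evaluation code. It assumes Avg@K is the per-problem fraction of correct attempts among K independent samples, averaged over all problems. The function name `avg_at_k` and the boolean-matrix input layout are hypothetical, chosen for illustration.

```python
from statistics import mean


def avg_at_k(results):
    """Average accuracy over K attempts per problem (assumed reading of Avg@K).

    results: list of per-problem lists, each holding K booleans that say
    whether the corresponding attempt was judged correct.
    """
    if not results:
        raise ValueError("results must contain at least one problem")
    k = len(results[0])
    if k == 0 or any(len(attempts) != k for attempts in results):
        raise ValueError("every problem needs the same non-zero number of attempts")
    # Fraction of correct attempts for each problem, then the mean over problems.
    per_problem = [sum(attempts) / k for attempts in results]
    return mean(per_problem)


def first_attempt_accuracy(results):
    """Accuracy measured from a single sample per problem, for comparison."""
    return mean(1.0 if attempts[0] else 0.0 for attempts in results)


# Hypothetical outcomes for 3 problems with K = 4 attempts each.
example = [
    [True, True, False, True],
    [False, False, False, False],
    [True, True, True, True],
]
print(avg_at_k(example))
print(first_attempt_accuracy(example))
```

Under this reading, a single-sample score depends on which attempt happened to be drawn, while the K-attempt average smooths that variation, which matches the stability argument made in the paragraph.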