Update README.md
For a deeper look into the implementation details, refer to our repository:
We used [EvalScope](https://github.com/modelscope/evalscope) to evaluate the models and report Pass@1 accuracy across all benchmarks. The number of responses generated per problem is as follows:

- 64 responses: `AMC23, AIME24, AIME25, Vietnamese-Entrance-Math-Exam`
- 8 responses: `Minerva-Math, Math-Gaokao-2023-English`
- 4 responses: `Math500, Olympiad-Bench`
- 1 response: `IFEval`
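For reference, Pass@1 over multiple sampled responses is commonly computed with the unbiased pass@k estimator; the sketch below is an illustration of that standard formula, not code taken from EvalScope, and the sample counts in the example are hypothetical:

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator.

    n: total responses sampled per problem (e.g. 64 for AIME24)
    c: responses among the n that are correct
    k: evaluation budget (k=1 for Pass@1)
    """
    if n - c < k:
        # Every size-k subset must contain at least one correct response.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)

# For k=1 the formula reduces to the fraction of correct samples:
# e.g. 16 correct out of 64 sampled responses.
print(pass_at_k(64, 16, 1))  # 0.25
```

The per-problem scores are then averaged over the benchmark to obtain the reported accuracy; sampling more responses (64 for the small, high-variance contest sets) simply tightens this estimate.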