Update README.md
README.md
CHANGED
@@ -102,7 +102,7 @@ The model was evaluated on reasoning tasks including AIME24, MMLU_COT, and GSM8K
 
 ### Reproduction
 
-The results of AIME24 and MMLU_COT were obtained using [SGLang](https://docs.sglang.ai/) while result of GSM8K
+The results of AIME24 and MMLU_COT were obtained using [SGLang](https://docs.sglang.ai/) while result of GSM8K was obtained using [vLLM](https://docs.vllm.ai/en/latest/). All the evaluations were conducted via forked [lm-evaluation-harness](https://github.com/BowenBao/lm-evaluation-harness/tree/cot).
 
 ### AIME24
 ```