RedHatAI/granite-3.1-2b-instruct-quantized.w8a8

@@ -195,7 +195,7 @@ evalplus.evaluate \
 #### OpenLLM Leaderboard V1 evaluation scores
-| Metric                                  | ibm-granite/granite-3.1-2b-instruct             | neuralmagic-ent/granite-3.1-2b-instruct-quantized.w4a16 |
 |-----------------------------------------|:---------------------------------:|:-------------------------------------------:|
 | ARC-Challenge (Acc-Norm, 25-shot)       | 55.63                             | 55.12                                       |
 | GSM8K (Strict-Match, 5-shot)            | 60.96                             | 60.58                                       |
@@ -206,8 +206,20 @@ evalplus.evaluate \
 | **Average Score**                       | **61.98**                         | **61.68**                                   |
 | **Recovery**                            | **100.00**                        | **99.51**                                   |
 #### HumanEval pass@1 scores
-| Metric                                  | ibm-granite/granite-3.1-2b-instruct             | neuralmagic-ent/granite-3.1-2b-instruct-quantized.w4a16 |
 |-----------------------------------------|:---------------------------------:|:-------------------------------------------:|
 | HumanEval Pass@1                        | 53.40                             | 54.90                                      |

 #### OpenLLM Leaderboard V1 evaluation scores
+| Metric                                  | ibm-granite/granite-3.1-2b-instruct             | neuralmagic-ent/granite-3.1-2b-instruct-quantized.w8a8 |
 |-----------------------------------------|:---------------------------------:|:-------------------------------------------:|
 | ARC-Challenge (Acc-Norm, 25-shot)       | 55.63                             | 55.12                                       |
 | GSM8K (Strict-Match, 5-shot)            | 60.96                             | 60.58                                       |
 | **Average Score**                       | **61.98**                         | **61.68**                                   |
 | **Recovery**                            | **100.00**                        | **99.51**                                   |
+#### OpenLLM Leaderboard V2 evaluation scores
+| Metric                                  | ibm-granite/granite-3.1-2b-instruct             | neuralmagic-ent/granite-3.1-2b-instruct-quantized.w8a8 |
+|-----------------------------------------|:---------------------------------:|:-------------------------------------------:|
+| IFEval (Inst Level Strict Acc, 0-shot)| 67.99                           | 67.03                                          |
+| BBH (Acc-Norm, 3-shot)            | 44.11                             | 43.53                                        |
+| Math-Hard (Exact-Match, 4-shot)   | 8.66                            | 8.04                                        |
+| GPQA (Acc-Norm, 0-shot)           | 28.30                             | 27.60                                        |
+| MUSR (Acc-Norm, 0-shot)           | 35.12                             | 34.58                                          |
+| MMLU-Pro (Acc, 5-shot)            | 26.87                             |                                         |
+| **Average Score**                 | **35.17**                         | ****                                    |
+| **Recovery**                      | **100.00**                         | ****                                    |
 #### HumanEval pass@1 scores
+| Metric                                  | ibm-granite/granite-3.1-2b-instruct             | neuralmagic-ent/granite-3.1-2b-instruct-quantized.w8a8 |
 |-----------------------------------------|:---------------------------------:|:-------------------------------------------:|
 | HumanEval Pass@1                        | 53.40                             | 54.90                                      |
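
The tables above do not say how the **Average Score** and **Recovery** rows are derived. The figures are consistent with a plain arithmetic mean of the per-benchmark scores and with recovery defined as the quantized average divided by the baseline average, times 100. The sketch below works under that assumption. The helper names `average_score` and `recovery` are hypothetical and are not part of any evaluation harness. A missing score (such as the empty MMLU-Pro cell for the quantized model) is treated as "cannot compute" rather than guessed, which would explain the empty Average Score and Recovery cells in the V2 table.

```python
from typing import Optional, Sequence


def average_score(scores: Sequence[Optional[float]]) -> Optional[float]:
    """Arithmetic mean of the scores, or None if the list is empty
    or any score is missing (None)."""
    if not scores or any(s is None for s in scores):
        return None
    return sum(scores) / len(scores)


def recovery(baseline_avg: Optional[float],
             quantized_avg: Optional[float]) -> Optional[float]:
    """Assumed definition: quantized average as a percentage of the
    baseline average. Returns None if either input is unavailable."""
    if baseline_avg is None or quantized_avg is None or baseline_avg == 0:
        return None
    return 100.0 * quantized_avg / baseline_avg


# OpenLLM Leaderboard V2 scores copied from the table above, in row order:
# IFEval, BBH, Math-Hard, GPQA, MUSR, MMLU-Pro.
baseline_v2 = [67.99, 44.11, 8.66, 28.30, 35.12, 26.87]
quantized_v2 = [67.03, 43.53, 8.04, 27.60, 34.58, None]  # MMLU-Pro not reported

baseline_avg_v2 = average_score(baseline_v2)    # close to the reported 35.17
quantized_avg_v2 = average_score(quantized_v2)  # None: one score is missing
recovery_v2 = recovery(baseline_avg_v2, quantized_avg_v2)  # None as well

# V1 averages as reported in the table above.
recovery_v1 = recovery(61.98, 61.68)
```

Applied to the reported V1 averages, this gives a value within about 0.01 of the listed 99.51. The small gap is probably because the table was computed from unrounded averages, though the source does not confirm this.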