Update README.md
README.md CHANGED

@@ -145,3 +145,18 @@ This model was evaluated on the well-known text benchmarks using [lm-evaluation-
### Accuracy

| Benchmark | sarvamai/sarvam-30b | RedHatAI/sarvam-30b-FP8-Dynamic | Recovery (%) |
|---|---|---|---|
| BBH (exact_match) | 63.32 | 62.95 | 99.42% |
| GSM8K (strict-match) | 72.33 | 72.40 | 100.10% |
| GSM8K (flexible-extract) | 69.67 | 70.81 | 101.63% |
| IFEval (inst_level_strict_acc) | 34.17 | 31.65 | 92.63% |
| MMLU-Pro (exact_match) | 45.69 | 45.81 | 100.25% |
| ARC-Challenge (acc) | 58.28 | 57.76 | 99.12% |
| HellaSwag (acc) | 53.98 | 53.98 | 100.00% |
| MMLU (acc) | 66.20 | 66.15 | 99.92% |
| TruthfulQA MC2 (acc) | 50.34 | 50.58 | 100.48% |
| Winogrande (acc) | 61.09 | 61.17 | 100.13% |
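
Recovery is the quantized model's score expressed as a percentage of the baseline score; for example, on BBH: 62.95 / 63.32 × 100 ≈ 99.42%.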
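
Scores like these can be reproduced with lm-evaluation-harness. The snippet below is a minimal sketch via the harness's Python API, not the exact command behind the table: the choice of task (GSM8K), the five-shot setting, and the vLLM backend are all illustrative assumptions.

```python
# Minimal sketch: re-scoring one benchmark from the table with
# lm-evaluation-harness (pip install lm-eval). Task name, few-shot
# count, and the vLLM backend are illustrative assumptions, not the
# exact evaluation settings behind the numbers above.
import lm_eval

results = lm_eval.simple_evaluate(
    model="vllm",  # or "hf" for a plain transformers backend
    model_args="pretrained=RedHatAI/sarvam-30b-FP8-Dynamic",
    tasks=["gsm8k"],   # one of the benchmarks reported above
    num_fewshot=5,     # assumed; check the harness task defaults
    batch_size="auto",
)
print(results["results"]["gsm8k"])
```

Swapping `pretrained=` to `sarvamai/sarvam-30b` and re-running gives the baseline column, from which the recovery percentage follows directly.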