krishnateja95 committed on
Commit 27f1436 · verified · 1 Parent(s): 0a7399a

Update README.md

Files changed (1)
  1. README.md +120 -1
README.md CHANGED
@@ -166,4 +166,123 @@ The model was evaluated on the OpenLLMv1 leaderboard task, using [lm-evaluation-
  ```


- </details>
+ </details>
+
+
+
+
+
+
+
+
+ ### Accuracy
+
+ <table>
+ <thead>
+ <tr>
+ <th>Category</th>
+ <th>Metric</th>
+ <th>meta-llama/Llama-3.1-70B-Instruct</th>
+ <th>nm-testing/Llama-3.1-70B-Instruct-FP8-block</th>
+ <th>Recovery (%)</th>
+ </tr>
+ </thead>
+ <tbody>
+ <!-- OpenLLM Leaderboard V1 -->
+ <tr>
+ <td rowspan="7"><b>OpenLLM V1</b></td>
+ <td>ARC-Challenge (Acc-Norm, 25-shot)</td>
+ <td>abc</td>
+ <td>ijk</td>
+ <td>xyz</td>
+ </tr>
+ <tr>
+ <td>GSM8K (Strict-Match, 5-shot)</td>
+ <td>abc</td>
+ <td>ijk</td>
+ <td>xyz</td>
+ </tr>
+ <tr>
+ <td>HellaSwag (Acc-Norm, 10-shot)</td>
+ <td>abc</td>
+ <td>ijk</td>
+ <td>xyz</td>
+ </tr>
+ <tr>
+ <td>MMLU (Acc, 5-shot)</td>
+ <td>abc</td>
+ <td>ijk</td>
+ <td>xyz</td>
+ </tr>
+ <tr>
+ <td>TruthfulQA (MC2, 0-shot)</td>
+ <td>abc</td>
+ <td>ijk</td>
+ <td>xyz</td>
+ </tr>
+ <tr>
+ <td>Winogrande (Acc, 5-shot)</td>
+ <td>abc</td>
+ <td>ijk</td>
+ <td>xyz</td>
+ </tr>
+ <tr>
+ <td><b>Average Score</b></td>
+ <td><b>abc</b></td>
+ <td><b>ijk</b></td>
+ <td><b>xyz</b></td>
+ </tr>
+ <!-- OpenLLM Leaderboard V2 -->
+ <tr>
+ <td rowspan="7"><b>OpenLLM V2</b></td>
+ <td>IFEval (Inst Level Strict Acc, 0-shot)</td>
+ <td>abc</td>
+ <td>ijk</td>
+ <td>xyz</td>
+ </tr>
+ <tr>
+ <td>BBH (Acc-Norm, 3-shot)</td>
+ <td>abc</td>
+ <td>ijk</td>
+ <td>xyz</td>
+ </tr>
+ <tr>
+ <td>Math-Hard (Exact-Match, 4-shot)</td>
+ <td>abc</td>
+ <td>ijk</td>
+ <td>xyz</td>
+ </tr>
+ <tr>
+ <td>GPQA (Acc-Norm, 0-shot)</td>
+ <td>abc</td>
+ <td>ijk</td>
+ <td>xyz</td>
+ </tr>
+ <tr>
+ <td>MuSR (Acc-Norm, 0-shot)</td>
+ <td>abc</td>
+ <td>ijk</td>
+ <td>xyz</td>
+ </tr>
+ <tr>
+ <td>MMLU-Pro (Acc, 5-shot)</td>
+ <td>abc</td>
+ <td>ijk</td>
+ <td>xyz</td>
+ </tr>
+ <tr>
+ <td><b>Average Score</b></td>
+ <td><b>abc</b></td>
+ <td><b>ijk</b></td>
+ <td><b>xyz</b></td>
+ </tr>
+ <!-- HumanEval -->
+ <tr>
+ <td><b>Coding</b></td>
+ <td>HumanEval Pass@1</td>
+ <td>abc</td>
+ <td>ijk</td>
+ <td><b>xyz</b></td>
+ </tr>
+ </tbody>
+ </table>
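
The table's "Recovery (%)" column is not defined in this diff. A common reading is the quantized model's score expressed as a percentage of the baseline model's score, and the sketch below assumes that definition. The function names and the scores are hypothetical placeholders for illustration, not measured results, and computing the "Average Score" recovery from the two averaged scores (rather than averaging the per-task recoveries) is likewise an assumption.

```python
def recovery_pct(baseline: float, quantized: float) -> float:
    """Quantized score as a percentage of the baseline score (assumed definition)."""
    if baseline == 0:
        raise ValueError("baseline score must be non-zero")
    return 100.0 * quantized / baseline


def average(scores: list[float]) -> float:
    """Plain arithmetic mean of per-task scores."""
    return sum(scores) / len(scores)


# Hypothetical placeholder scores, NOT real evaluation results.
baseline_scores = {"task_a": 80.0, "task_b": 50.0}
quantized_scores = {"task_a": 79.2, "task_b": 50.5}

# Per-task recovery, one value per table row.
per_task = {
    task: recovery_pct(baseline_scores[task], quantized_scores[task])
    for task in baseline_scores
}

# Recovery for the "Average Score" row, assumed here to be the ratio of the averages.
avg_recovery = recovery_pct(
    average(list(baseline_scores.values())),
    average(list(quantized_scores.values())),
)

for task, value in per_task.items():
    print(f"{task}: {value:.2f}%")
print(f"average: {avg_recovery:.2f}%")
```

A recovery above 100% simply means the quantized model scored higher than the baseline on that task, which can happen within evaluation noise.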