Upload benchmarks.csv with huggingface_hub

#9
Files changed (1)
  1. benchmarks.csv +7 -0
benchmarks.csv ADDED
@@ -0,0 +1,7 @@
+ Benchmark,Base %,Distilled %,Std Dev
+ AIME 2024,1.5,35.2,0.8
+ MATH-500,25.0,89.1,1.2
+ GSM8K,65.0,92.8,0.5
+ GPQA Diamond,28.0,45.5,1.5
+ LiveCodeBench,15.0,32.5,2.1
+ HumanEval,55.0,82.3,1.8
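
A minimal sketch of how the added file could be read, assuming the `Base %` and `Distilled %` columns are scores in percent, so their difference is a gain in percentage points. The CSV text is copied from this diff; the helper name `load_gains` is hypothetical and not part of the PR.

```python
import csv
import io

# Contents of benchmarks.csv as added in this PR (copied from the diff).
CSV_TEXT = """\
Benchmark,Base %,Distilled %,Std Dev
AIME 2024,1.5,35.2,0.8
MATH-500,25.0,89.1,1.2
GSM8K,65.0,92.8,0.5
GPQA Diamond,28.0,45.5,1.5
LiveCodeBench,15.0,32.5,2.1
HumanEval,55.0,82.3,1.8
"""


def load_gains(text):
    """Map each benchmark to (Distilled % - Base %), rounded to one decimal.

    Hypothetical helper: assumes both columns are percentages on the same scale.
    """
    reader = csv.DictReader(io.StringIO(text))
    return {
        row["Benchmark"]: round(float(row["Distilled %"]) - float(row["Base %"]), 1)
        for row in reader
    }


gains = load_gains(CSV_TEXT)

# Print benchmarks from largest to smallest gain.
for name, gain in sorted(gains.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: +{gain:.1f} points")
```

To read the uploaded file instead of the inline string, pass the contents of `benchmarks.csv` to `load_gains`. The `Std Dev` column is ignored here because the diff does not say which score it applies to.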