dragon-dbh / benchmark_results.csv
timestamp,task,accuracy,model,max_problems
2026-01-12T02:44:09.419982,ARC-Easy,0.0,checkpoints/final_135m.pt,5
2026-01-12T02:44:09.419982,ARC-Challenge,0.0,checkpoints/final_135m.pt,5
2026-01-12T02:44:09.419982,MMLU,0.0,checkpoints/final_135m.pt,5
2026-01-12T02:44:09.419982,GSM8K,0.0,checkpoints/final_135m.pt,5
2026-01-12T02:44:09.419982,HumanEval,0.0,checkpoints/final_135m.pt,5
2026-01-12T02:44:09.419982,SpellingBee,0.0,checkpoints/final_135m.pt,5