| timestamp | task | accuracy | model | max_problems |
| --- | --- | --- | --- | --- |
| 2026-01-12T02:44:09.419982 | ARC-Easy | 0.0 | checkpoints/final_135m.pt | 5 |
| 2026-01-12T02:44:09.419982 | ARC-Challenge | 0.0 | checkpoints/final_135m.pt | 5 |
| 2026-01-12T02:44:09.419982 | MMLU | 0.0 | checkpoints/final_135m.pt | 5 |
| 2026-01-12T02:44:09.419982 | GSM8K | 0.0 | checkpoints/final_135m.pt | 5 |
| 2026-01-12T02:44:09.419982 | HumanEval | 0.0 | checkpoints/final_135m.pt | 5 |
| 2026-01-12T02:44:09.419982 | SpellingBee | 0.0 | checkpoints/final_135m.pt | 5 |