rvienne/layton-eval
All layton-eval related datasets
Note: Dataset containing the layton-eval riddles.
Note: Dataset containing everything needed to compute the PPI-based benchmark score.
Note: Final benchmark results for several frontier models.