Reasoning-Benchmarks
A collection of multiple benchmarks for evaluating large reasoning models.

guanning/amc23 • Updated May 25, 2025 • 40 • 2
guanning/math • Updated Jun 12, 2025 • 12.5k • 7
guanning/aime24 • Updated May 25, 2025 • 30 • 4
guanning/aime25 • Updated May 25, 2025 • 30 • 2