Reasoning-Benchmarks
A collection of multiple benchmarks for evaluating large reasoning models

guanning/amc23 • Updated May 25, 2025 • 40 • 13
guanning/math • Updated Jun 12, 2025 • 12.5k • 20
guanning/aime24 • Updated May 25, 2025 • 30 • 13
guanning/aime25 • Updated May 25, 2025 • 30 • 10