coding-benchmarking dataset
Datasets for benchmarking LLMs for software development.

livebench/liveswebench • Updated Mar 31, 2025 • 53 • 18 • 1
livebench/liveswebench-patches • Updated Mar 31, 2025 • 1 • 51
livebench/reasoning • Updated Apr 7, 2025 • 200 • 4.49k • 15
livebench/data_analysis • Updated Apr 7, 2025 • 150 • 3.75k • 5