FeatureBench: Benchmarking Agentic Coding for Complex Feature Development Paper • 2602.10975 • Published 4 days ago • 18
WizardLMTeam/WizardLM_evol_instruct_V2_196k Viewer • Updated Mar 10, 2024 • 143k • 1.43k • 246
Running • Featured • 583 • LLM-Perf Leaderboard • Explore LLM performance across hardware configurations
Running on CPU Upgrade • 13.8k • Open LLM Leaderboard • Track, rank and evaluate open LLMs and chatbots
Running • 1.49k • Big Code Models Leaderboard • Explore and submit code model evaluations on a leaderboard