BigCodeBench Leaderboard 🥇: Explore code-generation model leaderboards and task details
Nexus Function Calling Leaderboard 🐠: Display benchmark results for models on various tasks