Running 230 BigCodeBench Leaderboard: Explore code-generation model leaderboards and task details
Running on CPU Upgrade 591 GAIA Leaderboard: Submit your model answers to GAIA benchmark and view leaderboard
Running Featured 560 Vision Arena (Testing VLMs side-by-side): Analyze images with multiple vision models for labels and boxes
Running 232 AI2 WildBench Leaderboard (V2): Display and explore a leaderboard of language models