Open LLM Leaderboard for domains: Ranking for open-source LLMs in different domains (Paused)
MMLU-Pro Leaderboard: More advanced and challenging multi-task evaluation (Running on CPU Upgrade)