Toolkits (DUPLICATE them, never use the public ones)
A set of tools to enable finetuning, evaluations, prototyping, agentic workflows, etc.
ATTENTION: ALWAYS DUPLICATE THESE SPACES ON OUR INFRA!!!

AutoTrain Advanced (128 likes; status: Running): Create powerful AI models without code
LLM Merge Adapter (40 likes; status: Runtime error; tagged Agents)
mergekit-gui (290 likes; status: Runtime error; tagged Agents, Featured): Merge AI models using a YAML configuration file

Benchmarks
Most commonly used leaderboards to check model capabilities

Open LLM Leaderboard (14k likes; status: Running on CPU Upgrade): Track, rank and evaluate open LLMs and chatbots
LLM Performance Leaderboard (464 likes; status: Running; tagged Featured): View the LLM leaderboard rankings
Arena Leaderboard (4.93k likes; status: Running): View the LMArena leaderboard in full-screen
MTEB Leaderboard (7.5k likes; status: Running on CPU Upgrade): Embedding Leaderboard