---
title: README
emoji: ⚖️
colorFrom: blue
colorTo: indigo
sdk: static
pinned: false
---
Hi! Welcome to the org page of the Evaluation team at Hugging Face.

We want to support the community in building and sharing quality evaluations, for reproducible and fair model comparisons, to cut through the hype of releases and better understand actual model capabilities.

We're behind:

- the [evaluation guidebook](https://huggingface.co/spaces/OpenEvals/evaluation-guidebook), your reference for LLM evals
- [lighteval](https://github.com/huggingface/lighteval), a fast LLM evaluation suite filled with the SOTA benchmarks you might want
- the [leaderboards on the Hub](https://huggingface.co/blog?tag=leaderboard) initiative, encouraging people to build more leaderboards in the open for more reproducible evaluation. You'll find documentation [here](https://huggingface.co/docs/leaderboards/index) to build your own, and you can look for the best leaderboard for your use case [here](https://huggingface.co/spaces/OpenEvals/find-a-leaderboard)!

Our archived projects:

- [Open LLM Leaderboard](https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard#/) (over 11K models evaluated since 2023)

We're not behind the [evaluate metrics guide](https://huggingface.co/evaluate-metric), but if you want to understand metrics better, we really recommend checking it out!