Lighteval

🤗 Lighteval is your all-in-one toolkit for evaluating Large Language Models (LLMs) across multiple backends with ease. Dive deep into your model's performance by saving and exploring detailed, sample-by-sample results to debug and see how your models stack up.

Share your evaluation results with the community by pushing them to the Hugging Face Hub. If you open Pull Requests on model repositories with evaluation results, we will automatically show the results on benchmark dataset repositories. Let's decentralize evaluation! Check out the docs.

Key Features

🚀 Multi-Backend Support

Evaluate your models using the most popular and efficient inference backends:

eval: Use inspect-ai as backend to evaluate and inspect your models! (preferred way)
transformers: Evaluate models on CPU or one or more GPUs using 🤗 Accelerate
nanotron: Evaluate models in distributed settings using ⚡️ Nanotron
vllm: Evaluate models on one or more GPUs using 🚀 vLLM
custom: Evaluate custom models (can be anything)
sglang: Evaluate models using SGLang as backend
inference-endpoint: Evaluate models using Hugging Face's Inference Endpoints API
tgi: Evaluate models using 🔗 Text Generation Inference running locally
litellm: Evaluate models on any compatible API using LiteLLM
inference-providers: Evaluate models using Hugging Face's Inference Providers as backend

📊 Comprehensive Evaluation

Extensive Task Library: Thousands of pre-built evaluation tasks
Custom Task Creation: Build your own evaluation tasks
Flexible Metrics: Support for custom metrics and scoring
Detailed Analysis: Sample-by-sample results for deep insights

🔧 Easy Customization

Customization at your fingertips: create new tasks, metrics, or models tailored to your needs, or browse all our existing tasks and metrics.

☁️ Seamless Integration

Seamlessly experiment, benchmark, and store your results on the Hugging Face Hub, S3, or locally.

Quick Start

Installation

pip install lighteval
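
To confirm that the install landed in the environment you expect, a minimal check along these lines may help. It uses only the Python standard library. The helper name `installed_version` is hypothetical and is not part of Lighteval.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version string of `package`, or None if it is absent.

    Hypothetical helper for illustration; it only reads package metadata
    and does not import the package itself.
    """
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version("lighteval")
    if found is None:
        print("lighteval is not installed in this environment")
    else:
        print(f"lighteval {found} is installed")
```

Reading the metadata rather than importing the package keeps the check fast and avoids loading any optional backend dependencies.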

Basic Usage

Find a task
