Lighteval
Lighteval is your all-in-one toolkit for evaluating Large Language Models (LLMs) across multiple backends with ease. Dive deep into your model's performance by saving and exploring detailed, sample-by-sample results to debug and see how your models stack up.
Key Features
Multi-Backend Support
Evaluate your models using the most popular and efficient inference backends:
- transformers: Evaluate models on CPU or one or more GPUs using Accelerate
- nanotron: Evaluate models in distributed settings using Nanotron
- vllm: Evaluate models on one or more GPUs using vLLM
- custom: Evaluate custom models (can be anything)
- sglang: Evaluate models using SGLang as the backend
- inference-endpoint: Evaluate models using Hugging Face's Inference Endpoints API
- tgi: Evaluate models using Text Generation Inference running locally
- litellm: Evaluate models on any compatible API using LiteLLM
- inference-providers: Evaluate models using Hugging Face's inference providers as the backend
Comprehensive Evaluation
- Extensive Task Library: Thousands of pre-built evaluation tasks
- Custom Task Creation: Build your own evaluation tasks
- Flexible Metrics: Support for custom metrics and scoring
- Detailed Analysis: Sample-by-sample results for deep insights
Easy Customization
Customization at your fingertips: create a new task, metric, or model tailored to your needs, or browse all our existing tasks and metrics.
Seamless Integration
Seamlessly experiment, benchmark, and store your results on the Hugging Face Hub, S3, or locally.
Quick Start
Installation
pip install lighteval
Basic Usage
# Evaluate a model using Transformers backend
lighteval accelerate \
"model_name=openai-community/gpt2" \
"leaderboard|truthfulqa:mc|0"
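The task argument in the command above packs several fields into one pipe-separated string. As a minimal sketch, assuming the three fields are the suite, the task name, and the number of few-shot examples (an interpretation this page does not spell out), a spec can be split in plain shell before it is passed to the CLI. The function name `parse_task_spec` is hypothetical and not part of Lighteval:

```shell
#!/bin/sh
# Hypothetical helper: split a task spec such as "leaderboard|truthfulqa:mc|0".
# Assumption: the fields are suite | task | few-shot count.
parse_task_spec() {
    # IFS='|' applies only to this read, so the caller's IFS is untouched.
    IFS='|' read -r suite task fewshot <<EOF
$1
EOF
    printf 'suite=%s task=%s fewshot=%s\n' "$suite" "$task" "$fewshot"
}

parse_task_spec "leaderboard|truthfulqa:mc|0"
# prints: suite=leaderboard task=truthfulqa:mc fewshot=0
```

Quoting the spec matters on the command line as well: an unquoted `|` would be interpreted by the shell as a pipe.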
Save Results
# Save locally
lighteval accelerate \
"model_name=openai-community/gpt2" \
"leaderboard|truthfulqa:mc|0" \
--output-dir ./results
# Push to Hugging Face Hub
lighteval accelerate \
"model_name=openai-community/gpt2" \
"leaderboard|truthfulqa:mc|0" \
--push-to-hub \
--results-org your-username
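To compare several models on the same task, the save-locally command can be wrapped in a loop. The sketch below reuses only the `lighteval accelerate` invocation and the `--output-dir` flag shown above. The function name `run_evals`, the `DRY_RUN` switch, the second model ID, and the per-model directory naming are illustrative assumptions rather than Lighteval features. It prints the commands by default and executes them only when `DRY_RUN=0`:

```shell
#!/bin/sh
TASK="leaderboard|truthfulqa:mc|0"
# Assumption: default to a dry run so nothing is executed by accident.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical wrapper: evaluate each model ID passed as an argument.
run_evals() {
    for model in "$@"; do
        # One output directory per model; "/" in the model ID becomes "__".
        out_dir="./results/$(printf '%s' "$model" | sed 's|/|__|g')"
        if [ "$DRY_RUN" = "1" ]; then
            # Dry run: print the command instead of running it.
            printf 'lighteval accelerate "model_name=%s" "%s" --output-dir %s\n' \
                "$model" "$TASK" "$out_dir"
        else
            lighteval accelerate "model_name=$model" "$TASK" --output-dir "$out_dir"
        fi
    done
}

run_evals "openai-community/gpt2" "openai-community/gpt2-medium"
```

Running the script with `DRY_RUN=0` set in the environment launches the real evaluations one after another. Keeping a separate directory per model is a design choice made here so that results from different runs are easy to tell apart.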