# Tasks
## LightevalTask
### LightevalTaskConfig[[lighteval.tasks.lighteval_task.LightevalTaskConfig]]
<div class="docstring border-l-2 border-t-2 pl-4 pt-3.5 border-gray-100 rounded-tl-xl mb-6 mt-8">
<docstring><name>class lighteval.tasks.lighteval_task.LightevalTaskConfig</name><anchor>lighteval.tasks.lighteval_task.LightevalTaskConfig</anchor><source>https://github.com/huggingface/lighteval/blob/vr_980/src/lighteval/tasks/lighteval_task.py#L47</source><parameters>[{"name": "name", "val": ": str"}, {"name": "prompt_function", "val": ": typing.Callable[[dict, str], lighteval.tasks.requests.Doc]"}, {"name": "hf_repo", "val": ": str"}, {"name": "hf_subset", "val": ": str"}, {"name": "metrics", "val": ": list[lighteval.metrics.utils.metric_utils.Metric] | tuple[lighteval.metrics.utils.metric_utils.Metric, ...]"}, {"name": "hf_revision", "val": ": str | None = None"}, {"name": "hf_filter", "val": ": typing.Optional[typing.Callable[[dict], bool]] = None"}, {"name": "hf_avail_splits", "val": ": list[str] | tuple[str, ...] = <factory>"}, {"name": "evaluation_splits", "val": ": list[str] | tuple[str, ...] = <factory>"}, {"name": "few_shots_split", "val": ": str | None = None"}, {"name": "few_shots_select", "val": ": str | None = None"}, {"name": "generation_size", "val": ": int | None = None"}, {"name": "generation_grammar", "val": ": huggingface_hub.inference._generated.types.text_generation.TextGenerationInputGrammarType | None = None"}, {"name": "stop_sequence", "val": ": list[str] | tuple[str, ...] | None = None"}, {"name": "num_samples", "val": ": list[int] | None = None"}, {"name": "suite", "val": ": list[str] | tuple[str, ...] = <factory>"}, {"name": "original_num_docs", "val": ": int = -1"}, {"name": "effective_num_docs", "val": ": int = -1"}, {"name": "must_remove_duplicate_docs", "val": ": bool = False"}, {"name": "num_fewshots", "val": ": int = 0"}, {"name": "version", "val": ": int = 0"}]</parameters><paramsdesc>- **name** (str) -- Short name of the evaluation task.
- **prompt_function** (Callable[[dict, str], Doc]) -- Function that converts dataset
row to Doc objects for evaluation. Takes a dataset row dict and task
name as input.
- **hf_repo** (str) -- HuggingFace Hub repository path containing the evaluation dataset.
- **hf_subset** (str) -- Dataset subset/configuration name to use for this task.
- **metrics** (ListLike[Metric]) -- List of metrics to compute for this task.</paramsdesc><paramgroups>0</paramgroups></docstring>
Configuration dataclass for a LightevalTask.
This class stores all the configuration parameters needed to define and run
an evaluation task, including dataset information, prompt formatting,
evaluation metrics, and generation parameters.
Dataset Configuration:
- **hf_revision** (str | None, optional) -- Specific dataset revision to use. Defaults to None (latest).
- **hf_filter** (Callable[[dict], bool] | None, optional) -- Filter function to apply to dataset items. Defaults to None.
- **hf_avail_splits** (ListLike[str], optional) -- Available dataset splits. Defaults to ["train", "validation", "test"].

Evaluation Splits:
- **evaluation_splits** (ListLike[str], optional) -- Dataset splits to use for evaluation. Defaults to ["validation"].
- **few_shots_split** (str | None, optional) -- Split to sample few-shot examples from. Defaults to None.
- **few_shots_select** (str | None, optional) -- Method for selecting few-shot examples. Defaults to None.

Generation Parameters:
- **generation_size** (int | None, optional) -- Maximum token length for generated text. Defaults to None.
- **generation_grammar** (TextGenerationInputGrammarType | None, optional) -- Grammar for structured text generation. Only available for TGI and Inference Endpoints models. Defaults to None.
- **stop_sequence** (ListLike[str] | None, optional) -- Sequences that stop text generation. Defaults to None.
- **num_samples** (list[int] | None, optional) -- Number of samples to generate per input. Defaults to None.

Task Configuration:
- **suite** (ListLike[str], optional) -- Evaluation suites this task belongs to. Defaults to ["custom"].
- **version** (int, optional) -- Task version number. Increment when the dataset or prompt changes. Defaults to 0.
- **num_fewshots** (int, optional) -- Number of few-shot examples to include. Defaults to 0.
- **truncate_fewshots** (bool, optional) -- Whether to truncate few-shot examples. Defaults to False.
- **must_remove_duplicate_docs** (bool, optional) -- Whether to remove duplicate documents. Defaults to False.

Document Tracking:
- **original_num_docs** (int, optional) -- Total number of documents in the task. Defaults to -1.
- **effective_num_docs** (int, optional) -- Number of documents actually used in evaluation. Defaults to -1.
</div>
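To show how a prompt function plugs into a task configuration, here is a minimal, self-contained sketch. `SimpleDoc` and `demo_prompt_function` are hypothetical stand-ins (the real code would return `lighteval.tasks.requests.Doc` and pass the function as `prompt_function` to `LightevalTaskConfig`); the dataset row is invented for illustration.

```python
from dataclasses import dataclass

# SimpleDoc is a hypothetical stand-in for lighteval.tasks.requests.Doc,
# used here only so the sketch stays self-contained.
@dataclass
class SimpleDoc:
    query: str
    choices: list
    gold_index: int
    task_name: str = ""

def demo_prompt_function(line: dict, task_name: str) -> SimpleDoc:
    # Mirrors the Callable[[dict, str], Doc] signature expected for
    # prompt_function: one dataset row in, one evaluation document out.
    return SimpleDoc(
        query=f"Question: {line['question']}\nAnswer:",
        choices=line["choices"],
        gold_index=line["answer"],
        task_name=task_name,
    )

row = {"question": "2 + 2 = ?", "choices": ["3", "4", "5"], "answer": 1}
doc = demo_prompt_function(row, "demo_task")
```

With lighteval installed, the same function would be passed as `prompt_function=demo_prompt_function` alongside `name`, `hf_repo`, `hf_subset`, and `metrics` when building a `LightevalTaskConfig`.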
### LightevalTask[[lighteval.tasks.lighteval_task.LightevalTask]]
<div class="docstring border-l-2 border-t-2 pl-4 pt-3.5 border-gray-100 rounded-tl-xl mb-6 mt-8">
<docstring><name>class lighteval.tasks.lighteval_task.LightevalTask</name><anchor>lighteval.tasks.lighteval_task.LightevalTask</anchor><source>https://github.com/huggingface/lighteval/blob/vr_980/src/lighteval/tasks/lighteval_task.py#L193</source><parameters>[{"name": "config", "val": ": LightevalTaskConfig"}]</parameters></docstring>
<div class="docstring border-l-2 border-t-2 pl-4 pt-3.5 border-gray-100 rounded-tl-xl mb-6 mt-8">
<docstring><name>aggregation</name><anchor>lighteval.tasks.lighteval_task.LightevalTask.aggregation</anchor><source>https://github.com/huggingface/lighteval/blob/vr_980/src/lighteval/tasks/lighteval_task.py#L397</source><parameters>[]</parameters></docstring>
Returns a dict mapping each metric name to its aggregation function, for all
metrics of the task.
</div>
<div class="docstring border-l-2 border-t-2 pl-4 pt-3.5 border-gray-100 rounded-tl-xl mb-6 mt-8">
<docstring><name>download_dataset_worker</name><anchor>lighteval.tasks.lighteval_task.LightevalTask.download_dataset_worker</anchor><source>https://github.com/huggingface/lighteval/blob/vr_980/src/lighteval/tasks/lighteval_task.py#L429</source><parameters>[{"name": "task", "val": ": LightevalTask"}]</parameters><paramsdesc>- **task** (LightevalTask) -- The task object containing dataset configuration.</paramsdesc><paramgroups>0</paramgroups><rettype>DatasetDict</rettype><retdesc>The loaded dataset dictionary containing all splits.</retdesc></docstring>
Worker function to download a dataset from the HuggingFace Hub.
Downloads the dataset specified in the task configuration, optionally
applies a filter if configured, and returns the dataset dictionary.
This method is designed to be used for parallel dataset loading.
</div>
<div class="docstring border-l-2 border-t-2 pl-4 pt-3.5 border-gray-100 rounded-tl-xl mb-6 mt-8">
<docstring><name>eval_docs</name><anchor>lighteval.tasks.lighteval_task.LightevalTask.eval_docs</anchor><source>https://github.com/huggingface/lighteval/blob/vr_980/src/lighteval/tasks/lighteval_task.py#L340</source><parameters>[]</parameters><rettype>list[Doc]</rettype><retdesc>Evaluation documents.</retdesc></docstring>
Returns the evaluation documents.
</div>
<div class="docstring border-l-2 border-t-2 pl-4 pt-3.5 border-gray-100 rounded-tl-xl mb-6 mt-8">
<docstring><name>fewshot_docs</name><anchor>lighteval.tasks.lighteval_task.LightevalTask.fewshot_docs</anchor><source>https://github.com/huggingface/lighteval/blob/vr_980/src/lighteval/tasks/lighteval_task.py#L321</source><parameters>[]</parameters><rettype>list[Doc]</rettype><retdesc>Documents that will be used for few shot examples. One
document = one few shot example.</retdesc></docstring>
Returns the few-shot documents. If they are not already available, they are
taken from the few-shot split or, failing that, from the evaluation split.
</div>
<div class="docstring border-l-2 border-t-2 pl-4 pt-3.5 border-gray-100 rounded-tl-xl mb-6 mt-8">
<docstring><name>get_docs</name><anchor>lighteval.tasks.lighteval_task.LightevalTask.get_docs</anchor><source>https://github.com/huggingface/lighteval/blob/vr_980/src/lighteval/tasks/lighteval_task.py#L352</source><parameters>[{"name": "max_samples", "val": ": int | None = None"}]</parameters><paramsdesc>- **max_samples** (int | None, optional) -- Maximum number of documents to return.
If None, returns all available documents. Defaults to None.</paramsdesc><paramgroups>0</paramgroups><rettype>list[Doc]</rettype><retdesc>List of documents ready for evaluation with few-shot examples
and generation parameters configured.</retdesc><raises>- ``ValueError`` -- If no documents are available for evaluation.</raises><raisederrors>``ValueError``</raisederrors></docstring>
Get evaluation documents with few-shot examples and generation parameters configured.
Retrieves evaluation documents, optionally limits the number of samples,
shuffles them for reproducibility, and configures each document with
few-shot examples and generation parameters for evaluation.
</div>
<div class="docstring border-l-2 border-t-2 pl-4 pt-3.5 border-gray-100 rounded-tl-xl mb-6 mt-8">
<docstring><name>get_first_possible_fewshot_splits</name><anchor>lighteval.tasks.lighteval_task.LightevalTask.get_first_possible_fewshot_splits</anchor><source>https://github.com/huggingface/lighteval/blob/vr_980/src/lighteval/tasks/lighteval_task.py#L247</source><parameters>[{"name": "available_splits", "val": ": list[str] | tuple[str, ...]"}]</parameters><rettype>str</rettype><retdesc>the first available fewshot splits or None if nothing is available</retdesc></docstring>
Checks the candidate few-shot split keys in priority order (train, then
validation) against the available splits and returns the first match, or
None if no candidate is available.
</div>
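The selection order described above can be sketched as a small helper; this is an illustrative reimplementation under stated assumptions, not the library's code:

```python
def first_possible_fewshot_split(available_splits):
    # Check preferred few-shot keys in priority order and return the first
    # available split whose name starts with one of them; None otherwise.
    for preferred in ("train", "validation"):
        for split in available_splits:
            if split.startswith(preferred):
                return split
    return None
```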
<div class="docstring border-l-2 border-t-2 pl-4 pt-3.5 border-gray-100 rounded-tl-xl mb-6 mt-8">
<docstring><name>load_datasets</name><anchor>lighteval.tasks.lighteval_task.LightevalTask.load_datasets</anchor><source>https://github.com/huggingface/lighteval/blob/vr_980/src/lighteval/tasks/lighteval_task.py#L406</source><parameters>[{"name": "tasks", "val": ": dict"}, {"name": "dataset_loading_processes", "val": ": int = 1"}]</parameters><paramsdesc>- **tasks** (dict[str, LightevalTask]) -- Dictionary mapping task names to task objects.
- **dataset_loading_processes** (int, optional) -- Number of processes to use for
parallel dataset loading. Defaults to 1 (sequential loading).</paramsdesc><paramgroups>0</paramgroups></docstring>
Load datasets from the HuggingFace Hub for the given tasks.
</div></div>
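The sequential-versus-parallel loading pattern can be sketched with a worker pool. `fake_download` below is a hypothetical stand-in for `download_dataset_worker` (the real worker loads the task's `hf_repo`/`hf_subset` from the Hub), and `load_all` is an invented name, not the library's API:

```python
from concurrent.futures import ThreadPoolExecutor

def fake_download(task_name: str) -> dict:
    # Hypothetical stand-in for LightevalTask.download_dataset_worker.
    return {"task": task_name, "splits": ["train", "validation", "test"]}

def load_all(task_names, dataset_loading_processes: int = 1):
    if dataset_loading_processes <= 1:
        # dataset_loading_processes = 1 means sequential loading
        return [fake_download(name) for name in task_names]
    # Otherwise, fan the downloads out over a pool of workers.
    with ThreadPoolExecutor(max_workers=dataset_loading_processes) as pool:
        return list(pool.map(fake_download, task_names))

datasets = load_all(["taskA", "taskB"], dataset_loading_processes=2)
```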
## PromptManager[[lighteval.tasks.prompt_manager.PromptManager]]
<div class="docstring border-l-2 border-t-2 pl-4 pt-3.5 border-gray-100 rounded-tl-xl mb-6 mt-8">
<docstring><name>class lighteval.tasks.prompt_manager.PromptManager</name><anchor>lighteval.tasks.prompt_manager.PromptManager</anchor><source>https://github.com/huggingface/lighteval/blob/vr_980/src/lighteval/tasks/prompt_manager.py#L42</source><parameters>[{"name": "use_chat_template", "val": ": bool = False"}, {"name": "tokenizer", "val": " = None"}, {"name": "system_prompt", "val": ": str | None = None"}]</parameters></docstring>
<div class="docstring border-l-2 border-t-2 pl-4 pt-3.5 border-gray-100 rounded-tl-xl mb-6 mt-8">
<docstring><name>prepare_prompt</name><anchor>lighteval.tasks.prompt_manager.PromptManager.prepare_prompt</anchor><source>https://github.com/huggingface/lighteval/blob/vr_980/src/lighteval/tasks/prompt_manager.py#L48</source><parameters>[{"name": "doc", "val": ": Doc"}]</parameters><rettype>str</rettype><retdesc>The formatted prompt string</retdesc></docstring>
Prepare a prompt from a document, either using chat template or plain text format.
</div>
<div class="docstring border-l-2 border-t-2 pl-4 pt-3.5 border-gray-100 rounded-tl-xl mb-6 mt-8">
<docstring><name>prepare_prompt_api</name><anchor>lighteval.tasks.prompt_manager.PromptManager.prepare_prompt_api</anchor><source>https://github.com/huggingface/lighteval/blob/vr_980/src/lighteval/tasks/prompt_manager.py#L88</source><parameters>[{"name": "doc", "val": ": Doc"}]</parameters><rettype>list[dict[str, str]]</rettype><retdesc>List of message dictionaries for API calls</retdesc></docstring>
Prepare a prompt for API calls, using a chat-like message format.
The messages are not tokenized, since APIs usually handle tokenization themselves.
</div></div>
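The message structure returned for API calls can be sketched as follows. This is an assumption-based illustration of the chat-style list of role/content dicts, not `prepare_prompt_api`'s exact output, and `build_api_messages` is an invented helper name:

```python
from typing import Optional

def build_api_messages(query: str, system_prompt: Optional[str] = None):
    # Chat-style message list: an optional system message followed by the
    # user query. Messages are left untokenized; the API handles that.
    messages = []
    if system_prompt is not None:
        messages.append({"role": "system", "content": system_prompt})
    messages.append({"role": "user", "content": query})
    return messages

msgs = build_api_messages("What is the capital of France?",
                          system_prompt="Answer concisely.")
```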
## Registry[[lighteval.tasks.registry.Registry]]
<div class="docstring border-l-2 border-t-2 pl-4 pt-3.5 border-gray-100 rounded-tl-xl mb-6 mt-8">
<docstring><name>class lighteval.tasks.registry.Registry</name><anchor>lighteval.tasks.registry.Registry</anchor><source>https://github.com/huggingface/lighteval/blob/vr_980/src/lighteval/tasks/registry.py#L111</source><parameters>[{"name": "tasks", "val": ": str | pathlib.Path | None = None"}, {"name": "custom_tasks", "val": ": str | pathlib.Path | module | None = None"}, {"name": "load_community", "val": ": bool = False"}, {"name": "load_extended", "val": ": bool = False"}, {"name": "load_multilingual", "val": ": bool = False"}]</parameters></docstring>
The Registry class is used to manage the task registry and get task classes.
<div class="docstring border-l-2 border-t-2 pl-4 pt-3.5 border-gray-100 rounded-tl-xl mb-6 mt-8">
<docstring><name>create_custom_tasks_module</name><anchor>lighteval.tasks.registry.Registry.create_custom_tasks_module</anchor><source>https://github.com/huggingface/lighteval/blob/vr_980/src/lighteval/tasks/registry.py#L378</source><parameters>[{"name": "custom_tasks", "val": ": str | pathlib.Path | module"}]</parameters><paramsdesc>- **custom_tasks** (Optional[Union[str, ModuleType]]) -- Path to the custom tasks file or name of a module to import containing custom tasks or the module itself</paramsdesc><paramgroups>0</paramgroups><rettype>ModuleType</rettype><retdesc>The newly imported/created custom tasks modules</retdesc></docstring>
Creates a custom task module to load tasks defined by the user in their own file.
</div>
<div class="docstring border-l-2 border-t-2 pl-4 pt-3.5 border-gray-100 rounded-tl-xl mb-6 mt-8">
<docstring><name>create_task_config_dict</name><anchor>lighteval.tasks.registry.Registry.create_task_config_dict</anchor><source>https://github.com/huggingface/lighteval/blob/vr_980/src/lighteval/tasks/registry.py#L403</source><parameters>[{"name": "meta_table", "val": ": list[lighteval.tasks.lighteval_task.LightevalTaskConfig] | None = None"}]</parameters><paramsdesc>- **meta_table** (list[LightevalTaskConfig] | None, optional) -- Table of task
configurations. If not provided, it is loaded from TABLE_PATH.</paramsdesc><paramgroups>0</paramgroups><rettype>Dict[str, LightevalTaskConfig]</rettype><retdesc>A dictionary mapping task names to their corresponding LightevalTaskConfig.</retdesc></docstring>
Create configuration tasks based on the provided meta_table.
</div>
<div class="docstring border-l-2 border-t-2 pl-4 pt-3.5 border-gray-100 rounded-tl-xl mb-6 mt-8">
<docstring><name>print_all_tasks</name><anchor>lighteval.tasks.registry.Registry.print_all_tasks</anchor><source>https://github.com/huggingface/lighteval/blob/vr_980/src/lighteval/tasks/registry.py#L425</source><parameters>[{"name": "suites", "val": ": str | None = None"}]</parameters><paramsdesc>- **suites** -- Comma-separated list of suites to display. If None, shows core suites only.
Use 'all' to show all available suites (core + optional).
Special handling for 'multilingual' suite with dependency checking.</paramsdesc><paramgroups>0</paramgroups></docstring>
Print all the tasks in the task registry.
</div></div>
## Doc[[lighteval.tasks.requests.Doc]]
<div class="docstring border-l-2 border-t-2 pl-4 pt-3.5 border-gray-100 rounded-tl-xl mb-6 mt-8">
<docstring><name>class lighteval.tasks.requests.Doc</name><anchor>lighteval.tasks.requests.Doc</anchor><source>https://github.com/huggingface/lighteval/blob/vr_980/src/lighteval/tasks/requests.py#L44</source><parameters>[{"name": "query", "val": ": str"}, {"name": "choices", "val": ": list"}, {"name": "gold_index", "val": ": typing.Union[int, list[int]]"}, {"name": "instruction", "val": ": str | None = None"}, {"name": "images", "val": ": list['Image'] | None = None"}, {"name": "specific", "val": ": dict | None = None"}, {"name": "unconditioned_query", "val": ": str | None = None"}, {"name": "original_query", "val": ": str | None = None"}, {"name": "id", "val": ": str = ''"}, {"name": "task_name", "val": ": str = ''"}, {"name": "fewshot_samples", "val": ": list = <factory>"}, {"name": "sampling_methods", "val": ": list = <factory>"}, {"name": "fewshot_sorting_class", "val": ": str | None = None"}, {"name": "generation_size", "val": ": int | None = None"}, {"name": "stop_sequences", "val": ": list[str] | None = None"}, {"name": "use_logits", "val": ": bool = False"}, {"name": "num_samples", "val": ": int = 1"}, {"name": "generation_grammar", "val": ": None = None"}]</parameters><paramsdesc>- **query** (str) --
The main query, prompt, or question to be sent to the model.
- **choices** (list[str]) --
List of possible answer choices for the query.
For multiple choice tasks, this contains all options (A, B, C, D, etc.).
For generative tasks, this may be empty or contain reference answers.
- **gold_index** (Union[int, list[int]]) --
Index or indices of the correct answer(s) in the choices list.
For a single correct answer, use an int (e.g., 0 for the first choice).
For multiple correct answers, use a list (e.g., [0, 2] for first and third).
- **instruction** (str | None) --
System prompt or task-specific instructions to guide the model.
This is typically prepended to the query to set context or behavior.
- **images** (list["Image"] | None) --
List of PIL Image objects for multimodal tasks.
- **specific** (dict | None) --
Task-specific information or metadata.
Can contain any additional data needed for evaluation.
- **unconditioned_query** (Optional[str]) --
Query without task-specific context for PMI normalization.
Used to calculate: log P(choice | Query) - log P(choice | Unconditioned Query).
- **original_query** (str | None) --
The query before any preprocessing or modification.
- **Set by task parameters (not by the user)** --
- **id** (str) --
Unique identifier for this evaluation instance.
Set by the task and not the user.
- **task_name** (str) --
Name of the task or benchmark this Doc belongs to.
- **Few-shot learning parameters** --
- **fewshot_samples** (list) --
List of Doc objects representing few-shot examples.
These examples are prepended to the main query to provide context.
- **sampling_methods** (list[SamplingMethod]) --
List of sampling methods to use for this instance.
Options: GENERATIVE, LOGPROBS, PERPLEXITY.
- **fewshot_sorting_class** (Optional[str]) --
Class label for balanced few-shot example selection.
Used to ensure diverse representation in few-shot examples.
- **Generation control parameters** --
- **generation_size** (int | None) --
Maximum number of tokens to generate for this instance.
- **stop_sequences** (list[str] | None) --
List of strings that should stop generation when encountered.
**Used for**: Controlled generation, preventing unwanted continuations.
- **use_logits** (bool) --
Whether to return logits (raw model outputs) in addition to text.
**Used for**: Probability analysis, confidence scoring, detailed evaluation.
- **num_samples** (int) --
Number of different samples to generate for this instance.
**Used for**: Diversity analysis, uncertainty estimation, ensemble methods.
- **generation_grammar** (None) --
Grammar constraints for generation (currently not implemented).
**Reserved for**: Future structured generation features.</paramsdesc><paramgroups>0</paramgroups></docstring>
Dataclass representing a single evaluation sample for a benchmark.
This class encapsulates all the information needed to evaluate a model on a single
task instance. It contains the input query, expected outputs, metadata, and
configuration parameters for different types of evaluation tasks.
**Required Fields:**
- `query`: The input prompt or question
- `choices`: Available answer choices (for multiple choice tasks)
- `gold_index`: Index(es) of the correct answer(s)
**Optional Fields:**
- `instruction`: Task-specific system prompt. It is appended to the model-specific system prompt.
- `images`: Visual inputs for multimodal tasks.
Methods:
get_golds():
Returns the correct answer(s) as strings based on gold_index.
Handles both single and multiple correct answers.
Usage Examples:
**Multiple Choice Question:**
<ExampleCodeBlock anchor="lighteval.tasks.requests.Doc.example">
```python
doc = Doc(
    query="What is the capital of France?",
    choices=["London", "Paris", "Berlin", "Madrid"],
    gold_index=1,  # Paris is the correct answer
    instruction="Answer the following geography question:",
)
```
</ExampleCodeBlock>
**Generative Task:**
<ExampleCodeBlock anchor="lighteval.tasks.requests.Doc.example-2">
```python
doc = Doc(
    query="Write a short story about a robot.",
    choices=[],  # No predefined choices for generative tasks
    gold_index=0,  # Not used for generative tasks
    generation_size=100,
    stop_sequences=["\nEnd"],
)
```
</ExampleCodeBlock>
**Few-shot Learning:**
<ExampleCodeBlock anchor="lighteval.tasks.requests.Doc.example-3">
```python
doc = Doc(
    query="Translate 'Hello world' to Spanish.",
    choices=["Hola mundo", "Bonjour monde", "Ciao mondo"],
    gold_index=0,
    fewshot_samples=[
        Doc(
            query="Translate 'Good morning' to Spanish.",
            choices=["Buenos días", "Bonjour", "Buongiorno"],
            gold_index=0,
        ),
        Doc(
            query="Translate 'Thank you' to Spanish.",
            choices=["Gracias", "Merci", "Grazie"],
            gold_index=0,
        ),
    ],
)
```
</ExampleCodeBlock>
**Multimodal Task:**
<ExampleCodeBlock anchor="lighteval.tasks.requests.Doc.example-4">
```python
doc = Doc(
    query="What is shown in this image?",
    choices=["A cat"],
    gold_index=0,
    images=[pil_image],  # PIL Image object
)
```
</ExampleCodeBlock>
<div class="docstring border-l-2 border-t-2 pl-4 pt-3.5 border-gray-100 rounded-tl-xl mb-6 mt-8">
<docstring><name>get_golds</name><anchor>lighteval.tasks.requests.Doc.get_golds</anchor><source>https://github.com/huggingface/lighteval/blob/vr_980/src/lighteval/tasks/requests.py#L217</source><parameters>[]</parameters></docstring>
Returns the gold target strings selected from the document's choices by gold_index, handling both single and multiple correct answers.
</div></div>
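The gold-extraction behavior described above (a single int index or a list of indices, both resolved against `choices`) can be sketched as a standalone function; this is an illustrative version, not the library's implementation:

```python
def get_golds(choices, gold_index):
    # Accept a single int index or a list of indices and return the
    # corresponding gold answer strings from choices.
    indices = gold_index if isinstance(gold_index, list) else [gold_index]
    return [choices[i] for i in indices]
```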
## Datasets[[lighteval.data.DynamicBatchDataset]]
<div class="docstring border-l-2 border-t-2 pl-4 pt-3.5 border-gray-100 rounded-tl-xl mb-6 mt-8">
<docstring><name>class lighteval.data.DynamicBatchDataset</name><anchor>lighteval.data.DynamicBatchDataset</anchor><source>https://github.com/huggingface/lighteval/blob/vr_980/src/lighteval/data.py#L44</source><parameters>[{"name": "requests", "val": ": list"}, {"name": "num_dataset_splits", "val": ": int"}]</parameters></docstring>
<div class="docstring border-l-2 border-t-2 pl-4 pt-3.5 border-gray-100 rounded-tl-xl mb-6 mt-8">
<docstring><name>get_original_order</name><anchor>lighteval.data.DynamicBatchDataset.get_original_order</anchor><source>https://github.com/huggingface/lighteval/blob/vr_980/src/lighteval/data.py#L88</source><parameters>[{"name": "new_arr", "val": ": list"}]</parameters><paramsdesc>- **new_arr** (list) -- Array containing any kind of data that needs to be
reset in the original order.</paramsdesc><paramgroups>0</paramgroups><rettype>list</rettype><retdesc>new_arr in the original order.</retdesc></docstring>
Get the original order of the data.
</div>
<div class="docstring border-l-2 border-t-2 pl-4 pt-3.5 border-gray-100 rounded-tl-xl mb-6 mt-8">
<docstring><name>splits_iterator</name><anchor>lighteval.data.DynamicBatchDataset.splits_iterator</anchor><source>https://github.com/huggingface/lighteval/blob/vr_980/src/lighteval/data.py#L110</source><parameters>[]</parameters><yieldtype>Subset</yieldtype><yielddesc>A subset of the dataset.</yielddesc></docstring>
Iterator that yields the dataset splits based on the split limits.
</div></div>
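The sort-then-restore round trip behind `get_original_order` can be sketched with an explicit permutation. Both helper names below are invented for illustration; the real dataset tracks its sorted order internally:

```python
def sort_and_remember(items, key):
    # Sort items by key and remember the permutation, so results computed
    # on the sorted order can be mapped back later.
    order = sorted(range(len(items)), key=lambda i: key(items[i]))
    return [items[i] for i in order], order

def get_original_order(new_arr, order):
    # Place new_arr (aligned with the sorted order) back into the original
    # positions, mirroring DynamicBatchDataset.get_original_order.
    original = [None] * len(new_arr)
    for sorted_pos, original_pos in enumerate(order):
        original[original_pos] = new_arr[sorted_pos]
    return original

prompts = ["a long prompt here", "hi", "medium one"]
sorted_prompts, order = sort_and_remember(prompts, key=len)
results = [p.upper() for p in sorted_prompts]  # processed in sorted order
restored = get_original_order(results, order)  # mapped back
```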
<div class="docstring border-l-2 border-t-2 pl-4 pt-3.5 border-gray-100 rounded-tl-xl mb-6 mt-8">
<docstring><name>class lighteval.data.LoglikelihoodDataset</name><anchor>lighteval.data.LoglikelihoodDataset</anchor><source>https://github.com/huggingface/lighteval/blob/vr_980/src/lighteval/data.py#L161</source><parameters>[{"name": "requests", "val": ": list"}, {"name": "num_dataset_splits", "val": ": int"}]</parameters></docstring>
</div>
<div class="docstring border-l-2 border-t-2 pl-4 pt-3.5 border-gray-100 rounded-tl-xl mb-6 mt-8">
<docstring><name>class lighteval.data.GenerativeTaskDataset</name><anchor>lighteval.data.GenerativeTaskDataset</anchor><source>https://github.com/huggingface/lighteval/blob/vr_980/src/lighteval/data.py#L186</source><parameters>[{"name": "requests", "val": ": list"}, {"name": "num_dataset_splits", "val": ": int"}]</parameters></docstring>
<div class="docstring border-l-2 border-t-2 pl-4 pt-3.5 border-gray-100 rounded-tl-xl mb-6 mt-8">
<docstring><name>init_split_limits</name><anchor>lighteval.data.GenerativeTaskDataset.init_split_limits</anchor><source>https://github.com/huggingface/lighteval/blob/vr_980/src/lighteval/data.py#L187</source><parameters>[{"name": "num_dataset_splits", "val": ""}]</parameters><paramsdesc>- **num_dataset_splits** (_type_) -- _description_</paramsdesc><paramgroups>0</paramgroups><rettype>_type_</rettype><retdesc>_description_</retdesc></docstring>
Initialises the split limits based on generation parameters.
The splits are used to estimate time remaining when evaluating, and in the case of generative evaluations, to group similar samples together.
For generative tasks, self._sorting_criteria outputs:
- a boolean (whether the generation task uses logits)
- a list (the stop sequences)
- the item length (the actual size sorting factor).
In the current function, we create evaluation groups by generation parameters (logits and eos), so that samples with similar properties get batched together afterwards.
The samples will then be further organised by length in each split.
</div></div>
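The grouping step described above can be sketched as: bucket samples by their generation parameters (logits flag and stop sequences), then sort each bucket by length so similar requests batch together. This is an illustrative reconstruction with invented sample dicts, not the library's code:

```python
from collections import defaultdict

def make_generation_splits(samples):
    # Bucket samples by generation parameters, then sort each bucket by
    # prompt length so similar requests end up batched together.
    groups = defaultdict(list)
    for sample in samples:
        key = (sample["use_logits"], tuple(sample["stop_sequences"]))
        groups[key].append(sample)
    return [sorted(group, key=lambda s: len(s["query"]))
            for group in groups.values()]

samples = [
    {"query": "a fairly long question", "use_logits": False, "stop_sequences": ["\n"]},
    {"query": "hi", "use_logits": False, "stop_sequences": ["\n"]},
    {"query": "logprob item", "use_logits": True, "stop_sequences": []},
]
splits = make_generation_splits(samples)
```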
<div class="docstring border-l-2 border-t-2 pl-4 pt-3.5 border-gray-100 rounded-tl-xl mb-6 mt-8">
<docstring><name>class lighteval.data.GenerativeTaskDatasetNanotron</name><anchor>lighteval.data.GenerativeTaskDatasetNanotron</anchor><source>https://github.com/huggingface/lighteval/blob/vr_980/src/lighteval/data.py#L254</source><parameters>[{"name": "requests", "val": ": list"}, {"name": "num_dataset_splits", "val": ": int"}]</parameters></docstring>
</div>
<div class="docstring border-l-2 border-t-2 pl-4 pt-3.5 border-gray-100 rounded-tl-xl mb-6 mt-8">
<docstring><name>class lighteval.data.GenDistributedSampler</name><anchor>lighteval.data.GenDistributedSampler</anchor><source>https://github.com/huggingface/lighteval/blob/vr_980/src/lighteval/data.py#L270</source><parameters>[{"name": "dataset", "val": ": Dataset"}, {"name": "num_replicas", "val": ": typing.Optional[int] = None"}, {"name": "rank", "val": ": typing.Optional[int] = None"}, {"name": "shuffle", "val": ": bool = True"}, {"name": "seed", "val": ": int = 0"}, {"name": "drop_last", "val": ": bool = False"}]</parameters></docstring>
A distributed sampler that copies the last element when drop_last is False, so that batch padding stays small given that our samples are sorted by length.
</div>
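The padding behavior can be sketched as follows: when the dataset size is not divisible by the number of replicas, repeat the last index rather than wrapping back to the (much shorter) start of a length-sorted dataset. `padded_shard_indices` is an invented illustration of the idea, not the sampler's actual code:

```python
def padded_shard_indices(dataset_len, num_replicas, rank):
    # Pad by repeating the LAST index (as in GenDistributedSampler) rather
    # than cycling back to index 0, so length-sorted batches stay homogeneous.
    per_replica = -(-dataset_len // num_replicas)  # ceiling division
    total = per_replica * num_replicas
    indices = list(range(dataset_len))
    indices += [dataset_len - 1] * (total - dataset_len)  # copy last element
    return indices[rank:total:num_replicas]  # interleaved shard for this rank

shard0 = padded_shard_indices(dataset_len=5, num_replicas=2, rank=0)
shard1 = padded_shard_indices(dataset_len=5, num_replicas=2, rank=1)
```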