# Adding a Custom Task

Lighteval provides a flexible framework for creating custom evaluation tasks. This guide explains how to create and integrate new tasks into the evaluation system.

## Step-by-Step Creation of a Task

> [!WARNING]
> To contribute your task to the Lighteval repository, you would first need
> to install the required dev dependencies by running `pip install -e .[dev]`
> and then run `pre-commit install` to install the pre-commit hooks.
### Step 1: Create the Task File

First, create a Python file or directory under the `src/lighteval/tasks/tasks` directory.
A directory is helpful if you need to split your task into multiple files; just make sure one of them is named `main.py`.
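For example, a single-file task and a directory-based task could be laid out like this (all names other than `main.py` are placeholders):

```
src/lighteval/tasks/tasks/
├── my_task.py           # single-file task
└── my_other_task/       # directory-based task
    ├── main.py          # entry point; must be named main.py
    └── utils.py         # any helper modules you need
```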
### Step 2: Define the Prompt Function

You need to define a prompt function that will convert a line from your
dataset to a document to be used for evaluation.

```python
from lighteval.tasks.requests import Doc


# Define as many as you need for your different tasks
def prompt_fn(line: dict, task_name: str):
    """Defines how to go from a dataset line to a doc object.

    Follow examples in src/lighteval/tasks/default_prompts.py, or get more info
    about what this function should do in the README.
    """
    return Doc(
        task_name=task_name,
        query=line["question"],
        choices=[f" {c}" for c in line["choices"]],
        gold_index=line["gold"],
    )
```
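As a rough illustration, the prompt function above expects each dataset line to look something like the following (the field names are whatever your dataset uses; the values here are made up):

```python
# Hypothetical dataset line matching the prompt function above
line = {
    "question": "What is the capital of France?",
    "choices": ["Paris", "London", "Rome"],
    "gold": 0,  # index of the correct choice
}

doc = prompt_fn(line, task_name="myothertask")
```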
### Step 3: Choose or Create Metrics

You can either use an existing metric (defined in `lighteval.metrics.metrics.Metrics`) or [create a custom one](adding-a-new-metric).

#### Using Existing Metrics
```python
from lighteval.metrics.metrics import Metrics

# Use an existing metric
metric = Metrics.ACCURACY
```
#### Creating Custom Metrics

```python
import numpy as np

from lighteval.metrics.utils.metric_utils import SampleLevelMetric

custom_metric = SampleLevelMetric(
    metric_name="my_custom_metric_name",
    higher_is_better=True,
    category="accuracy",
    sample_level_fn=lambda x: x,  # How to compute score for one sample
    corpus_level_fn=np.mean,  # How to aggregate the sample metrics
)
```
### Step 4: Define Your Task

You can define a task with or without subsets using [LightevalTaskConfig](/docs/lighteval/pr_1221/en/package_reference/tasks#lighteval.tasks.lighteval_task.LightevalTaskConfig).

#### Simple Task (No Subsets)

```python
from lighteval.tasks.lighteval_task import LightevalTaskConfig

# This is how you create a simple task (like HellaSwag) which has one single subset
# attached to it, and one evaluation possible.
task = LightevalTaskConfig(
    name="myothertask",
    prompt_function=prompt_fn,  # Must be defined in the file or imported
    hf_repo="your_dataset_repo_on_hf",
    hf_subset="default",
    hf_avail_splits=["train", "test"],
    evaluation_splits=["test"],
    few_shots_split="train",
    few_shots_select="random_sampling_from_train",
    metrics=[metric],  # Select your metric in Metrics
    generation_size=256,
    stop_sequence=["\n", "Question:"],
)
```
#### Task with Multiple Subsets

If you want to create a task with multiple subsets, add them to the
`SAMPLE_SUBSETS` list and create a task for each subset.

```python
SAMPLE_SUBSETS = ["subset1", "subset2", "subset3"]  # List of all the subsets to use for this eval


class CustomSubsetTask(LightevalTaskConfig):
    def __init__(
        self,
        name,
        hf_subset,
    ):
        super().__init__(
            name=name,
            hf_subset=hf_subset,
            prompt_function=prompt_fn,  # Must be defined in the file or imported
            hf_repo="your_dataset_name",
            metrics=[custom_metric],  # Select your metric in Metrics or use your custom_metric
            hf_avail_splits=["train", "test"],
            evaluation_splits=["test"],
            few_shots_split="train",
            few_shots_select="random_sampling_from_train",
            generation_size=256,
            stop_sequence=["\n", "Question:"],
        )


SUBSET_TASKS = [CustomSubsetTask(name=f"task:{subset}", hf_subset=subset) for subset in SAMPLE_SUBSETS]
```
### Step 5: Add Tasks to the Table

Then you need to add your task to the `TASKS_TABLE` list.

```python
# STORE YOUR EVALS

# Tasks with subsets:
TASKS_TABLE = SUBSET_TASKS

# Tasks without subsets:
# TASKS_TABLE = [task]
```
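Since `TASKS_TABLE` is a plain Python list, nothing stops you from registering tasks with and without subsets from the same file, for example:

```python
# Register both the subset tasks and the standalone task
TASKS_TABLE = SUBSET_TASKS + [task]
```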
### Step 6: Create a Requirements File

If your task has extra dependencies, you need to create a `requirements.txt` file
listing only the required packages so that anyone can run your task.
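As a sketch, if your prompt function depended on an extra package such as `langdetect` (a purely hypothetical example), the file would just list it:

```
# requirements.txt: list only what your task actually needs
langdetect
```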
## Running Your Custom Task

Once your file is created, you can run the evaluation with the following command:

```bash
lighteval accelerate \
    "model_name=HuggingFaceH4/zephyr-7b-beta" \
    {task} \
    --custom-tasks {path_to_your_custom_task_file}
```
### Example Usage

```bash
# Run a custom task with 3-shot evaluation
lighteval accelerate \
    "model_name=openai-community/gpt2" \
    "myothertask|3" \
    --custom-tasks community_tasks/my_custom_task.py
```