# Doc[[lighteval.tasks.requests.Doc]]
<div class="docstring border-l-2 border-t-2 pl-4 pt-3.5 border-gray-100 rounded-tl-xl mb-6 mt-8">
<docstring><name>class lighteval.tasks.requests.Doc</name><anchor>lighteval.tasks.requests.Doc</anchor><source>https://github.com/huggingface/lighteval/blob/vr_994/src/lighteval/tasks/requests.py#L44</source><parameters>[{"name": "query", "val": ": str"}, {"name": "choices", "val": ": list"}, {"name": "gold_index", "val": ": typing.Union[int, list[int]]"}, {"name": "instruction", "val": ": str | None = None"}, {"name": "images", "val": ": list['Image'] | None = None"}, {"name": "specific", "val": ": dict | None = None"}, {"name": "unconditioned_query", "val": ": str | None = None"}, {"name": "original_query", "val": ": str | None = None"}, {"name": "id", "val": ": str = ''"}, {"name": "task_name", "val": ": str = ''"}, {"name": "fewshot_samples", "val": ": list = <factory>"}, {"name": "sampling_methods", "val": ": list = <factory>"}, {"name": "fewshot_sorting_class", "val": ": str | None = None"}, {"name": "generation_size", "val": ": int | None = None"}, {"name": "stop_sequences", "val": ": list[str] | None = None"}, {"name": "use_logits", "val": ": bool = False"}, {"name": "num_samples", "val": ": int = 1"}, {"name": "generation_grammar", "val": ": None = None"}]</parameters><paramsdesc>- **query** (str) --
The main query, prompt, or question to be sent to the model.
- **choices** (list[str]) --
List of possible answer choices for the query.
For multiple choice tasks, this contains all options (A, B, C, D, etc.).
For generative tasks, this may be empty or contain reference answers.
- **gold_index** (Union[int, list[int]]) --
Index or indices of the correct answer(s) in the choices list.
For single correct answers, use an int (e.g., 0 for the first choice).
For multiple correct answers, use a list (e.g., [0, 2] for first and third).
- **instruction** (str | None) --
System prompt or task-specific instructions to guide the model.
This is typically prepended to the query to set context or behavior.
- **images** (list["Image"] | None) --
List of PIL Image objects for multimodal tasks.
- **specific** (dict | None) --
Task-specific information or metadata.
Can contain any additional data needed for evaluation.
- **unconditioned_query** (Optional[str]) --
Query without task-specific context for PMI normalization.
Used to calculate: log P(choice | Query) - log P(choice | Unconditioned Query).
- **original_query** (str | None) --
The query before any preprocessing or modification.
- **Set by task parameters** --
- **id** (str) --
Unique identifier for this evaluation instance.
Set by the task and not the user.
- **task_name** (str) --
Name of the task or benchmark this Doc belongs to.
- **Few-shot Learning Parameters** --
- **fewshot_samples** (list) --
List of Doc objects representing few-shot examples.
These examples are prepended to the main query to provide context.
- **sampling_methods** (list[SamplingMethod]) --
List of sampling methods to use for this instance.
Options: GENERATIVE, LOGPROBS, PERPLEXITY.
- **fewshot_sorting_class** (Optional[str]) --
Class label for balanced few-shot example selection.
Used to ensure diverse representation in few-shot examples.
- **Generation Control Parameters** --
- **generation_size** (int | None) --
Maximum number of tokens to generate for this instance.
- **stop_sequences** (list[str] | None) --
List of strings that should stop generation when encountered.
**Used for**: Controlled generation, preventing unwanted continuations.
- **use_logits** (bool) --
Whether to return logits (raw model outputs) in addition to text.
**Used for**: Probability analysis, confidence scoring, detailed evaluation.
- **num_samples** (int) --
Number of different samples to generate for this instance.
**Used for**: Diversity analysis, uncertainty estimation, ensemble methods.
- **generation_grammar** (None) --
Grammar constraints for generation (currently not implemented).
**Reserved for**: Future structured generation features.</paramsdesc><paramgroups>0</paramgroups></docstring>
Dataclass representing a single evaluation sample for a benchmark.
This class encapsulates all the information needed to evaluate a model on a single
task instance. It contains the input query, expected outputs, metadata, and
configuration parameters for different types of evaluation tasks.
**Required Fields:**
- `query`: The input prompt or question
- `choices`: Available answer choices (for multiple choice tasks)
- `gold_index`: Index(es) of the correct answer(s)
**Optional Fields:**
- `instruction`: Task-specific system prompt, appended to the model-specific system prompt.
- `images`: Visual inputs for multimodal tasks.
Methods:
get_golds():
Returns the correct answer(s) as strings based on gold_index.
Handles both single and multiple correct answers.
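The resolution logic can be sketched in plain Python. This is a hypothetical stand-in (`resolve_golds` is not the lighteval API), showing only how a `gold_index` that is either an int or a list of ints selects answers from `choices`:

```python
from typing import Union

def resolve_golds(choices: list[str], gold_index: Union[int, list[int]]) -> list[str]:
    # Normalize a single index into a one-element list, then look up each choice.
    indices = gold_index if isinstance(gold_index, list) else [gold_index]
    return [choices[i] for i in indices]

print(resolve_golds(["London", "Paris", "Berlin", "Madrid"], 1))   # ['Paris']
print(resolve_golds(["red", "green", "blue"], [0, 2]))             # ['red', 'blue']
```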
Usage Examples:
**Multiple Choice Question:**
<ExampleCodeBlock anchor="lighteval.tasks.requests.Doc.example">
```python
doc = Doc(
    query="What is the capital of France?",
    choices=["London", "Paris", "Berlin", "Madrid"],
    gold_index=1,  # Paris is the correct answer
    instruction="Answer the following geography question:",
)
```
</ExampleCodeBlock>
**Generative Task:**
<ExampleCodeBlock anchor="lighteval.tasks.requests.Doc.example-2">
```python
doc = Doc(
    query="Write a short story about a robot.",
    choices=[],  # No predefined choices for generative tasks
    gold_index=0,  # Not used for generative tasks
    generation_size=100,
    stop_sequences=["\nEnd"],
)
```
</ExampleCodeBlock>
**Few-shot Learning:**
<ExampleCodeBlock anchor="lighteval.tasks.requests.Doc.example-3">
```python
doc = Doc(
    query="Translate 'Hello world' to Spanish.",
    choices=["Hola mundo", "Bonjour monde", "Ciao mondo"],
    gold_index=0,
    fewshot_samples=[
        Doc(query="Translate 'Good morning' to Spanish.",
            choices=["Buenos días", "Bonjour", "Buongiorno"],
            gold_index=0),
        Doc(query="Translate 'Thank you' to Spanish.",
            choices=["Gracias", "Merci", "Grazie"],
            gold_index=0),
    ],
)
```
</ExampleCodeBlock>
**Multimodal Task:**
<ExampleCodeBlock anchor="lighteval.tasks.requests.Doc.example-4">
```python
doc = Doc(
    query="What is shown in this image?",
    choices=["A cat"],
    gold_index=0,
    images=[pil_image],  # PIL Image object
)
```
</ExampleCodeBlock>
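The `unconditioned_query` field feeds the PMI normalization described in the parameter list: score = log P(choice | query) - log P(choice | unconditioned query). A minimal sketch of that arithmetic, using made-up log-probabilities rather than real model outputs (`pmi_scores` is an illustrative helper, not part of lighteval):

```python
def pmi_scores(cond_logprobs: dict[str, float],
               uncond_logprobs: dict[str, float]) -> dict[str, float]:
    # PMI score per choice: conditioned log-prob minus unconditioned log-prob,
    # which penalizes choices that are merely likely a priori.
    return {c: cond_logprobs[c] - uncond_logprobs[c] for c in cond_logprobs}

cond = {"Paris": -0.7, "London": -1.2}    # log P(choice | query)  (illustrative)
uncond = {"Paris": -2.5, "London": -1.3}  # log P(choice | unconditioned query)
scores = pmi_scores(cond, uncond)
best = max(scores, key=scores.get)
print(best)  # Paris
```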
<div class="docstring border-l-2 border-t-2 pl-4 pt-3.5 border-gray-100 rounded-tl-xl mb-6 mt-8">
<docstring><name>get_golds</name><anchor>lighteval.tasks.requests.Doc.get_golds</anchor><source>https://github.com/huggingface/lighteval/blob/vr_994/src/lighteval/tasks/requests.py#L217</source><parameters>[]</parameters></docstring>
Return gold targets extracted from the target dict
</div></div>