# Doc[[lighteval.tasks.requests.Doc]]
<div class="docstring border-l-2 border-t-2 pl-4 pt-3.5 border-gray-100 rounded-tl-xl mb-6 mt-8">
<docstring><name>class lighteval.tasks.requests.Doc</name><anchor>lighteval.tasks.requests.Doc</anchor><source>https://github.com/huggingface/lighteval/blob/vr_994/src/lighteval/tasks/requests.py#L44</source><parameters>[{"name": "query", "val": ": str"}, {"name": "choices", "val": ": list"}, {"name": "gold_index", "val": ": typing.Union[int, list[int]]"}, {"name": "instruction", "val": ": str | None = None"}, {"name": "images", "val": ": list['Image'] | None = None"}, {"name": "specific", "val": ": dict | None = None"}, {"name": "unconditioned_query", "val": ": str | None = None"}, {"name": "original_query", "val": ": str | None = None"}, {"name": "id", "val": ": str = ''"}, {"name": "task_name", "val": ": str = ''"}, {"name": "fewshot_samples", "val": ": list = <factory>"}, {"name": "sampling_methods", "val": ": list = <factory>"}, {"name": "fewshot_sorting_class", "val": ": str | None = None"}, {"name": "generation_size", "val": ": int | None = None"}, {"name": "stop_sequences", "val": ": list[str] | None = None"}, {"name": "use_logits", "val": ": bool = False"}, {"name": "num_samples", "val": ": int = 1"}, {"name": "generation_grammar", "val": ": None = None"}]</parameters><paramsdesc>- **query** (str) --
The main query, prompt, or question to be sent to the model.
- **choices** (list[str]) --
List of possible answer choices for the query.
For multiple choice tasks, this contains all options (A, B, C, D, etc.).
For generative tasks, this may be empty or contain reference answers.
- **gold_index** (Union[int, list[int]]) --
Index or indices of the correct answer(s) in the choices list.
For single correct answers, use an int (e.g., 0 for the first choice).
For multiple correct answers, use a list (e.g., [0, 2] for first and third).
- **instruction** (str | None) --
System prompt or task-specific instructions to guide the model.
This is typically prepended to the query to set context or behavior.
- **images** (list["Image"] | None) --
List of PIL Image objects for multimodal tasks.
- **specific** (dict | None) --
Task-specific information or metadata.
Can contain any additional data needed for evaluation.
- **unconditioned_query** (Optional[str]) --
Query without task-specific context for PMI normalization.
Used to calculate: log P(choice | Query) - log P(choice | Unconditioned Query).
- **original_query** (str | None) --
The query before any preprocessing or modification.
- **Set by task parameters** --
- **id** (str) --
Unique identifier for this evaluation instance.
Set by the task and not the user.
- **task_name** (str) --
Name of the task or benchmark this Doc belongs to.
- **Few-shot Learning Parameters** --
- **fewshot_samples** (list) --
List of Doc objects representing few-shot examples.
These examples are prepended to the main query to provide context.
- **sampling_methods** (list[SamplingMethod]) --
List of sampling methods to use for this instance.
Options: GENERATIVE, LOGPROBS, PERPLEXITY.
- **fewshot_sorting_class** (Optional[str]) --
Class label for balanced few-shot example selection.
Used to ensure diverse representation in few-shot examples.
- **Generation Control Parameters** --
- **generation_size** (int | None) --
Maximum number of tokens to generate for this instance.
- **stop_sequences** (list[str] | None) --
List of strings that should stop generation when encountered.
**Used for**: Controlled generation, preventing unwanted continuations.
- **use_logits** (bool) --
Whether to return logits (raw model outputs) in addition to text.
**Used for**: Probability analysis, confidence scoring, detailed evaluation.
- **num_samples** (int) --
Number of different samples to generate for this instance.
**Used for**: Diversity analysis, uncertainty estimation, ensemble methods.
- **generation_grammar** (None) --
Grammar constraints for generation (currently not implemented).
**Reserved for**: Future structured generation features.</paramsdesc><paramgroups>0</paramgroups></docstring>
Dataclass representing a single evaluation sample for a benchmark.
This class encapsulates all the information needed to evaluate a model on a single
task instance. It contains the input query, expected outputs, metadata, and
configuration parameters for different types of evaluation tasks.
**Required Fields:**
- `query`: The input prompt or question
- `choices`: Available answer choices (for multiple choice tasks)
- `gold_index`: Index(es) of the correct answer(s)
**Optional Fields:**
- `instruction`: Task-specific system prompt, appended to the model-specific system prompt.
- `images`: Visual inputs for multimodal tasks.
Methods:
get_golds():
Returns the correct answer(s) as strings based on gold_index.
Handles both single and multiple correct answers.
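The resolution logic can be sketched in plain Python. This is a hypothetical stand-in (`resolve_golds` is not the lighteval API), showing only how a `gold_index` that is either an int or a list of ints selects answers from `choices`:

```python
from typing import Union

def resolve_golds(choices: list[str], gold_index: Union[int, list[int]]) -> list[str]:
    # Normalize a single index into a one-element list, then look up each choice.
    indices = gold_index if isinstance(gold_index, list) else [gold_index]
    return [choices[i] for i in indices]

print(resolve_golds(["London", "Paris", "Berlin", "Madrid"], 1))   # ['Paris']
print(resolve_golds(["red", "green", "blue"], [0, 2]))             # ['red', 'blue']
```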
Usage Examples:
**Multiple Choice Question:**
<ExampleCodeBlock anchor="lighteval.tasks.requests.Doc.example">
```python
doc = Doc(
    query="What is the capital of France?",
    choices=["London", "Paris", "Berlin", "Madrid"],
    gold_index=1,  # Paris is the correct answer
    instruction="Answer the following geography question:",
)
```
</ExampleCodeBlock>
**Generative Task:**
<ExampleCodeBlock anchor="lighteval.tasks.requests.Doc.example-2">
```python
doc = Doc(
    query="Write a short story about a robot.",
    choices=[],  # No predefined choices for generative tasks
    gold_index=0,  # Not used for generative tasks
    generation_size=100,
    stop_sequences=["\nEnd"],
)
```
</ExampleCodeBlock>
**Few-shot Learning:**
<ExampleCodeBlock anchor="lighteval.tasks.requests.Doc.example-3">
```python
doc = Doc(
    query="Translate 'Hello world' to Spanish.",
    choices=["Hola mundo", "Bonjour monde", "Ciao mondo"],
    gold_index=0,
    fewshot_samples=[
        Doc(query="Translate 'Good morning' to Spanish.",
            choices=["Buenos días", "Bonjour", "Buongiorno"],
            gold_index=0),
        Doc(query="Translate 'Thank you' to Spanish.",
            choices=["Gracias", "Merci", "Grazie"],
            gold_index=0),
    ],
)
```
</ExampleCodeBlock>
**Multimodal Task:**
<ExampleCodeBlock anchor="lighteval.tasks.requests.Doc.example-4">
```python
doc = Doc(
    query="What is shown in this image?",
    choices=["A cat"],
    gold_index=0,
    images=[pil_image],  # PIL Image object
)
```
</ExampleCodeBlock>
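The `unconditioned_query` field feeds the PMI normalization described in the parameter list: score = log P(choice | query) - log P(choice | unconditioned query). A minimal sketch of that arithmetic, using made-up log-probabilities rather than real model outputs (`pmi_scores` is an illustrative helper, not part of lighteval):

```python
def pmi_scores(cond_logprobs: dict[str, float],
               uncond_logprobs: dict[str, float]) -> dict[str, float]:
    # PMI score per choice: conditioned log-prob minus unconditioned log-prob,
    # which penalizes choices that are merely likely a priori.
    return {c: cond_logprobs[c] - uncond_logprobs[c] for c in cond_logprobs}

cond = {"Paris": -0.7, "London": -1.2}    # log P(choice | query)  (illustrative)
uncond = {"Paris": -2.5, "London": -1.3}  # log P(choice | unconditioned query)
scores = pmi_scores(cond, uncond)
best = max(scores, key=scores.get)
print(best)  # Paris
```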
<div class="docstring border-l-2 border-t-2 pl-4 pt-3.5 border-gray-100 rounded-tl-xl mb-6 mt-8">
<docstring><name>get_golds</name><anchor>lighteval.tasks.requests.Doc.get_golds</anchor><source>https://github.com/huggingface/lighteval/blob/vr_994/src/lighteval/tasks/requests.py#L217</source><parameters>[]</parameters></docstring>
Return gold targets extracted from the target dict
</div></div>