# Logging
## EvaluationTracker[[lighteval.logging.evaluation_tracker.EvaluationTracker]]
<div class="docstring border-l-2 border-t-2 pl-4 pt-3.5 border-gray-100 rounded-tl-xl mb-6 mt-8">
<docstring><name>class lighteval.logging.evaluation_tracker.EvaluationTracker</name><anchor>lighteval.logging.evaluation_tracker.EvaluationTracker</anchor><source>https://github.com/huggingface/lighteval/blob/vr_980/src/lighteval/logging/evaluation_tracker.py#L95</source><parameters>[{"name": "output_dir", "val": ": str"}, {"name": "results_path_template", "val": ": str | None = None"}, {"name": "save_details", "val": ": bool = True"}, {"name": "push_to_hub", "val": ": bool = False"}, {"name": "push_to_tensorboard", "val": ": bool = False"}, {"name": "hub_results_org", "val": ": str | None = ''"}, {"name": "tensorboard_metric_prefix", "val": ": str = 'eval'"}, {"name": "public", "val": ": bool = False"}, {"name": "nanotron_run_info", "val": ": GeneralArgs = None"}, {"name": "use_wandb", "val": ": bool = False"}]</parameters><paramsdesc>- **output_dir** (str) -- Local directory to save evaluation results and logs
- **results_path_template** (str, optional) -- Template for results directory structure.
Example: "{output_dir}/results/{org}_{model}"
- **save_details** (bool, defaults to True) -- Whether to save detailed evaluation records
- **push_to_hub** (bool, defaults to False) -- Whether to push results to HF Hub
- **push_to_tensorboard** (bool, defaults to False) -- Whether to push metrics to TensorBoard
- **hub_results_org** (str, optional) -- HF Hub organization to push results to
- **tensorboard_metric_prefix** (str, defaults to "eval") -- Prefix for TensorBoard metrics
- **public** (bool, defaults to False) -- Whether to make Hub datasets public
- **nanotron_run_info** (GeneralArgs, optional) -- Nanotron model run information
- **use_wandb** (bool, defaults to False) -- Whether to log to Weights & Biases or Trackio if available</paramsdesc><paramgroups>0</paramgroups></docstring>
Tracks and manages evaluation results, metrics, and logging for model evaluations.
The EvaluationTracker coordinates multiple specialized loggers to track different aspects of model evaluation:
- Details Logger (DetailsLogger): Records per-sample evaluation details and predictions
- Metrics Logger (MetricsLogger): Tracks aggregate evaluation metrics and scores
- Versions Logger (VersionsLogger): Records task and dataset versions
- General Config Logger (GeneralConfigLogger): Stores overall evaluation configuration
- Task Config Logger (TaskConfigLogger): Maintains per-task configuration details
The tracker can save results locally and optionally push them to:
- Hugging Face Hub as datasets
- TensorBoard for visualization
- Trackio or Weights & Biases for experiment tracking
<ExampleCodeBlock anchor="lighteval.logging.evaluation_tracker.EvaluationTracker.example">
Example:
```python
tracker = EvaluationTracker(
    output_dir="./eval_results",
    push_to_hub=True,
    hub_results_org="my-org",
    save_details=True
)
# Log evaluation results
tracker.metrics_logger.add_metric("accuracy", 0.85)
tracker.details_logger.add_detail(task_name="qa", prediction="Paris")
# Save all results
tracker.save()
```
</ExampleCodeBlock>
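The `results_path_template` parameter accepts standard `str.format` placeholders, as in the `"{output_dir}/results/{org}_{model}"` example shown in the parameter list. A minimal sketch of how such a template expands (the `org` and `model` values here are illustrative, not taken from lighteval):

```python
# Expand a results-path template of the kind documented above.
# The field names (output_dir, org, model) mirror the documented example.
template = "{output_dir}/results/{org}_{model}"

path = template.format(
    output_dir="./eval_results",  # local directory passed to the tracker
    org="my-org",                 # illustrative Hub organization
    model="my-model",             # illustrative model name
)
print(path)  # → ./eval_results/results/my-org_my-model
```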
<div class="docstring border-l-2 border-t-2 pl-4 pt-3.5 border-gray-100 rounded-tl-xl mb-6 mt-8">
<docstring><name>generate_final_dict</name><anchor>lighteval.logging.evaluation_tracker.EvaluationTracker.generate_final_dict</anchor><source>https://github.com/huggingface/lighteval/blob/vr_980/src/lighteval/logging/evaluation_tracker.py#L363</source><parameters>[]</parameters><rettype>dict</rettype><retdesc>Dictionary containing all experiment information including config, results, versions, and summaries</retdesc></docstring>
Aggregates and returns all the logger's experiment information in a dictionary.
This function should be used to gather and display said information at the end of an evaluation run.
</div>
<div class="docstring border-l-2 border-t-2 pl-4 pt-3.5 border-gray-100 rounded-tl-xl mb-6 mt-8">
<docstring><name>push_to_hub</name><anchor>lighteval.logging.evaluation_tracker.EvaluationTracker.push_to_hub</anchor><source>https://github.com/huggingface/lighteval/blob/vr_980/src/lighteval/logging/evaluation_tracker.py#L387</source><parameters>[{"name": "date_id", "val": ": str"}, {"name": "details", "val": ": dict"}, {"name": "results_dict", "val": ": dict"}]</parameters></docstring>
Pushes the experiment details (all the model predictions for every step) to the Hub.
</div>
<div class="docstring border-l-2 border-t-2 pl-4 pt-3.5 border-gray-100 rounded-tl-xl mb-6 mt-8">
<docstring><name>recreate_metadata_card</name><anchor>lighteval.logging.evaluation_tracker.EvaluationTracker.recreate_metadata_card</anchor><source>https://github.com/huggingface/lighteval/blob/vr_980/src/lighteval/logging/evaluation_tracker.py#L454</source><parameters>[{"name": "repo_id", "val": ": str"}]</parameters><paramsdesc>- **repo_id** (str) -- Details dataset repository path on the hub (`org/dataset`)</paramsdesc><paramgroups>0</paramgroups></docstring>
Fully updates the details repository metadata card for the currently evaluated model.
</div>
<div class="docstring border-l-2 border-t-2 pl-4 pt-3.5 border-gray-100 rounded-tl-xl mb-6 mt-8">
<docstring><name>save</name><anchor>lighteval.logging.evaluation_tracker.EvaluationTracker.save</anchor><source>https://github.com/huggingface/lighteval/blob/vr_980/src/lighteval/logging/evaluation_tracker.py#L247</source><parameters>[]</parameters></docstring>
Saves the experiment information and results to files, and to the hub if requested.
</div></div>
## GeneralConfigLogger[[lighteval.logging.info_loggers.GeneralConfigLogger]]
<div class="docstring border-l-2 border-t-2 pl-4 pt-3.5 border-gray-100 rounded-tl-xl mb-6 mt-8">
<docstring><name>class lighteval.logging.info_loggers.GeneralConfigLogger</name><anchor>lighteval.logging.info_loggers.GeneralConfigLogger</anchor><source>https://github.com/huggingface/lighteval/blob/vr_980/src/lighteval/logging/info_loggers.py#L48</source><parameters>[]</parameters><paramsdesc>- **lighteval_sha** (str) -- Git commit SHA of lighteval used for evaluation, enabling exact version reproducibility.
Set to "?" if not in a git repository.
- **num_fewshot_seeds** (int) -- Number of random seeds used for few-shot example sampling.
- If <= 1: Single evaluation with seed=0
- If > 1: Multiple evaluations with different few-shot samplings (HELM-style)
- **max_samples** (int, optional) -- Maximum number of samples to evaluate per task.
Only used for debugging - truncates each task's dataset.
- **job_id** (int, optional) -- Slurm job ID if running on a cluster.
Used to cross-reference with scheduler logs.
- **start_time** (float) -- Unix timestamp when evaluation started.
Automatically set during logger initialization.
- **end_time** (float) -- Unix timestamp when evaluation completed.
Set by calling log_end_time().
- **total_evaluation_time_secondes** (str) -- Total runtime in seconds.
Calculated as end_time - start_time.
- **model_config** (ModelConfig) -- Complete model configuration settings.
Contains model architecture, tokenizer, and generation parameters.
- **model_name** (str) -- Name identifier for the evaluated model.
Extracted from model_config.</paramsdesc><paramgroups>0</paramgroups></docstring>
Tracks general configuration and runtime information for model evaluations.
This logger captures key configuration parameters, model details, and timing information
to ensure reproducibility and provide insights into the evaluation process.
<div class="docstring border-l-2 border-t-2 pl-4 pt-3.5 border-gray-100 rounded-tl-xl mb-6 mt-8">
<docstring><name>log_args_info</name><anchor>lighteval.logging.info_loggers.GeneralConfigLogger.log_args_info</anchor><source>https://github.com/huggingface/lighteval/blob/vr_980/src/lighteval/logging/info_loggers.py#L106</source><parameters>[{"name": "num_fewshot_seeds", "val": ": int"}, {"name": "max_samples", "val": ": int | None"}, {"name": "job_id", "val": ": str"}]</parameters><paramsdesc>- **num_fewshot_seeds** (int) -- number of few-shot seeds.
- **max_samples** (int | None) -- maximum number of samples, if None, use all the samples available.
- **job_id** (str) -- job ID, used to retrieve logs.</paramsdesc><paramgroups>0</paramgroups></docstring>
Logs the information about the arguments passed to the method.
</div>
<div class="docstring border-l-2 border-t-2 pl-4 pt-3.5 border-gray-100 rounded-tl-xl mb-6 mt-8">
<docstring><name>log_model_info</name><anchor>lighteval.logging.info_loggers.GeneralConfigLogger.log_model_info</anchor><source>https://github.com/huggingface/lighteval/blob/vr_980/src/lighteval/logging/info_loggers.py#L123</source><parameters>[{"name": "model_config", "val": ": ModelConfig"}]</parameters><paramsdesc>- **model_config** -- the model config used to initialize the model.</paramsdesc><paramgroups>0</paramgroups></docstring>
Logs the model information.
</div></div>
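The timing attributes documented for this logger follow a simple pattern: `start_time` is recorded at initialization, `end_time` when `log_end_time()` is called, and the total is their difference. A minimal sketch of that bookkeeping in plain Python (the `TimedRun` class is illustrative, not lighteval's own):

```python
import time

class TimedRun:
    """Illustrative sketch of the start/end timestamp bookkeeping above."""

    def __init__(self):
        # Recorded automatically at initialization, like start_time.
        self.start_time = time.time()
        self.end_time = None

    def log_end_time(self):
        # Mirrors how end_time is set by calling log_end_time().
        self.end_time = time.time()
        # Total runtime is the difference of the two timestamps,
        # stored as a string like total_evaluation_time_secondes.
        self.total_evaluation_time_secondes = str(self.end_time - self.start_time)

run = TimedRun()
run.log_end_time()
print(run.total_evaluation_time_secondes)
```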
## DetailsLogger[[lighteval.logging.info_loggers.DetailsLogger]]
<div class="docstring border-l-2 border-t-2 pl-4 pt-3.5 border-gray-100 rounded-tl-xl mb-6 mt-8">
<docstring><name>class lighteval.logging.info_loggers.DetailsLogger</name><anchor>lighteval.logging.info_loggers.DetailsLogger</anchor><source>https://github.com/huggingface/lighteval/blob/vr_980/src/lighteval/logging/info_loggers.py#L138</source><parameters>[{"name": "hashes", "val": ": dict = <factory>"}, {"name": "compiled_hashes", "val": ": dict = <factory>"}, {"name": "details", "val": ": dict = <factory>"}, {"name": "compiled_details", "val": ": dict = <factory>"}, {"name": "compiled_details_over_all_tasks", "val": ": DetailsLogger.CompiledDetailOverAllTasks = <factory>"}]</parameters><paramsdesc>- **hashes** (dict[str, list[`Hash`]]) -- Maps each task name to the list of all its samples' `Hash`.
- **compiled_hashes** (dict[str, `CompiledHash`]) -- Maps each task name to its `CompiledHash`, an aggregation of all the individual sample hashes.
- **details** (dict[str, list[`Detail`]]) -- Maps each task name to the list of its samples' details.
Example: winogrande: [sample1_details, sample2_details, ...]
- **compiled_details** (dict[str, `CompiledDetail`]) -- Maps each task name to its samples' compiled details.
- **compiled_details_over_all_tasks** (CompiledDetailOverAllTasks) -- Aggregated details over all the tasks.</paramsdesc><paramgroups>0</paramgroups></docstring>
Logger for the experiment details.
Stores and logs experiment information both at the task and at the sample level.
<div class="docstring border-l-2 border-t-2 pl-4 pt-3.5 border-gray-100 rounded-tl-xl mb-6 mt-8">
<docstring><name>aggregate</name><anchor>lighteval.logging.info_loggers.DetailsLogger.aggregate</anchor><source>https://github.com/huggingface/lighteval/blob/vr_980/src/lighteval/logging/info_loggers.py#L277</source><parameters>[]</parameters></docstring>
Hashes the details for each task and then for all tasks.
</div>
<div class="docstring border-l-2 border-t-2 pl-4 pt-3.5 border-gray-100 rounded-tl-xl mb-6 mt-8">
<docstring><name>log</name><anchor>lighteval.logging.info_loggers.DetailsLogger.log</anchor><source>https://github.com/huggingface/lighteval/blob/vr_980/src/lighteval/logging/info_loggers.py#L253</source><parameters>[{"name": "task_name", "val": ": str"}, {"name": "doc", "val": ": Doc"}, {"name": "model_response", "val": ": ModelResponse"}, {"name": "metrics", "val": ": dict"}]</parameters><paramsdesc>- **task_name** (str) -- Name of the current task of interest.
- **doc** (Doc) -- Current sample that we want to store.
- **model_response** (ModelResponse) -- Model outputs for the current sample.
- **metrics** (dict) -- Model scores for said sample on the current task's metrics.</paramsdesc><paramgroups>0</paramgroups></docstring>
Stores the relevant information for one sample of one task to the total list of samples stored in the DetailsLogger.
</div></div>
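Conceptually, `log` appends one detail record per sample under its task name, matching the `details` mapping described above (e.g. `winogrande: [sample1_details, sample2_details, ...]`). A minimal sketch of that accumulation with a plain dictionary (the record fields and sample values are illustrative, not lighteval's `Detail` structure):

```python
from collections import defaultdict

# Maps each task name to the list of its samples' details,
# mirroring the `details` attribute documented above.
details: dict[str, list[dict]] = defaultdict(list)

def log_detail(task_name: str, doc: str, prediction: str, metrics: dict) -> None:
    # Store the relevant information for one sample of one task.
    details[task_name].append(
        {"doc": doc, "prediction": prediction, "metrics": metrics}
    )

log_detail("winogrande", doc="sample 1", prediction="trophy", metrics={"acc": 1.0})
log_detail("winogrande", doc="sample 2", prediction="suitcase", metrics={"acc": 0.0})
print(len(details["winogrande"]))  # → 2
```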
## MetricsLogger[[lighteval.logging.info_loggers.MetricsLogger]]
<div class="docstring border-l-2 border-t-2 pl-4 pt-3.5 border-gray-100 rounded-tl-xl mb-6 mt-8">
<docstring><name>class lighteval.logging.info_loggers.MetricsLogger</name><anchor>lighteval.logging.info_loggers.MetricsLogger</anchor><source>https://github.com/huggingface/lighteval/blob/vr_980/src/lighteval/logging/info_loggers.py#L309</source><parameters>[{"name": "metrics_values", "val": ": dict = <factory>"}, {"name": "metric_aggregated", "val": ": dict = <factory>"}]</parameters><paramsdesc>- **metrics_values** (dict[str, dict[str, list[float]]]) -- Maps each task to its dictionary of metrics to scores for all the examples of the task.
Example: {"winogrande|winogrande_xl": {"accuracy": [0.5, 0.5, 0.5, 0.5, 0.5, 0.5]}}
- **metric_aggregated** (dict[str, dict[str, float]]) -- Maps each task to its dictionary of metrics to aggregated scores over all the examples of the task.
Example: {"winogrande|winogrande_xl": {"accuracy": 0.5}}</paramsdesc><paramgroups>0</paramgroups></docstring>
Logs the actual scores for each metric of each task.
<div class="docstring border-l-2 border-t-2 pl-4 pt-3.5 border-gray-100 rounded-tl-xl mb-6 mt-8">
<docstring><name>aggregate</name><anchor>lighteval.logging.info_loggers.MetricsLogger.aggregate</anchor><source>https://github.com/huggingface/lighteval/blob/vr_980/src/lighteval/logging/info_loggers.py#L330</source><parameters>[{"name": "task_dict", "val": ": dict"}, {"name": "bootstrap_iters", "val": ": int = 1000"}]</parameters><paramsdesc>- **task_dict** (dict[str, LightevalTask]) -- used to determine what aggregation function to use for each metric.
- **bootstrap_iters** (int, optional) -- Number of runs used to run the statistical bootstrap. Defaults to 1000.</paramsdesc><paramgroups>0</paramgroups></docstring>
Aggregate the metrics for each task and then for all tasks.
</div></div>
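The `aggregate` step reduces each task's per-example score lists (the `metrics_values` shape documented above) to one number per metric, and `bootstrap_iters` controls how many resamples are used when a standard error is estimated by statistical bootstrap. A minimal sketch using a mean aggregation and a resampling loop (the function names are illustrative; lighteval's actual aggregation functions come from each task's metric definitions):

```python
import random
from statistics import mean

# Per-example scores, in the shape documented for `metrics_values`.
metrics_values = {"winogrande|winogrande_xl": {"accuracy": [0.5, 0.5, 0.5, 0.5, 0.5, 0.5]}}

def bootstrap_stderr(scores, iters=1000, seed=0):
    # Re-estimate the mean on `iters` resamples drawn with replacement,
    # then report the spread of those estimates as a standard error.
    rng = random.Random(seed)
    estimates = [mean(rng.choices(scores, k=len(scores))) for _ in range(iters)]
    mu = mean(estimates)
    return (sum((e - mu) ** 2 for e in estimates) / (iters - 1)) ** 0.5

# Aggregate each metric's score list with its aggregation function (mean here),
# producing the `metric_aggregated` shape: accuracy aggregates to 0.5.
metric_aggregated = {
    task: {name: mean(scores) for name, scores in metrics.items()}
    for task, metrics in metrics_values.items()
}
print(metric_aggregated)
```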
## VersionsLogger[[lighteval.logging.info_loggers.VersionsLogger]]
<div class="docstring border-l-2 border-t-2 pl-4 pt-3.5 border-gray-100 rounded-tl-xl mb-6 mt-8">
<docstring><name>class lighteval.logging.info_loggers.VersionsLogger</name><anchor>lighteval.logging.info_loggers.VersionsLogger</anchor><source>https://github.com/huggingface/lighteval/blob/vr_980/src/lighteval/logging/info_loggers.py#L406</source><parameters>[{"name": "versions", "val": ": dict = <factory>"}]</parameters><paramsdesc>- **versions** (dict[str, int]) -- Maps each task name to its task version.</paramsdesc><paramgroups>0</paramgroups></docstring>
Logger of the tasks versions.
Tasks can have a version number/date, which indicates the precise metric definition and dataset version used for an evaluation.
</div>
## TaskConfigLogger[[lighteval.logging.info_loggers.TaskConfigLogger]]
<div class="docstring border-l-2 border-t-2 pl-4 pt-3.5 border-gray-100 rounded-tl-xl mb-6 mt-8">
<docstring><name>class lighteval.logging.info_loggers.TaskConfigLogger</name><anchor>lighteval.logging.info_loggers.TaskConfigLogger</anchor><source>https://github.com/huggingface/lighteval/blob/vr_980/src/lighteval/logging/info_loggers.py#L425</source><parameters>[{"name": "tasks_configs", "val": ": dict = <factory>"}]</parameters><paramsdesc>- **tasks_configs** (dict[str, LightevalTaskConfig]) -- Maps each task to its associated `LightevalTaskConfig`</paramsdesc><paramgroups>0</paramgroups></docstring>
Logs the different parameters of the current `LightevalTask` of interest.
</div>
<EditOnGithub source="https://github.com/huggingface/lighteval/blob/main/docs/source/package_reference/logging.mdx" />