| | Adding Tasks |
| | |
| |
|
| | This is a tutorial on adding new machine learning tasks using the ``lavis.tasks`` module. |
| |
|
| | The LAVIS library includes a standard task module that centralizes the model training and evaluation procedures of machine learning tasks. |
| | The ``lavis.tasks`` module is designed so that new tasks can be added and integrated, with room to customize the training and testing procedures. |
| | In this tutorial, we walk through the steps to add a new task to LAVIS, using `video-grounded dialogue tasks <https://arxiv.org/pdf/1901.09107.pdf>`_ as the example. |
| |
|
| | Base Task ``lavis.tasks.base_task`` |
| | ******************************************************************************** |
| |
|
| | Note that any new task definition should inherit from the base task class ``BaseTask``: |
| |
|
| | .. code-block:: python |
| |
|
| |     import logging |
| |     import os |
| | |
| |     import torch.distributed as dist |
| |     from lavis.common.dist_utils import get_rank, get_world_size, is_main_process |
| |     from lavis.common.logger import MetricLogger, SmoothedValue |
| |     from lavis.common.registry import registry |
| |     from lavis.datasets.data_utils import prepare_sample |
| | |
| |     class BaseTask: |
| |         def __init__(self, **kwargs): |
| |             super().__init__() |
| | |
| |             self.inst_id_key = "instance_id" |
| | |
| |         @classmethod |
| |         def setup_task(cls, **kwargs): |
| |             return cls() |
| | |
| |         def build_model(self, cfg): |
| |             model_config = cfg.model_cfg |
| | |
| |             model_cls = registry.get_model_class(model_config.arch) |
| |             return model_cls.from_config(model_config) |
| | |
| |         def build_datasets(self, cfg): |
| |             """ |
| |             Build a dictionary of datasets, keyed by split 'train', 'valid', 'test'. |
| |             Download the dataset and annotations automatically if they do not exist. |
| | |
| |             Args: |
| |                 cfg (common.config.Config): Configuration object providing ``datasets_cfg``. |
| | |
| |             Returns: |
| |                 dict: Dictionary of torch.utils.data.Dataset objects by split. |
| |             """ |
| | |
| |             datasets = dict() |
| | |
| |             datasets_config = cfg.datasets_cfg |
| | |
| |             assert len(datasets_config) > 0, "At least one dataset has to be specified." |
| | |
| |             for name in datasets_config: |
| |                 dataset_config = datasets_config[name] |
| | |
| |                 builder = registry.get_builder_class(name)(dataset_config) |
| |                 dataset = builder.build_datasets() |
| | |
| |                 datasets[name] = dataset |
| | |
| |             return datasets |
| | |
| |         def train_step(self, model, samples): |
| |             loss = model(samples)["loss"] |
| |             return loss |
| | |
| |         ... |
| |
|
| | In this base task, we already declare and standardize many common methods such as ``train_step``, ``build_model``, and ``build_datasets``. |
| | Inheriting from this base class keeps these operations consistent across all task classes. |
| | We recommend that users not change the implementation of the base task class, since any change affects all existing task subclasses. |
| |
|
| | Dialogue Task ``lavis.tasks.dialogue`` |
| | ******************************************************************************** |
| |
|
| | In this step, we define a new task class for video-grounded dialogues, e.g. under ``lavis.tasks.dialogue``. |
| | For instance, we define a new task class ``DialogueTask`` that inherits from the base task class ``BaseTask``. |
| |
|
| | .. code-block:: python |
| |
|
| |     import json |
| |     import os |
| | |
| |     from lavis.common.dist_utils import main_process |
| |     from lavis.common.logger import MetricLogger |
| |     from lavis.common.registry import registry |
| |     from lavis.tasks.base_task import BaseTask |
| |     from lavis.datasets.data_utils import prepare_sample |
| | |
| |     import numpy as np |
| | |
| |     @registry.register_task("dialogue") |
| |     class DialogueTask(BaseTask): |
| |         def __init__(self, num_beams, max_len, min_len, evaluate, report_metric=True): |
| |             super().__init__() |
| | |
| |             self.num_beams = num_beams |
| |             self.max_len = max_len |
| |             self.min_len = min_len |
| |             self.evaluate = evaluate |
| | |
| |             self.report_metric = report_metric |
| | |
| |         @classmethod |
| |         def setup_task(cls, cfg): |
| |             run_cfg = cfg.run_cfg |
| | |
| |             num_beams = run_cfg.num_beams |
| |             max_len = run_cfg.max_len |
| |             min_len = run_cfg.min_len |
| |             evaluate = run_cfg.evaluate |
| | |
| |             report_metric = run_cfg.get("report_metric", True) |
| | |
| |             return cls( |
| |                 num_beams=num_beams, |
| |                 max_len=max_len, |
| |                 min_len=min_len, |
| |                 evaluate=evaluate, |
| |                 report_metric=report_metric, |
| |             ) |
| | |
| |         def valid_step(self, model, samples): |
| |             results = [] |
| |             loss = model(samples)["loss"].item() |
| | |
| |             return [loss] |
| |         ... |
| |
|
| | For any new task, we advise users to carefully review the methods implemented in ``BaseTask`` and decide which of them should be overridden. |
| | For instance, the base task class already contains a standard implementation of the model training steps that are common among machine learning tasks. |
| | The main methods that each task should customize are ``valid_step`` and ``evaluation``. |
| | These operations are not fully implemented in the base task class because evaluation procedures differ among machine learning tasks. |
| | Another method to consider is ``setup_task``. |
| | This method receives the configuration and uses it to set the task-specific parameters needed to initialize a task instance. |
| |
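The override pattern described above can be sketched without LAVIS installed. This is a minimal illustration under stated assumptions: ``ToyBaseTask``, ``ToyDialogueTask``, ``Scalar``, ``toy_model`` and the ``summarize`` helper are hypothetical stand-ins invented for this example, not part of the library. The point is only that ``valid_step`` returns a per-batch list, so results from successive batches can be concatenated and reduced afterwards.

```python
from statistics import mean


class Scalar:
    """Stand-in for a zero-dimensional tensor; only ``item()`` is provided."""

    def __init__(self, value):
        self.value = float(value)

    def item(self):
        return self.value


class ToyBaseTask:
    """Hypothetical stand-in for ``BaseTask``; not the LAVIS class."""

    def train_step(self, model, samples):
        return model(samples)["loss"]


class ToyDialogueTask(ToyBaseTask):
    def valid_step(self, model, samples):
        # Return a list so that per-batch results can be concatenated.
        return [model(samples)["loss"].item()]

    def summarize(self, step_results):
        # Hypothetical helper: reduce the per-batch losses to one number.
        return {"avg_loss": mean(step_results)}


def toy_model(samples):
    # Pretend the loss equals the batch size, purely for illustration.
    return {"loss": Scalar(len(samples["text"]))}


task = ToyDialogueTask()
batches = [{"text": ["a", "b"]}, {"text": ["c", "d", "e", "f"]}]

collected = []
for batch in batches:
    collected.extend(task.valid_step(toy_model, batch))

print(task.summarize(collected))  # {'avg_loss': 3.0}
```

The training step is inherited unchanged, while the validation step and the aggregation are the task-specific parts.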
|
| | Registering New Task ``lavis.tasks.__init__`` |
| | ******************************************************************************** |
| |
|
| | Any new task must be officially registered as part of the ``lavis.tasks`` module. For instance, to add a new task for video-grounded dialogues, we can modify the ``__init__.py`` as follows: |
| |
|
| | .. code-block:: python |
| |
|
| |     from lavis.tasks.dialogue import DialogueTask |
| | |
| |     ... |
| |     __all__ = [ |
| |         ... |
| |         "DialogueTask" |
| |     ] |
| |
|
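To see why the import in ``__init__.py`` matters, recall that a class decorator runs when its module is imported. The sketch below shows the general name-to-class registry pattern that a decorator such as ``register_task`` relies on. ``ToyRegistry`` and its internals are assumptions made for illustration; the actual LAVIS registry may be implemented differently.

```python
class ToyRegistry:
    """Hypothetical minimal registry mapping task names to task classes."""

    def __init__(self):
        self._tasks = {}

    def register_task(self, name):
        def wrap(task_cls):
            # Refuse duplicate names so that one task cannot silently shadow another.
            if name in self._tasks:
                raise KeyError(f"Task '{name}' is already registered.")
            self._tasks[name] = task_cls
            return task_cls

        return wrap

    def get_task_class(self, name):
        # Returns None when no task was registered under this name.
        return self._tasks.get(name)


toy_registry = ToyRegistry()


@toy_registry.register_task("dialogue")
class ToyDialogueTask:
    pass


# The decorator has already run, so the name now resolves to the class.
print(toy_registry.get_task_class("dialogue") is ToyDialogueTask)  # True
```

If the module that defines the decorated class is never imported, the decorator never runs and the name cannot be resolved.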
| | Assigning Task |
| | *************** |
| |
|
| | As the task class example above shows, each task class defines a ``setup_task`` method. |
| | This method reads the configuration and passes task-specific parameters, e.g. ``num_beams`` (used for beam search in generative tasks at inference time), to the constructor so that the task is initialized properly. |
| | To assign a task, we specify the name under which the task class was registered in a configuration file. |
| | For instance, the following should be specified in a configuration file, e.g. ``dialogue_avsd_ft.yaml``: |
| |
|
| | .. code-block:: yaml |
| |
|
| |     run: |
| |       task: dialogue |
| | |
| | |
| |       ... |
| | |
| |       max_len: 20 |
| |       min_len: 5 |
| |       num_beams: 3 |
| |       ... |
| | |
| | Subsequently, any process (e.g. training) should load this configuration file to be assigned the correct task. |
| |
|
| | .. code-block:: sh |
| |
|
| |     python train.py --cfg-path dialogue_avsd_ft.yaml |
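
As a rough illustration of how the values under ``run`` reach the task constructor, the sketch below mirrors the ``setup_task`` logic with a plain dictionary in place of the parsed configuration. ``RunConfig`` and ``ToyDialogueTask`` are hypothetical stand-ins, and ``evaluate=False`` is an assumed value, since the YAML excerpt above elides that key.

```python
class RunConfig(dict):
    """Hypothetical stand-in for the ``run`` section: a dict with attribute access."""

    def __getattr__(self, name):
        try:
            return self[name]
        except KeyError as err:
            raise AttributeError(name) from err


class ToyDialogueTask:
    def __init__(self, num_beams, max_len, min_len, evaluate, report_metric=True):
        self.num_beams = num_beams
        self.max_len = max_len
        self.min_len = min_len
        self.evaluate = evaluate
        self.report_metric = report_metric

    @classmethod
    def setup_task(cls, run_cfg):
        # Required keys are read as attributes; the optional key falls back to a default.
        return cls(
            num_beams=run_cfg.num_beams,
            max_len=run_cfg.max_len,
            min_len=run_cfg.min_len,
            evaluate=run_cfg.evaluate,
            report_metric=run_cfg.get("report_metric", True),
        )


# Values taken from the YAML excerpt; ``evaluate`` is an assumption for this sketch.
run_cfg = RunConfig(task="dialogue", max_len=20, min_len=5, num_beams=3, evaluate=False)
task = ToyDialogueTask.setup_task(run_cfg)

print(task.num_beams, task.max_len, task.min_len, task.report_metric)  # 3 20 5 True
```

A required key that is missing fails early with an ``AttributeError``, while an optional key such as ``report_metric`` falls back to its default.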