| | Image Augmentation |
| | ================== |
| |
|
| | Image Augmentation is a data augmentation method that generates more training data |
| | from the existing training samples. Image Augmentation is especially useful in domains |
| | where training data is limited or expensive to obtain like in biomedical applications. |
| |
|
| | .. image:: https://github.com/kornia/data/raw/main/girona_aug.png |
| | :align: center |
| |
|
| | Learn more: `https://paperswithcode.com/task/image-registration <https://paperswithcode.com/task/image-augmentation>`_ |
| |
|
| | Kornia Augmentations |
| | -------------------- |
| |
|
| | Kornia leverages differentiable and GPU image data augmentation through the module `kornia.augmentation <https://kornia.readthedocs.io/en/latest/augmentation.html>`_ |
| | by implementing the functionality to be easily used with `torch.nn.Sequential <https://pytorch.org/docs/stable/generated/torch.nn.Sequential.html?highlight=sequential#torch.nn.Sequential>`_ |
| | and other advanced containers such as |
| | :py:class:`~kornia.augmentation.container.AugmentationSequential`, |
| | :py:class:`~kornia.augmentation.container.ImageSequential`, |
| | :py:class:`~kornia.augmentation.container.PatchSequential` and |
| | :py:class:`~kornia.augmentation.container.VideoSequential`. |
| |
|
| | Our augmentations package is highly inspired by torchvision augmentation APIs while our intention is to not replace it. |
| | Kornia is a library that aligns better to OpenCV functionalities enforcing floating operators to guarantees a better precision |
| | without any float -> uint8 conversions plus on device acceleration. |
| |
|
| | However, we provide the following guide to migrate kornia <-> torchvision. Please, checkout the `Colab: Kornia Playground <https://colab.research.google.com/drive/1T20UNAG4SdlE2n2wstuhiewve5Q81VpS#revisionId=0B4unZG1uMc-WR3NVeTBDcmRwN0NxcGNNVlUwUldPMVprb1dJPQ>`_. |
| |
|
| | .. code-block:: python |
| |
|
| | import kornia.augmentation as K |
| | import torch.nn as nn |
| |
|
| | transform = nn.Sequential( |
| | K.RandomAffine(360), |
| | K.ColorJitter(0.2, 0.3, 0.2, 0.3) |
| | ) |
| |
|
| |
|
| | Best Practices 1: Image Augmentation |
| | ++++++++++++++++++++++++++++++++++++ |
| |
|
| | Kornia augmentations provides simple on-device augmentation framework with the support of various syntax sugars |
| | (e.g. return transformation matrix, inverse geometric transform). Therefore, we provide advanced augmentation |
| | container :py:class:`~kornia.augmentation.container.AugmentationSequential` to ease the pain of building augmenation pipelines. This API would also provide predefined routines |
| | for automating the processing of masks, bounding boxes, and keypoints. |
| |
|
| | .. code-block:: python |
| |
|
| | import kornia.augmentation as K |
| |
|
| | aug = K.AugmentationSequential( |
| | K.ColorJitter(0.1, 0.1, 0.1, 0.1, p=1.0), |
| | K.RandomAffine(360, [0.1, 0.1], [0.7, 1.2], [30., 50.], p=1.0), |
| | K.RandomPerspective(0.5, p=1.0), |
| | data_keys=["input", "bbox", "keypoints", "mask"], # Just to define the future input here. |
| | return_transform=False, |
| | same_on_batch=False, |
| | ) |
| | # forward the operation |
| | out_tensors = aug(img_tensor, bbox, keypoints, mask) |
| | # Inverse the operation |
| | out_tensor_inv = aug.inverse(*out_tensor) |
| |
|
| | .. image:: https://discuss.pytorch.org/uploads/default/optimized/3X/2/4/24bb0f4520f547d3a321440293c1d44921ecadf8_2_690x119.jpeg |
| |
|
| | From left to right: the original image, the transformed image, and the inversed image. |
| |
|
| |
|
| | Best Practices 2: Video Augmentation |
| | ++++++++++++++++++++++++++++++++++++ |
| |
|
| | Video data is a special case of 3D volumetric data that contains both spatial and temporal information, which can be referred as 2.5D than 3D. |
| | In most applications, augmenting video data requires a static temporal dimension to have the same augmentations are performed for each frame. |
| | Thus, :py:class:`~kornia.augmentation.container.VideoSequential` can be used to do such trick as same as `nn.Sequential`. |
| | Currently, :py:class:`~kornia.augmentation.container.VideoSequential` supports data format like :math:`(B, C, T, H, W)` and :math:`(B, T, C, H, W)`. |
| |
|
| | .. code-block:: python |
| |
|
| | import kornia.augmentation as K |
| |
|
| | transform = K.VideoSequential( |
| | K.RandomAffine(360), |
| | K.RandomGrayscale(p=0.5), |
| | K.RandomAffine(p=0.5) |
| | data_format="BCTHW", |
| | same_on_frame=True |
| | ) |
| |
|
| | .. image:: https://user-images.githubusercontent.com/17788259/101993516-4625ca80-3c89-11eb-843e-0b87dca6e2b8.png |
| |
|
| |
|
| | Customization |
| | +++++++++++++ |
| |
|
| | Kornia augmentation implementations have two additional parameters compare to TorchVision, |
| | ``return_transform`` and ``same_on_batch``. The former provides the ability of undoing one geometry |
| | transformation while the latter can be used to control the randomness for a batched transformation. |
| | To enable those behaviour, you may simply set the flags to True. |
| |
|
| | .. code-block:: python |
| |
|
| | import kornia.augmentation as K |
| |
|
| | class MyAugmentationPipeline(nn.Module): |
| | def __init__(self) -> None: |
| | super(MyAugmentationPipeline, self).__init__() |
| | self.aff = K.RandomAffine( |
| | 360, return_transform=True, same_on_batch=True |
| | ) |
| | self.jit = K.ColorJitter(0.2, 0.3, 0.2, 0.3, same_on_batch=True) |
| |
|
| | def forward(self, input): |
| | input, transform = self.aff(input) |
| | input, transform = self.jit((input, transform)) |
| | return input, transform |
| |
|
| | Example for semantic segmentation using low-level randomness control: |
| |
|
| | .. code-block:: python |
| |
|
| | import kornia.augmentation as K |
| |
|
| | class MyAugmentationPipeline(nn.Module): |
| | def __init__(self) -> None: |
| | super(MyAugmentationPipeline, self).__init__() |
| | self.aff = K.RandomAffine(360) |
| | self.jit = K.ColorJitter(0.2, 0.3, 0.2, 0.3) |
| |
|
| | def forward(self, input, mask): |
| | assert input.shape == mask.shape, |
| | f"Input shape should be consistent with mask shape, " |
| | f"while got {input.shape}, {mask.shape}" |
| |
|
| | aff_params = self.aff.forward_parameters(input.shape) |
| | input = self.aff(input, aff_params) |
| | mask = self.aff(mask, aff_params) |
| |
|
| | jit_params = self.jit.forward_parameters(input.shape) |
| | input = self.jit(input, jit_params) |
| | mask = self.jit(mask, jit_params) |
| | return input, mask |
| |
|