Image Augmentation
==================

Image Augmentation is a data augmentation method that generates more training data
from the existing training samples. Image Augmentation is especially useful in domains
where training data is limited or expensive to obtain, such as in biomedical applications.

.. image:: https://github.com/kornia/data/raw/main/girona_aug.png
   :align: center

Learn more: `https://paperswithcode.com/task/image-augmentation <https://paperswithcode.com/task/image-augmentation>`_

Kornia Augmentations
--------------------

Kornia provides differentiable, GPU-accelerated image data augmentation through the module
`kornia.augmentation <https://kornia.readthedocs.io/en/latest/augmentation.html>`_,
whose operators can be used directly with
`torch.nn.Sequential <https://pytorch.org/docs/stable/generated/torch.nn.Sequential.html?highlight=sequential#torch.nn.Sequential>`_
and with other advanced containers such as
:py:class:`~kornia.augmentation.container.AugmentationSequential`,
:py:class:`~kornia.augmentation.container.ImageSequential`,
:py:class:`~kornia.augmentation.container.PatchSequential` and
:py:class:`~kornia.augmentation.container.VideoSequential`.

Our augmentation package is heavily inspired by the torchvision augmentation API, although our intention is not to replace it.
Kornia aligns more closely with OpenCV functionality: it enforces floating-point operations to guarantee better precision,
avoids any float -> uint8 conversions, and adds on-device acceleration.

To migrate between kornia and torchvision, please check out the `Colab: Kornia Playground <https://colab.research.google.com/drive/1T20UNAG4SdlE2n2wstuhiewve5Q81VpS#revisionId=0B4unZG1uMc-WR3NVeTBDcmRwN0NxcGNNVlUwUldPMVprb1dJPQ>`_.

.. code-block:: python

    import kornia.augmentation as K
    import torch.nn as nn

    transform = nn.Sequential(
        K.RandomAffine(360),
        K.ColorJitter(0.2, 0.3, 0.2, 0.3)
    )

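The floating-point precision argument above can be illustrated with a small torch-only sketch (no kornia calls; the intensity values are purely illustrative):

```python
import torch

# Two adjacent uint8 intensities become indistinguishable after a scaling
# op that rounds back to uint8, while a float pipeline keeps them apart.
img_u8 = torch.tensor([100, 101], dtype=torch.uint8)
img_f = img_u8.float() / 255.0                       # float representation in [0, 1]

scaled_u8 = (img_u8.float() * 0.4).to(torch.uint8)   # truncates both to 40
scaled_f = img_f * 0.4                               # distinct float values survive

assert scaled_u8[0] == scaled_u8[1]   # information lost in uint8
assert scaled_f[0] != scaled_f[1]     # information preserved in float
```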
Best Practices 1: Image Augmentation
++++++++++++++++++++++++++++++++++++

Kornia augmentation provides a simple on-device augmentation framework with support for various conveniences
(e.g. returning the transformation matrix, inverting a geometric transform). We therefore provide the advanced
augmentation container :py:class:`~kornia.augmentation.container.AugmentationSequential` to ease the pain of
building augmentation pipelines. This API also provides predefined routines for automating the processing of
masks, bounding boxes, and keypoints.

.. code-block:: python

    import kornia.augmentation as K

    aug = K.AugmentationSequential(
        K.ColorJitter(0.1, 0.1, 0.1, 0.1, p=1.0),
        K.RandomAffine(360, [0.1, 0.1], [0.7, 1.2], [30., 50.], p=1.0),
        K.RandomPerspective(0.5, p=1.0),
        data_keys=["input", "bbox", "keypoints", "mask"],  # Just to define the future input here.
        return_transform=False,
        same_on_batch=False,
    )

    # Forward the operation
    out_tensors = aug(img_tensor, bbox, keypoints, mask)

    # Inverse the operation
    out_tensors_inv = aug.inverse(*out_tensors)

.. image:: https://discuss.pytorch.org/uploads/default/optimized/3X/2/4/24bb0f4520f547d3a321440293c1d44921ecadf8_2_690x119.jpeg

From left to right: the original image, the transformed image, and the inverse-transformed image.

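The ``inverse`` call is possible because each geometric augmentation keeps track of its transformation matrix. A torch-only sketch of the underlying idea (the matrix below is hypothetical, not one produced by kornia):

```python
import torch

# Geometric augmentations are backed by 3x3 homography matrices; inverting
# the pipeline amounts to applying the inverse matrix. Hypothetical example:
# a 2x scale followed by a (5, 5) translation.
M = torch.tensor([[2.0, 0.0, 5.0],
                  [0.0, 2.0, 5.0],
                  [0.0, 0.0, 1.0]])

pts = torch.tensor([[10.0], [20.0], [1.0]])   # a keypoint in homogeneous coords
warped = M @ pts                              # forward transform
restored = torch.linalg.inv(M) @ warped       # inverse recovers the original

assert torch.allclose(restored, pts)
```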
Best Practices 2: Video Augmentation
++++++++++++++++++++++++++++++++++++

Video data is a special case of 3D volumetric data that contains both spatial and temporal information, and is
often referred to as 2.5D rather than 3D. In most applications, augmenting video data requires the temporal
dimension to stay static, so that the same augmentation is applied to each frame.
:py:class:`~kornia.augmentation.container.VideoSequential` handles exactly this, with the same usage as ``nn.Sequential``.
Currently, :py:class:`~kornia.augmentation.container.VideoSequential` supports the data formats :math:`(B, C, T, H, W)` and :math:`(B, T, C, H, W)`.

.. code-block:: python

    import kornia.augmentation as K

    transform = K.VideoSequential(
        K.RandomAffine(360),
        K.RandomGrayscale(p=0.5),
        K.RandomAffine(360, p=0.5),
        data_format="BCTHW",
        same_on_frame=True,
    )

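The same-on-frame behaviour can be sketched without kornia: sample one set of random parameters per clip and broadcast it over the temporal dimension (a hypothetical random brightness factor stands in for a real augmentation):

```python
import torch

# A (B, T, C, H, W) clip can be folded into the batch as (B*T, C, H, W);
# sampling one random parameter per clip and broadcasting it over T frames
# gives the same augmentation for every frame of a clip.
B, T, C, H, W = 2, 4, 3, 8, 8
clip = torch.rand(B, T, C, H, W)

factor = torch.rand(B, 1, 1, 1, 1)           # one brightness factor per clip
out = (clip * factor).view(B * T, C, H, W)   # fold frames into the batch
out = out.view(B, T, C, H, W)                # restore the clip layout

# Every frame of a clip received the same factor:
f0 = factor[0].reshape(())
assert torch.allclose(out[0, 0], clip[0, 0] * f0)
assert torch.allclose(out[0, 3], clip[0, 3] * f0)
```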
.. image:: https://user-images.githubusercontent.com/17788259/101993516-4625ca80-3c89-11eb-843e-0b87dca6e2b8.png

Customization
+++++++++++++

Kornia augmentation implementations have two additional parameters compared to torchvision:
``return_transform`` and ``same_on_batch``. The former provides the ability to undo a geometric
transformation, while the latter controls the randomness across a batched transformation.
To enable these behaviours, simply set the corresponding flag to ``True``.

.. code-block:: python

    import kornia.augmentation as K
    import torch.nn as nn

    class MyAugmentationPipeline(nn.Module):
        def __init__(self) -> None:
            super().__init__()
            self.aff = K.RandomAffine(
                360, return_transform=True, same_on_batch=True
            )
            self.jit = K.ColorJitter(0.2, 0.3, 0.2, 0.3, same_on_batch=True)

        def forward(self, input):
            input, transform = self.aff(input)
            input, transform = self.jit((input, transform))
            return input, transform

Example for semantic segmentation using low-level randomness control:

.. code-block:: python

    import kornia.augmentation as K
    import torch.nn as nn

    class MyAugmentationPipeline(nn.Module):
        def __init__(self) -> None:
            super().__init__()
            self.aff = K.RandomAffine(360)
            self.jit = K.ColorJitter(0.2, 0.3, 0.2, 0.3)

        def forward(self, input, mask):
            assert input.shape == mask.shape, (
                f"Input shape should be consistent with mask shape, "
                f"while got {input.shape}, {mask.shape}"
            )

            # Sample the random parameters once, then reuse them for both
            # tensors so that image and mask stay aligned.
            aff_params = self.aff.forward_parameters(input.shape)
            input = self.aff(input, aff_params)
            mask = self.aff(mask, aff_params)

            jit_params = self.jit.forward_parameters(input.shape)
            input = self.jit(input, jit_params)
            mask = self.jit(mask, jit_params)
            return input, mask
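The ``forward_parameters`` pattern boils down to: sample the random parameters once, then apply them to every tensor that must stay aligned. A torch-only sketch with a hypothetical random horizontal flip shared by an image and its mask:

```python
import torch

torch.manual_seed(0)
img = torch.rand(1, 3, 4, 4)
mask = (img[:, :1] > 0.5).float()   # a hypothetical binary mask tied to img

flip = bool(torch.rand(()) < 0.5)   # the "parameters": sampled exactly once
if flip:
    img = img.flip(-1)
    mask = mask.flip(-1)

# Recomputing the mask from the (possibly flipped) image matches exactly,
# because image and mask shared the same random decision:
assert torch.equal(mask, (img[:, :1] > 0.5).float())
```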