| # Tutorial 1: Learn about Configs | |
| We incorporate modular and inheritance design into our config system, which makes it convenient to conduct various experiments. | |
| If you wish to inspect the config file, you may run `python tools/misc/print_config.py /PATH/TO/CONFIG` to see the complete config. | |
| ## Modify config through script arguments | |
| When submitting jobs using `tools/train.py` or `tools/test.py`, you may specify `--cfg-options` to modify the config in place. | |
| - Update config keys of dict chains. | |
| The config options can be specified following the order of the dict keys in the original config. | |
| For example, `--cfg-options model.backbone.norm_eval=False` changes all BN modules in the model backbone to `train` mode. | |
| - Update keys inside a list of configs. | |
| Some config dicts are composed as a list in your config. For example, the training pipeline `data.train.pipeline` is normally a list | |
| e.g. `[dict(type='LoadImageFromFile'), ...]`. If you want to change `'LoadImageFromFile'` to `'LoadImageFromWebcam'` in the pipeline, | |
| you may specify `--cfg-options data.train.pipeline.0.type=LoadImageFromWebcam`. | |
| - Update values of list/tuples. | |
| Sometimes the value to be updated is a list or a tuple. For example, the config file normally sets `workflow=[('train', 1)]`. If you want to | |
| change this key, you may specify `--cfg-options workflow="[(train,1),(val,1)]"`. Note that the quotation marks `"` are necessary to | |
| support list/tuple data types, and that **NO** white space is allowed inside the quotation marks in the specified value. | |
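The dotted-key addressing described above can be sketched in plain Python. This is a minimal illustration, not the config system's actual implementation: `parse_value` and `apply_override` are hypothetical helper names, and the sketch does not handle the unquoted list/tuple syntax such as `[(train,1),(val,1)]`.

```python
import ast


def parse_value(text):
    """Turn a command-line string into a Python value where possible."""
    try:
        return ast.literal_eval(text)
    except (ValueError, SyntaxError):
        # Not a Python literal (e.g. LoadImageFromWebcam): keep the raw string.
        return text


def apply_override(cfg, dotted_key, value):
    """Set a nested entry in place; numeric key parts index into lists."""
    *parents, last = dotted_key.split('.')
    node = cfg
    for part in parents:
        node = node[int(part)] if isinstance(node, list) else node[part]
    if isinstance(node, list):
        node[int(last)] = value
    else:
        node[last] = value


# A tiny stand-in for a loaded config (hypothetical, heavily shortened).
cfg = dict(
    model=dict(backbone=dict(norm_eval=True)),
    data=dict(train=dict(pipeline=[dict(type='LoadImageFromFile')])),
)

apply_override(cfg, 'model.backbone.norm_eval', parse_value('False'))
apply_override(cfg, 'data.train.pipeline.0.type',
               parse_value('LoadImageFromWebcam'))
```

The key `data.train.pipeline.0.type` walks two dicts, then index `0` of the pipeline list, and finally sets `type` on that list entry.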
| ## Config File Structure | |
| There are 4 basic component types under `configs/_base_`: dataset, model, schedule, and default_runtime. | |
| Many methods, such as Faster R-CNN, Mask R-CNN, Cascade R-CNN, RPN, and SSD, can be easily constructed with one of each. | |
| The configs that are composed by components from `_base_` are called _primitive_. | |
| For all configs under the same folder, it is recommended to have only **one** _primitive_ config. All other configs should inherit from the _primitive_ config. In this way, the maximum inheritance level is 3. | |
| For easy understanding, we recommend that contributors inherit from existing methods. | |
| For example, if some modification is made based on Faster R-CNN, users may first inherit the basic Faster R-CNN structure by specifying `_base_ = ../faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py`, then modify the necessary fields in the config files. | |
| If you are building an entirely new method that does not share the structure with any of the existing methods, you may create a folder `xxx_rcnn` under `configs`. | |
| Please refer to [mmcv](https://mmcv.readthedocs.io/en/latest/understand_mmcv/config.html) for detailed documentation. | |
| ## Config Name Style | |
| We follow the style below to name config files. Contributors are advised to follow the same style. | |
| ``` | |
| {model}_[model setting]_{backbone}_{neck}_[norm setting]_[misc]_[gpu x batch_per_gpu]_{schedule}_{dataset} | |
| ``` | |
| `{xxx}` is a required field and `[yyy]` is optional. | |
| - `{model}`: model type like `faster_rcnn`, `mask_rcnn`, etc. | |
| - `[model setting]`: specific setting for some model, like `without_semantic` for `htc`, `moment` for `reppoints`, etc. | |
| - `{backbone}`: backbone type like `r50` (ResNet-50), `x101` (ResNeXt-101). | |
| - `{neck}`: neck type like `fpn`, `pafpn`, `nasfpn`, `c4`. | |
| - `[norm setting]`: `bn` (Batch Normalization) is used unless specified, other norm layer types could be `gn` (Group Normalization), `syncbn` (Synchronized Batch Normalization). | |
| `gn-head`/`gn-neck` indicates GN is applied in head/neck only, while `gn-all` means GN is applied in the entire model, i.e. backbone, neck, head. | |
| - `[misc]`: miscellaneous setting/plugins of model, e.g. `dconv`, `gcb`, `attention`, `albu`, `mstrain`. | |
| - `[gpu x batch_per_gpu]`: GPUs and samples per GPU, `8x2` is used by default. | |
| - `{schedule}`: training schedule, options are `1x`, `2x`, `20e`, etc. | |
| `1x` and `2x` mean 12 epochs and 24 epochs respectively. | |
| `20e` is adopted in cascade models, which denotes 20 epochs. | |
| For `1x`/`2x`, the initial learning rate decays by a factor of 10 at the 8th/16th and 11th/22nd epochs. | |
| For `20e`, the initial learning rate decays by a factor of 10 at the 16th and 19th epochs. | |
| - `{dataset}`: dataset like `coco`, `cityscapes`, `voc_0712`, `wider_face`. | |
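The schedule names above can be summarized as a small lookup table. The following is a rough sketch, not code from the library: `SCHEDULES` and `lr_at_epoch` are hypothetical names, the base learning rate of 0.02 is borrowed from the Mask R-CNN example in this tutorial, and counting epochs from 0 is an assumption about the indexing convention.

```python
# Schedule name -> total epochs and the epochs at which the LR decays,
# as listed in the naming convention above.
SCHEDULES = {
    '1x': dict(max_epochs=12, step=[8, 11]),
    '2x': dict(max_epochs=24, step=[16, 22]),
    '20e': dict(max_epochs=20, step=[16, 19]),
}


def lr_at_epoch(schedule, epoch, base_lr=0.02, gamma=0.1):
    """Learning rate during `epoch` (assumed to be counted from 0).

    Each decay step that has been reached divides the rate by 10 (gamma=0.1).
    """
    steps = SCHEDULES[schedule]['step']
    num_decays = sum(epoch >= s for s in steps)
    return base_lr * gamma ** num_decays


for name, sched in SCHEDULES.items():
    rates = [lr_at_epoch(name, e) for e in range(sched['max_epochs'])]
    print(name, sched['max_epochs'], 'epochs, final lr:', rates[-1])
```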
| ## Deprecated train_cfg/test_cfg | |
| Specifying `train_cfg` and `test_cfg` at the top level of a config file is deprecated; please specify them inside the model config instead. The original config structure is shown below. | |
| ```python | |
| # deprecated | |
| model = dict( | |
| type=..., | |
| ... | |
| ) | |
| train_cfg=dict(...) | |
| test_cfg=dict(...) | |
| ``` | |
| The migrated config is shown below. | |
| ```python | |
| # recommended | |
| model = dict( | |
| type=..., | |
| ... | |
| train_cfg=dict(...), | |
| test_cfg=dict(...), | |
| ) | |
| ``` | |
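For configs that are already loaded as plain dictionaries, the migration above can be sketched as a small helper. This is only an illustration: `migrate_train_test_cfg` is a hypothetical function, not part of the library, and the example values are placeholders.

```python
import copy


def migrate_train_test_cfg(cfg):
    """Return a copy of cfg with top-level train_cfg/test_cfg moved into model."""
    new_cfg = copy.deepcopy(cfg)
    for key in ('train_cfg', 'test_cfg'):
        if key in new_cfg:
            value = new_cfg.pop(key)
            if key in new_cfg['model']:
                # Refuse to guess which of the two definitions should win.
                raise ValueError(f'{key} is set both at top level and in model')
            new_cfg['model'][key] = value
    return new_cfg


# Deprecated layout (shortened placeholder values).
old = dict(
    model=dict(type='MaskRCNN'),
    train_cfg=dict(rpn=dict(debug=False)),
    test_cfg=dict(rcnn=dict(score_thr=0.05)),
)

new = migrate_train_test_cfg(old)
print(sorted(new))           # top-level keys after migration
print(sorted(new['model']))  # keys inside model after migration
```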
| ## An Example of Mask R-CNN | |
| To help users get a basic idea of a complete config and the modules in a modern detection system, | |
| we make brief comments on the config of Mask R-CNN using ResNet50 and FPN as follows. | |
| For more detailed usage and the corresponding alternatives for each module, please refer to the API documentation. | |
| ```python | |
| model = dict( | |
| type='MaskRCNN', # The name of detector | |
| backbone=dict( # The config of backbone | |
| type='ResNet', # The type of the backbone, refer to https://github.com/open-mmlab/mmdetection/blob/master/mmdet/models/backbones/resnet.py#L308 for more details. | |
| depth=50, # The depth of backbone, usually it is 50 or 101 for ResNet and ResNext backbones. | |
| num_stages=4, # Number of stages of the backbone. | |
| out_indices=(0, 1, 2, 3), # The index of output feature maps produced in each stage | |
| frozen_stages=1, # The weights in the first 1 stage are frozen | |
| norm_cfg=dict( # The config of normalization layers. | |
| type='BN', # Type of norm layer, usually it is BN or GN | |
| requires_grad=True), # Whether to train the gamma and beta in BN | |
| norm_eval=True, # Whether to freeze the statistics in BN | |
| style='pytorch', # The style of backbone, 'pytorch' means that stride 2 layers are in 3x3 conv, 'caffe' means stride 2 layers are in 1x1 convs. | |
| init_cfg=dict(type='Pretrained', checkpoint='torchvision://resnet50')), # The ImageNet pretrained backbone to be loaded | |
| neck=dict( | |
| type='FPN', # The neck of detector is FPN. We also support 'NASFPN', 'PAFPN', etc. Refer to https://github.com/open-mmlab/mmdetection/blob/master/mmdet/models/necks/fpn.py#L10 for more details. | |
| in_channels=[256, 512, 1024, 2048], # The input channels, this is consistent with the output channels of backbone | |
| out_channels=256, # The output channels of each level of the pyramid feature map | |
| num_outs=5), # The number of output scales | |
| rpn_head=dict( | |
| type='RPNHead', # The type of RPN head is 'RPNHead', we also support 'GARPNHead', etc. Refer to https://github.com/open-mmlab/mmdetection/blob/master/mmdet/models/dense_heads/rpn_head.py#L12 for more details. | |
| in_channels=256, # The input channels of each input feature map, this is consistent with the output channels of neck | |
| feat_channels=256, # Feature channels of convolutional layers in the head. | |
| anchor_generator=dict( # The config of anchor generator | |
| type='AnchorGenerator', # Most methods use AnchorGenerator, while SSD detectors use `SSDAnchorGenerator`. Refer to https://github.com/open-mmlab/mmdetection/blob/master/mmdet/core/anchor/anchor_generator.py#L10 for more details | |
| scales=[8], # Basic scale of the anchor, the side length of the anchor in one position of a feature map will be scale * base_sizes | |
| ratios=[0.5, 1.0, 2.0], # The ratio between height and width. | |
| strides=[4, 8, 16, 32, 64]), # The strides of the anchor generator. This is consistent with the FPN feature strides. The strides will be taken as base_sizes if base_sizes is not set. | |
| bbox_coder=dict( # Config of box coder to encode and decode the boxes during training and testing | |
| type='DeltaXYWHBBoxCoder', # Type of box coder. 'DeltaXYWHBBoxCoder' is applied for most of methods. Refer to https://github.com/open-mmlab/mmdetection/blob/master/mmdet/core/bbox/coder/delta_xywh_bbox_coder.py#L9 for more details. | |
| target_means=[0.0, 0.0, 0.0, 0.0], # The target means used to encode and decode boxes | |
| target_stds=[1.0, 1.0, 1.0, 1.0]), # The target standard deviations used to encode and decode boxes | |
| loss_cls=dict( # Config of loss function for the classification branch | |
| type='CrossEntropyLoss', # Type of loss for classification branch, we also support FocalLoss etc. | |
| use_sigmoid=True, # RPN usually performs two-class classification, so it usually uses the sigmoid function. | |
| loss_weight=1.0), # Loss weight of the classification branch. | |
| loss_bbox=dict( # Config of loss function for the regression branch. | |
| type='L1Loss', # Type of loss, we also support many IoU Losses and smooth L1-loss, etc. Refer to https://github.com/open-mmlab/mmdetection/blob/master/mmdet/models/losses/smooth_l1_loss.py#L56 for implementation. | |
| loss_weight=1.0)), # Loss weight of the regression branch. | |
| roi_head=dict( # RoIHead encapsulates the second stage of two-stage/cascade detectors. | |
| type='StandardRoIHead', # Type of the RoI head. Refer to https://github.com/open-mmlab/mmdetection/blob/master/mmdet/models/roi_heads/standard_roi_head.py#L10 for implementation. | |
| bbox_roi_extractor=dict( # RoI feature extractor for bbox regression. | |
| type='SingleRoIExtractor', # Type of the RoI feature extractor, most of methods uses SingleRoIExtractor. Refer to https://github.com/open-mmlab/mmdetection/blob/master/mmdet/models/roi_heads/roi_extractors/single_level.py#L10 for details. | |
| roi_layer=dict( # Config of RoI Layer | |
| type='RoIAlign', # Type of RoI Layer, DeformRoIPoolingPack and ModulatedDeformRoIPoolingPack are also supported. Refer to https://github.com/open-mmlab/mmdetection/blob/master/mmdet/ops/roi_align/roi_align.py#L79 for details. | |
| output_size=7, # The output size of feature maps. | |
| sampling_ratio=0), # Sampling ratio when extracting the RoI features. 0 means adaptive ratio. | |
| out_channels=256, # output channels of the extracted feature. | |
| featmap_strides=[4, 8, 16, 32]), # Strides of multi-scale feature maps. It should be consistent to the architecture of the backbone. | |
| bbox_head=dict( # Config of box head in the RoIHead. | |
| type='Shared2FCBBoxHead', # Type of the bbox head, Refer to https://github.com/open-mmlab/mmdetection/blob/master/mmdet/models/roi_heads/bbox_heads/convfc_bbox_head.py#L177 for implementation details. | |
| in_channels=256, # Input channels for bbox head. This is consistent with the out_channels in roi_extractor | |
| fc_out_channels=1024, # Output feature channels of FC layers. | |
| roi_feat_size=7, # Size of RoI features | |
| num_classes=80, # Number of classes for classification | |
| bbox_coder=dict( # Box coder used in the second stage. | |
| type='DeltaXYWHBBoxCoder', # Type of box coder. 'DeltaXYWHBBoxCoder' is applied for most of methods. | |
| target_means=[0.0, 0.0, 0.0, 0.0], # Means used to encode and decode box | |
| target_stds=[0.1, 0.1, 0.2, 0.2]), # Standard deviations for encoding and decoding. They are smaller since the boxes are more accurate. [0.1, 0.1, 0.2, 0.2] is a conventional setting. | |
| reg_class_agnostic=False, # Whether the regression is class agnostic. | |
| loss_cls=dict( # Config of loss function for the classification branch | |
| type='CrossEntropyLoss', # Type of loss for classification branch, we also support FocalLoss etc. | |
| use_sigmoid=False, # Whether to use sigmoid. | |
| loss_weight=1.0), # Loss weight of the classification branch. | |
| loss_bbox=dict( # Config of loss function for the regression branch. | |
| type='L1Loss', # Type of loss, we also support many IoU Losses and smooth L1-loss, etc. | |
| loss_weight=1.0)), # Loss weight of the regression branch. | |
| mask_roi_extractor=dict( # RoI feature extractor for mask generation. | |
| type='SingleRoIExtractor', # Type of the RoI feature extractor, most of methods uses SingleRoIExtractor. | |
| roi_layer=dict( # Config of RoI Layer that extracts features for instance segmentation | |
| type='RoIAlign', # Type of RoI Layer, DeformRoIPoolingPack and ModulatedDeformRoIPoolingPack are also supported | |
| output_size=14, # The output size of feature maps. | |
| sampling_ratio=0), # Sampling ratio when extracting the RoI features. | |
| out_channels=256, # Output channels of the extracted feature. | |
| featmap_strides=[4, 8, 16, 32]), # Strides of multi-scale feature maps. | |
| mask_head=dict( # Mask prediction head | |
| type='FCNMaskHead', # Type of mask head, refer to https://github.com/open-mmlab/mmdetection/blob/master/mmdet/models/roi_heads/mask_heads/fcn_mask_head.py#L21 for implementation details. | |
| num_convs=4, # Number of convolutional layers in mask head. | |
| in_channels=256, # Input channels, should be consistent with the output channels of mask roi extractor. | |
| conv_out_channels=256, # Output channels of the convolutional layer. | |
| num_classes=80, # Number of class to be segmented. | |
| loss_mask=dict( # Config of loss function for the mask branch. | |
| type='CrossEntropyLoss', # Type of loss used for segmentation | |
| use_mask=True, # Whether to only train the mask in the correct class. | |
| loss_weight=1.0))), # Loss weight of mask branch. | |
| train_cfg = dict( # Config of training hyperparameters for rpn and rcnn | |
| rpn=dict( # Training config of rpn | |
| assigner=dict( # Config of assigner | |
| type='MaxIoUAssigner', # Type of assigner, MaxIoUAssigner is used for many common detectors. Refer to https://github.com/open-mmlab/mmdetection/blob/master/mmdet/core/bbox/assigners/max_iou_assigner.py#L10 for more details. | |
| pos_iou_thr=0.7, # IoU >= threshold 0.7 will be taken as positive samples | |
| neg_iou_thr=0.3, # IoU < threshold 0.3 will be taken as negative samples | |
| min_pos_iou=0.3, # The minimal IoU threshold to take boxes as positive samples | |
| match_low_quality=True, # Whether to match the boxes under low quality (see API doc for more details). | |
| ignore_iof_thr=-1), # IoF threshold for ignoring bboxes | |
| sampler=dict( # Config of positive/negative sampler | |
| type='RandomSampler', # Type of sampler, PseudoSampler and other samplers are also supported. Refer to https://github.com/open-mmlab/mmdetection/blob/master/mmdet/core/bbox/samplers/random_sampler.py#L8 for implementation details. | |
| num=256, # Number of samples | |
| pos_fraction=0.5, # The ratio of positive samples in the total samples. | |
| neg_pos_ub=-1, # The upper bound of negative samples based on the number of positive samples. | |
| add_gt_as_proposals=False), # Whether add GT as proposals after sampling. | |
| allowed_border=-1, # The border allowed after padding for valid anchors. | |
| pos_weight=-1, # The weight of positive samples during training. | |
| debug=False), # Whether to set the debug mode | |
| rpn_proposal=dict( # The config to generate proposals during training | |
| nms_across_levels=False, # Whether to do NMS for boxes across levels. Only works in `GARPNHead`; the naive RPN does not support NMS across levels. | |
| nms_pre=2000, # The number of boxes before NMS | |
| nms_post=1000, # The number of boxes to be kept by NMS. Only works in `GARPNHead`. | |
| max_per_img=1000, # The number of boxes to be kept after NMS. | |
| nms=dict( # Config of NMS | |
| type='nms', # Type of NMS | |
| iou_threshold=0.7 # NMS threshold | |
| ), | |
| min_bbox_size=0), # The allowed minimal box size | |
| rcnn=dict( # The config for the roi heads. | |
| assigner=dict( # Config of assigner for the second stage, this is different from that in rpn | |
| type='MaxIoUAssigner', # Type of assigner, MaxIoUAssigner is used for all roi_heads for now. Refer to https://github.com/open-mmlab/mmdetection/blob/master/mmdet/core/bbox/assigners/max_iou_assigner.py#L10 for more details. | |
| pos_iou_thr=0.5, # IoU >= threshold 0.5 will be taken as positive samples | |
| neg_iou_thr=0.5, # IoU < threshold 0.5 will be taken as negative samples | |
| min_pos_iou=0.5, # The minimal IoU threshold to take boxes as positive samples | |
| match_low_quality=False, # Whether to match the boxes under low quality (see API doc for more details). | |
| ignore_iof_thr=-1), # IoF threshold for ignoring bboxes | |
| sampler=dict( | |
| type='RandomSampler', # Type of sampler, PseudoSampler and other samplers are also supported. Refer to https://github.com/open-mmlab/mmdetection/blob/master/mmdet/core/bbox/samplers/random_sampler.py#L8 for implementation details. | |
| num=512, # Number of samples | |
| pos_fraction=0.25, # The ratio of positive samples in the total samples. | |
| neg_pos_ub=-1, # The upper bound of negative samples based on the number of positive samples. | |
| add_gt_as_proposals=True | |
| ), # Whether add GT as proposals after sampling. | |
| mask_size=28, # Size of mask | |
| pos_weight=-1, # The weight of positive samples during training. | |
| debug=False)), # Whether to set the debug mode | |
| test_cfg = dict( # Config for testing hyperparameters for rpn and rcnn | |
| rpn=dict( # The config to generate proposals during testing | |
| nms_across_levels=False, # Whether to do NMS for boxes across levels. Only works in `GARPNHead`; the naive RPN does not support NMS across levels. | |
| nms_pre=1000, # The number of boxes before NMS | |
| nms_post=1000, # The number of boxes to be kept by NMS. Only works in `GARPNHead`. | |
| max_per_img=1000, # The number of boxes to be kept after NMS. | |
| nms=dict( # Config of NMS | |
| type='nms', # Type of NMS | |
| iou_threshold=0.7 # NMS threshold | |
| ), | |
| min_bbox_size=0), # The allowed minimal box size | |
| rcnn=dict( # The config for the roi heads. | |
| score_thr=0.05, # Threshold to filter out boxes | |
| nms=dict( # Config of NMS in the second stage | |
| type='nms', # Type of NMS | |
| iou_threshold=0.5), # NMS threshold | |
| max_per_img=100, # Max number of detections of each image | |
| mask_thr_binary=0.5))) # Threshold of mask prediction | |
| dataset_type = 'CocoDataset' # Dataset type, this will be used to define the dataset | |
| data_root = 'data/coco/' # Root path of data | |
| img_norm_cfg = dict( # Image normalization config to normalize the input images | |
| mean=[123.675, 116.28, 103.53], # Mean values used to pre-train the pre-trained backbone models | |
| std=[58.395, 57.12, 57.375], # Standard deviations used to pre-train the pre-trained backbone models | |
| to_rgb=True | |
| ) # The channel order of the images used to pre-train the pre-trained backbone models | |
| train_pipeline = [ # Training pipeline | |
| dict(type='LoadImageFromFile'), # First pipeline to load images from file path | |
| dict( | |
| type='LoadAnnotations', # Second pipeline to load annotations for current image | |
| with_bbox=True, # Whether to use bounding box, True for detection | |
| with_mask=True, # Whether to use instance mask, True for instance segmentation | |
| poly2mask=False), # Whether to convert the polygon mask to instance mask, set False for acceleration and to save memory | |
| dict( | |
| type='Resize', # Augmentation pipeline that resizes the images and their annotations | |
| img_scale=(1333, 800), # The largest scale of image | |
| keep_ratio=True | |
| ), # Whether to keep the ratio between height and width. | |
| dict( | |
| type='RandomFlip', # Augmentation pipeline that flips the images and their annotations | |
| flip_ratio=0.5), # The ratio or probability to flip | |
| dict( | |
| type='Normalize', # Augmentation pipeline that normalizes the input images | |
| mean=[123.675, 116.28, 103.53], # These keys are the same as in img_norm_cfg since the | |
| std=[58.395, 57.12, 57.375], # keys of img_norm_cfg are used here as arguments | |
| to_rgb=True), | |
| dict( | |
| type='Pad', # Padding config | |
| size_divisor=32), # The number the padded image size should be divisible by | |
| dict(type='DefaultFormatBundle'), # Default format bundle to gather data in the pipeline | |
| dict( | |
| type='Collect', # Pipeline that decides which keys in the data should be passed to the detector | |
| keys=['img', 'gt_bboxes', 'gt_labels', 'gt_masks']) | |
| ] | |
| test_pipeline = [ | |
| dict(type='LoadImageFromFile'), # First pipeline to load images from file path | |
| dict( | |
| type='MultiScaleFlipAug', # An encapsulation that encapsulates the testing augmentations | |
| img_scale=(1333, 800), # Decides the largest scale for testing, used for the Resize pipeline | |
| flip=False, # Whether to flip images during testing | |
| transforms=[ | |
| dict(type='Resize', # Use resize augmentation | |
| keep_ratio=True), # Whether to keep the ratio between height and width, the img_scale set here will be overridden by the img_scale set above. | |
| dict(type='RandomFlip'), # Though RandomFlip is added in pipeline, it is not used because flip=False | |
| dict( | |
| type='Normalize', # Normalization config, the values are from img_norm_cfg | |
| mean=[123.675, 116.28, 103.53], | |
| std=[58.395, 57.12, 57.375], | |
| to_rgb=True), | |
| dict( | |
| type='Pad', # Padding config to pad images divisible by 32. | |
| size_divisor=32), | |
| dict( | |
| type='ImageToTensor', # convert image to tensor | |
| keys=['img']), | |
| dict( | |
| type='Collect', # Collect pipeline that collect necessary keys for testing. | |
| keys=['img']) | |
| ]) | |
| ] | |
| data = dict( | |
| samples_per_gpu=2, # Batch size of a single GPU | |
| workers_per_gpu=2, # Worker to pre-fetch data for each single GPU | |
| train=dict( # Train dataset config | |
| type='CocoDataset', # Type of dataset, refer to https://github.com/open-mmlab/mmdetection/blob/master/mmdet/datasets/coco.py#L19 for details. | |
| ann_file='data/coco/annotations/instances_train2017.json', # Path of annotation file | |
| img_prefix='data/coco/train2017/', # Prefix of image path | |
| pipeline=[ # pipeline, this is passed by the train_pipeline created before. | |
| dict(type='LoadImageFromFile'), | |
| dict( | |
| type='LoadAnnotations', | |
| with_bbox=True, | |
| with_mask=True, | |
| poly2mask=False), | |
| dict(type='Resize', img_scale=(1333, 800), keep_ratio=True), | |
| dict(type='RandomFlip', flip_ratio=0.5), | |
| dict( | |
| type='Normalize', | |
| mean=[123.675, 116.28, 103.53], | |
| std=[58.395, 57.12, 57.375], | |
| to_rgb=True), | |
| dict(type='Pad', size_divisor=32), | |
| dict(type='DefaultFormatBundle'), | |
| dict( | |
| type='Collect', | |
| keys=['img', 'gt_bboxes', 'gt_labels', 'gt_masks']) | |
| ]), | |
| val=dict( # Validation dataset config | |
| type='CocoDataset', | |
| ann_file='data/coco/annotations/instances_val2017.json', | |
| img_prefix='data/coco/val2017/', | |
| pipeline=[ # Pipeline is passed by test_pipeline created before | |
| dict(type='LoadImageFromFile'), | |
| dict( | |
| type='MultiScaleFlipAug', | |
| img_scale=(1333, 800), | |
| flip=False, | |
| transforms=[ | |
| dict(type='Resize', keep_ratio=True), | |
| dict(type='RandomFlip'), | |
| dict( | |
| type='Normalize', | |
| mean=[123.675, 116.28, 103.53], | |
| std=[58.395, 57.12, 57.375], | |
| to_rgb=True), | |
| dict(type='Pad', size_divisor=32), | |
| dict(type='ImageToTensor', keys=['img']), | |
| dict(type='Collect', keys=['img']) | |
| ]) | |
| ]), | |
| test=dict( # Test dataset config, modify the ann_file for test-dev/test submission | |
| type='CocoDataset', | |
| ann_file='data/coco/annotations/instances_val2017.json', | |
| img_prefix='data/coco/val2017/', | |
| pipeline=[ # Pipeline is passed by test_pipeline created before | |
| dict(type='LoadImageFromFile'), | |
| dict( | |
| type='MultiScaleFlipAug', | |
| img_scale=(1333, 800), | |
| flip=False, | |
| transforms=[ | |
| dict(type='Resize', keep_ratio=True), | |
| dict(type='RandomFlip'), | |
| dict( | |
| type='Normalize', | |
| mean=[123.675, 116.28, 103.53], | |
| std=[58.395, 57.12, 57.375], | |
| to_rgb=True), | |
| dict(type='Pad', size_divisor=32), | |
| dict(type='ImageToTensor', keys=['img']), | |
| dict(type='Collect', keys=['img']) | |
| ]) | |
| ], | |
| samples_per_gpu=2 # Batch size of a single GPU used in testing | |
| )) | |
| evaluation = dict( # The config to build the evaluation hook, refer to https://github.com/open-mmlab/mmdetection/blob/master/mmdet/core/evaluation/eval_hooks.py#L7 for more details. | |
| interval=1, # Evaluation interval | |
| metric=['bbox', 'segm']) # Metrics used during evaluation | |
| optimizer = dict( # Config used to build optimizer, support all the optimizers in PyTorch whose arguments are also the same as those in PyTorch | |
| type='SGD', # Type of optimizers, refer to https://github.com/open-mmlab/mmdetection/blob/master/mmdet/core/optimizer/default_constructor.py#L13 for more details | |
| lr=0.02, # Learning rate of optimizers, see detail usages of the parameters in the documentation of PyTorch | |
| momentum=0.9, # Momentum | |
| weight_decay=0.0001) # Weight decay of SGD | |
| optimizer_config = dict( # Config used to build the optimizer hook, refer to https://github.com/open-mmlab/mmcv/blob/master/mmcv/runner/hooks/optimizer.py#L8 for implementation details. | |
| grad_clip=None) # Most of the methods do not use gradient clip | |
| lr_config = dict( # Learning rate scheduler config used to register LrUpdater hook | |
| policy='step', # The policy of scheduler, also support CosineAnnealing, Cyclic, etc. Refer to details of supported LrUpdater from https://github.com/open-mmlab/mmcv/blob/master/mmcv/runner/hooks/lr_updater.py#L9. | |
| warmup='linear', # The warmup policy, also support `exp` and `constant`. | |
| warmup_iters=500, # The number of iterations for warmup | |
| warmup_ratio= | |
| 0.001, # The ratio of the starting learning rate used for warmup | |
| step=[8, 11]) # Steps to decay the learning rate | |
| runner = dict( | |
| type='EpochBasedRunner', # Type of runner to use (i.e. IterBasedRunner or EpochBasedRunner) | |
| max_epochs=12) # Runner that runs the workflow in total max_epochs. For IterBasedRunner use `max_iters` | |
| checkpoint_config = dict( # Config to set the checkpoint hook, Refer to https://github.com/open-mmlab/mmcv/blob/master/mmcv/runner/hooks/checkpoint.py for implementation. | |
| interval=1) # The save interval is 1 | |
| log_config = dict( # config to register logger hook | |
| interval=50, # Interval to print the log | |
| hooks=[ | |
| dict(type='TextLoggerHook', by_epoch=False), | |
| dict(type='TensorboardLoggerHook', by_epoch=False), | |
| dict(type='MMDetWandbHook', by_epoch=False, # The Wandb logger is also supported, It requires `wandb` to be installed. | |
| init_kwargs={'entity': "OpenMMLab", # The entity used to log on Wandb | |
| 'project': "MMDet", # Project name in WandB | |
| 'config': cfg_dict}), # Check https://docs.wandb.ai/ref/python/init for more init arguments. | |
| # MMDetWandbHook is mmdet implementation of WandbLoggerHook. ClearMLLoggerHook, DvcliveLoggerHook, MlflowLoggerHook, NeptuneLoggerHook, PaviLoggerHook, SegmindLoggerHook are also supported based on MMCV implementation. | |
| ]) # The logger used to record the training process. | |
| dist_params = dict(backend='nccl') # Parameters to setup distributed training, the port can also be set. | |
| log_level = 'INFO' # The level of logging. | |
| load_from = None # Load a model as a pre-trained model from a given path. This will not resume training. | |
| resume_from = None # Resume checkpoints from a given path, the training will be resumed from the epoch at which the checkpoint was saved. | |
| workflow = [('train', 1)] # Workflow for runner. [('train', 1)] means there is only one workflow and the workflow named 'train' is executed once. The workflow trains the model for 12 epochs according to max_epochs. | |
| work_dir = 'work_dir' # Directory to save the model checkpoints and logs for the current experiments. | |
| ``` | |
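To make the `anchor_generator` settings more concrete, the sketch below computes the anchor shapes they imply at each pyramid level. It is a simplified illustration under stated assumptions, not the library's generator: `anchor_sizes` is a hypothetical helper, the base size is taken to be the stride (as the `strides` comment describes), and each ratio is interpreted as height divided by width with the anchor area held fixed.

```python
import math


def anchor_sizes(stride, scales, ratios):
    """Return (width, height) pairs for one feature level.

    Assumes base_size == stride and ratio == height / width, so every
    anchor for a given scale has area (stride * scale) ** 2.
    """
    sizes = []
    for ratio in ratios:
        h_ratio = math.sqrt(ratio)
        w_ratio = 1 / h_ratio
        for scale in scales:
            side = stride * scale
            sizes.append((side * w_ratio, side * h_ratio))
    return sizes


# Values taken from the rpn_head config above.
for stride in [4, 8, 16, 32, 64]:
    shapes = anchor_sizes(stride, scales=[8], ratios=[0.5, 1.0, 2.0])
    print(stride, [(round(w, 1), round(h, 1)) for w, h in shapes])
```

With these settings each level gets three anchors per position, and the square anchor's side doubles from one level to the next along with the stride.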
| ## FAQ | |
| ### Ignore some fields in the base configs | |
| Sometimes, you may set `_delete_=True` to ignore some of the fields in base configs. | |
| You may refer to [mmcv](https://mmcv.readthedocs.io/en/latest/understand_mmcv/config.html#inherit-from-base-config-with-ignored-fields) for a simple illustration. | |
| For example, in MMDetection, suppose we want to change the backbone of Mask R-CNN, which has the following config. | |
| ```python | |
| model = dict( | |
| type='MaskRCNN', | |
| pretrained='torchvision://resnet50', | |
| backbone=dict( | |
| type='ResNet', | |
| depth=50, | |
| num_stages=4, | |
| out_indices=(0, 1, 2, 3), | |
| frozen_stages=1, | |
| norm_cfg=dict(type='BN', requires_grad=True), | |
| norm_eval=True, | |
| style='pytorch'), | |
| neck=dict(...), | |
| rpn_head=dict(...), | |
| roi_head=dict(...)) | |
| ``` | |
| `ResNet` and `HRNet` use different keywords for construction, so the `backbone` fields of the base config must be discarded rather than merged. | |
| ```python | |
| _base_ = '../mask_rcnn/mask_rcnn_r50_fpn_1x_coco.py' | |
| model = dict( | |
| pretrained='open-mmlab://msra/hrnetv2_w32', | |
| backbone=dict( | |
| _delete_=True, | |
| type='HRNet', | |
| extra=dict( | |
| stage1=dict( | |
| num_modules=1, | |
| num_branches=1, | |
| block='BOTTLENECK', | |
| num_blocks=(4, ), | |
| num_channels=(64, )), | |
| stage2=dict( | |
| num_modules=1, | |
| num_branches=2, | |
| block='BASIC', | |
| num_blocks=(4, 4), | |
| num_channels=(32, 64)), | |
| stage3=dict( | |
| num_modules=4, | |
| num_branches=3, | |
| block='BASIC', | |
| num_blocks=(4, 4, 4), | |
| num_channels=(32, 64, 128)), | |
| stage4=dict( | |
| num_modules=3, | |
| num_branches=4, | |
| block='BASIC', | |
| num_blocks=(4, 4, 4, 4), | |
| num_channels=(32, 64, 128, 256)))), | |
| neck=dict(...)) | |
| ``` | |
| Setting `_delete_=True` replaces all old keys in the `backbone` field with the new keys. | |
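The effect of `_delete_=True` can be illustrated with a small recursive merge over plain dictionaries. This is a sketch of the idea only, not the config loader's real code: `merge_config` is a hypothetical helper and the configs are heavily shortened.

```python
import copy


def merge_config(base, child):
    """Recursively merge child into a copy of base, honouring `_delete_`."""
    merged = copy.deepcopy(base)
    for key, value in child.items():
        if isinstance(value, dict):
            value = copy.deepcopy(value)
            delete = value.pop('_delete_', False)
            if not delete and isinstance(merged.get(key), dict):
                # Default behaviour: keep base keys and overlay child keys.
                merged[key] = merge_config(merged[key], value)
            else:
                # `_delete_=True` (or no base dict): start from scratch.
                merged[key] = merge_config({}, value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


base = dict(model=dict(
    type='MaskRCNN',
    backbone=dict(type='ResNet', depth=50, style='pytorch')))

hrnet = dict(type='HRNet', extra=dict(stage1=dict(num_modules=1)))
without_delete = merge_config(base, dict(model=dict(backbone=dict(hrnet))))
with_delete = merge_config(
    base, dict(model=dict(backbone=dict(hrnet, _delete_=True))))

print(sorted(without_delete['model']['backbone']))  # ResNet keys leak through
print(sorted(with_delete['model']['backbone']))     # only the HRNet keys remain
```

Without `_delete_`, leftover keys such as `depth` and `style` would be passed to `HRNet`, which does not expect them; with it, only the keys defined in the child config survive, while sibling fields such as `model.type` are still inherited.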
| ### Use intermediate variables in configs | |
| Some intermediate variables are used in the config files, like `train_pipeline`/`test_pipeline` in datasets. | |
| It's worth noting that when modifying intermediate variables in the children configs, users need to pass the intermediate variables into the corresponding fields again. | |
| For example, we would like to use a multi-scale strategy to train a Mask R-CNN. `train_pipeline`/`test_pipeline` are the intermediate variables we would like to modify. | |
| ```python | |
| _base_ = './mask_rcnn_r50_fpn_1x_coco.py' | |
| img_norm_cfg = dict( | |
| mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True) | |
| train_pipeline = [ | |
| dict(type='LoadImageFromFile'), | |
| dict(type='LoadAnnotations', with_bbox=True, with_mask=True), | |
| dict( | |
| type='Resize', | |
| img_scale=[(1333, 640), (1333, 672), (1333, 704), (1333, 736), | |
| (1333, 768), (1333, 800)], | |
| multiscale_mode="value", | |
| keep_ratio=True), | |
| dict(type='RandomFlip', flip_ratio=0.5), | |
| dict(type='Normalize', **img_norm_cfg), | |
| dict(type='Pad', size_divisor=32), | |
| dict(type='DefaultFormatBundle'), | |
| dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels', 'gt_masks']), | |
| ] | |
| test_pipeline = [ | |
| dict(type='LoadImageFromFile'), | |
| dict( | |
| type='MultiScaleFlipAug', | |
| img_scale=(1333, 800), | |
| flip=False, | |
| transforms=[ | |
| dict(type='Resize', keep_ratio=True), | |
| dict(type='RandomFlip'), | |
| dict(type='Normalize', **img_norm_cfg), | |
| dict(type='Pad', size_divisor=32), | |
| dict(type='ImageToTensor', keys=['img']), | |
| dict(type='Collect', keys=['img']), | |
| ]) | |
| ] | |
| data = dict( | |
| train=dict(pipeline=train_pipeline), | |
| val=dict(pipeline=test_pipeline), | |
| test=dict(pipeline=test_pipeline)) | |
| ``` | |
| We first define the new `train_pipeline`/`test_pipeline` and pass them into `data`. | |
| Similarly, if we would like to switch from `SyncBN` to `BN` or `MMSyncBN`, we need to substitute every `norm_cfg` in the config. | |
| ```python | |
| _base_ = './mask_rcnn_r50_fpn_1x_coco.py' | |
| norm_cfg = dict(type='BN', requires_grad=True) | |
| model = dict( | |
| backbone=dict(norm_cfg=norm_cfg), | |
| neck=dict(norm_cfg=norm_cfg), | |
| ...) | |
| ``` | |
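Why intermediate variables must be passed into their fields again can be seen from a plain-Python analogy. This is only a sketch of the name-binding behaviour, not a description of how config files are actually loaded; the pipelines are shortened placeholders.

```python
# "Base config": the pipeline list is captured by `data` when `data` is built.
train_pipeline = [dict(type='LoadImageFromFile')]
data = dict(train=dict(pipeline=train_pipeline))

# "Child config": rebinding the name creates a new list; `data` is untouched.
train_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='RandomFlip', flip_ratio=0.5),
]
stale_length = len(data['train']['pipeline'])

# Passing the variable into the field again makes the change take effect.
data = dict(train=dict(pipeline=train_pipeline))
fresh_length = len(data['train']['pipeline'])

print(stale_length, fresh_length)
```

Until `data` is rebuilt, it still refers to the pipeline that existed when it was first defined, which is why the child configs above redefine `data` (or `model`) after changing the variable.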