File size: 6,166 Bytes
032e687
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
# Using Pretrained Backbones
This document provides a brief intro of the usage of builtin backbones in detrex.

## ResNet Backbone
### Build ResNet Default Backbone
We modified detectron2 default builtin ResNet models to fit the Lazy Config system. Here we introduce how to implement `ResNet` models or modify it in your own config files.

- Build the default `ResNet-50` backbone

```python
# config.py

from detrex.modeling.backbone import ResNet, BasicStem

from detectron2.config import LazyCall as L

backbone=L(ResNet)(
    stem=L(BasicStem)(in_channels=3, out_channels=64, norm="FrozenBN"),
    stages=L(ResNet.make_default_stages)(
        depth=50,
        stride_in_1x1=False,
        norm="FrozenBN",
    ),
    out_features=["res2", "res3", "res4", "res5"],
    freeze_at=1,
)
```
**Notes:**
- `stem`: The standard ResNet stem with a `conv`, `relu` and `max_pool`, we usually set `norm="FrozenBN"` to use `FrozenBatchNorm2D` layer in backbone.
- `ResNet.make_default_stages`: This is method which builds the regular ResNet intermediate stages. Set `depth={18, 34, 50, 101, 152}` to build `ResNet-depth` models.
- `out_features`: Set `["res2", "res3"]` to return the intermediate features from the second and third stages.
- `freeze_at`: Set `freeze_at=1` to frozen the backbone at the first stage.

### Build the Modified ResNet Models
- Build `ResNet-DC5` models

```python
from detrex.modeling.backbone import ResNet, BasicStem, make_stage

from detectron2.config import LazyCall as L

backbone=L(ResNet)(
    stem=L(BasicStem)(in_channels=3, out_channels=64, norm="FrozenBN"),
    stages=L(make_stage)(
        depth=50,
        stride_in_1x1=False,
        norm="FrozenBN",
        res5_dilation=2,
    ),
    out_features=["res2", "res3", "res4", "res5"],
    freeze_at=1,
)
```
- Using the modified `make_stage` function and set `res5_dilation=2` to build `ResNet-DC5` models.
- More details can be found in `make_stage` function [API documentation](https://detrex.readthedocs.io/en/latest/modules/detrex.modeling.html#detrex.modeling.backbone.ResNet.make_stage)


## Timm Backbone
detrex provides a wrapper for [Pytorch Image Models(timm)](https://github.com/rwightman/pytorch-image-models) to use its pretrained backbone networks. Support you want to use the pretrained [ResNet-152-D](https://github.com/rwightman/pytorch-image-models/blob/a520da9b495422bc773fb5dfe10819acb8bd7c5c/timm/models/resnet.py#L867) model as the backbone of `DINO`, you can modify your config as following:

```python
from detectron2.config import LazyCall as L
from detectron2.modeling import ShapeSpec
from detectron2.layers import FrozenBatchNorm2d

# inherit configs from "dino_r50_4scale_12ep"
from .dino_r50_4scale_12ep import (
    train,
    dataloader,
    optimizer,
    lr_multiplier,
)
from .models.dino_r50 import model

from detrex.modeling.backbone import TimmBackbone

# modify backbone configs
model.backbone = L(TimmBackbone)(
    model_name="resnet152d",  # name in timm
    features_only=True,
    pretrained=True,
    in_channels=3,
    out_indices=(1, 2, 3),
    norm_layer=FrozenBatchNorm2d,
)

# modify neck configs
model.neck.input_shapes = {
    "p1": ShapeSpec(channels=256),
    "p2": ShapeSpec(channels=512),
    "p3": ShapeSpec(channels=1024),
}
model.neck.in_features = ["p1", "p2", "p3"]

# modify training configs
train.init_checkpoint = ""
```
- Set `pretrained=True` which will automatically download pretrained weights from timm.
- Set `features_only=True` to turn timm models into feature extractor.
- Set `out_indices=(1, 2, 3)` which will return the intermediate output feature dict as `{"p1": torch.Tensor, "p2": torch.Tensor, "p3": torch.Tensor}`. 
- Set `norm_layer=nn.Module` to specify the norm layers in backbone, e.g., `norm_layer=FrozenBatchNorm2d` to freeze the norm layers.
- If you want to use timm backbone with your own pretrained weight, please set `pretrained=False` and update `train.init_checkpoint = "path/to/your/own/pretrained_weight/"`

More details can be found in [timm_example.py](https://github.com/IDEA-Research/detrex/blob/main/projects/dino/configs/timm_example.py)

## Torchvision Backbone
detrex also provides a wrapper for [Torchvision](https://github.com/IDEA-Research/detrex/blob/main/detrex/modeling/backbone/torchvision_backbone.py) to use its pretrained backbone networks. Support you want to use [ResNet-50] model as the backbone of `DINO`, you can modify your config as following:

```python
from detectron2.config import LazyCall as L
from detectron2.modeling import ShapeSpec

# inherit configs from "dino_r50_4scale_12ep"
from .dino_r50_4scale_12ep import (
    train,
    dataloader,
    optimizer,
    lr_multiplier,
)
from .models.dino_r50 import model

from detrex.modeling.backbone import TorchvisionBackbone

# modify backbone configs
model.backbone = L(TorchvisionBackbone)(
    model_name="resnet50",
    pretrained=True,
    # specify the return nodes
    return_nodes = {
        "layer2": "res3",
        "layer3": "res4",
        "layer4": "res5",
    },
)

# modify neck configs
model.neck.input_shapes = {
    "res3": ShapeSpec(channels=512),
    "res4": ShapeSpec(channels=1024),
    "res5": ShapeSpec(channels=2048),
}
model.neck.in_features = ["res3", "res4", "res5"]

# modify training configs
train.init_checkpoint = ""
```
After **torchvision 1.10**, torchvision provides `torchvision.models.feature_extraction` package for feature extraction utilities which help the users to access intermediate outputs of the model. More details can be found in [Feature extraction for model inspection](https://pytorch.org/vision/stable/feature_extraction.html). 

Users need to specify the `return_nodes` args to be the output nodes for extracted features, which requires the users to be familiar with the node naming. Please check the [About Node Names](https://pytorch.org/vision/stable/feature_extraction.html#) part of the official documentation for more details. Or check the usage of [get_graph_node_names](https://pytorch.org/vision/stable/generated/torchvision.models.feature_extraction.get_graph_node_names.html#torchvision.models.feature_extraction.get_graph_node_names).