python profile.py --no_grad --fb -j 3

           Type  Time (%)  Time (ms)  Calls       Avg       Min       Max  Name
GPU activities:      66.1     17.761    120  148.01us  31.840us  331.94us  void spatialDepthwiseConvolutionUpdateOutput,...
GPU activities:      3.31    0.88993    120  7.4160us  5.3120us  9.5040us  void CatArrayBatchedCopy,...
GPU activities:      1.38    0.37143     60  6.1900us  1.6000us  13.568us  void kernelPointwiseApply2