## v2.0.0
The OpenMMLab team released a new generation of training engine MMEngine at the World Artificial Intelligence Conference on September 1, 2022. It is a foundational library for training deep learning models. Compared with MMCV, it provides a universal and powerful runner, an open architecture with a more unified interface, and a more customizable training process.
The OpenMMLab team released MMCV v2.0.0 on April 6, 2023. The 2.x series brings the following significant changes:
(1) It removed the following components:

- `mmcv.fileio` module, removed in PR #2179. The FileIO module from mmengine will be used wherever required.
- `mmcv.runner`, `mmcv.parallel`, `mmcv.engine` and `mmcv.device`, removed in PR #2216.
- All classes in `mmcv.utils` (e.g., `Config` and `Registry`) and many functions, removed in PR #2217. Only a few functions related to mmcv are reserved.
- `mmcv.onnx`, `mmcv.tensorrt` modules and related functions, removed in PR #2225.
- All root registrars in MMCV; classes and functions are now registered to the root registrar in MMEngine.
(2) It added the `mmcv.transforms` data transformation module.
(3) It renamed the package `mmcv` to `mmcv-lite` and `mmcv-full` to `mmcv` in PR #2235. It also changed the default value of the environment variable `MMCV_WITH_OPS` from 0 to 1.
| MMCV < 2.0 | MMCV >= 2.0 |
| :---: | :---: |
| `mmcv-full` (with ops) | `mmcv` |
| `mmcv` (without ops) | `mmcv-lite` |
## v1.3.18
Some ops have different implementations on different devices. Lots of macros and type checks are scattered across several files, which makes the code hard to maintain. For example:
```cpp
if (input.device().is_cuda()) {
#ifdef MMCV_WITH_CUDA
  CHECK_CUDA_INPUT(input);
  CHECK_CUDA_INPUT(rois);
  CHECK_CUDA_INPUT(output);
  CHECK_CUDA_INPUT(argmax_y);
  CHECK_CUDA_INPUT(argmax_x);

  roi_align_forward_cuda(input, rois, output, argmax_y, argmax_x,
                         aligned_height, aligned_width, spatial_scale,
                         sampling_ratio, pool_mode, aligned);
#else
  AT_ERROR("RoIAlign is not compiled with GPU support");
#endif
} else {
  CHECK_CPU_INPUT(input);
  CHECK_CPU_INPUT(rois);
  CHECK_CPU_INPUT(output);
  CHECK_CPU_INPUT(argmax_y);
  CHECK_CPU_INPUT(argmax_x);

  roi_align_forward_cpu(input, rois, output, argmax_y, argmax_x,
                        aligned_height, aligned_width, spatial_scale,
                        sampling_ratio, pool_mode, aligned);
}
```
A registry and a dispatcher were added to manage these implementations.
```cpp
void ROIAlignForwardCUDAKernelLauncher(Tensor input, Tensor rois, Tensor output,
                                       Tensor argmax_y, Tensor argmax_x,
                                       int aligned_height, int aligned_width,
                                       float spatial_scale, int sampling_ratio,
                                       int pool_mode, bool aligned);

void roi_align_forward_cuda(Tensor input, Tensor rois, Tensor output,
                            Tensor argmax_y, Tensor argmax_x,
                            int aligned_height, int aligned_width,
                            float spatial_scale, int sampling_ratio,
                            int pool_mode, bool aligned) {
  ROIAlignForwardCUDAKernelLauncher(
      input, rois, output, argmax_y, argmax_x, aligned_height, aligned_width,
      spatial_scale, sampling_ratio, pool_mode, aligned);
}

// register cuda implementation
void roi_align_forward_impl(Tensor input, Tensor rois, Tensor output,
                            Tensor argmax_y, Tensor argmax_x,
                            int aligned_height, int aligned_width,
                            float spatial_scale, int sampling_ratio,
                            int pool_mode, bool aligned);
REGISTER_DEVICE_IMPL(roi_align_forward_impl, CUDA, roi_align_forward_cuda);
```
```cpp
// roi_align.cpp
// use the dispatcher to invoke different implementations depending on the
// device type of the input tensors.
void roi_align_forward_impl(Tensor input, Tensor rois, Tensor output,
                            Tensor argmax_y, Tensor argmax_x,
                            int aligned_height, int aligned_width,
                            float spatial_scale, int sampling_ratio,
                            int pool_mode, bool aligned) {
  DISPATCH_DEVICE_IMPL(roi_align_forward_impl, input, rois, output, argmax_y,
                       argmax_x, aligned_height, aligned_width, spatial_scale,
                       sampling_ratio, pool_mode, aligned);
}
```
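The idea behind this registry/dispatcher pattern can be sketched in a minimal, self-contained form. The names below (`RegisterDeviceImpl`, `roi_align_forward_impl` taking a device string, the toy return values) are hypothetical stand-ins for illustration, not MMCV's actual macros: implementations record themselves in a table keyed by device type at static-initialization time, and the dispatcher looks up the matching entry at call time.

```cpp
#include <functional>
#include <map>
#include <stdexcept>
#include <string>

// Implementation table keyed by device type (stand-in for the MMCV registry).
using RoiAlignImpl = std::function<std::string(int, int)>;

static std::map<std::string, RoiAlignImpl>& registry() {
  static std::map<std::string, RoiAlignImpl> table;
  return table;
}

// Constructing one of these at namespace scope records an implementation,
// mimicking what a REGISTER_DEVICE_IMPL-style macro would expand to.
struct RegisterDeviceImpl {
  RegisterDeviceImpl(const std::string& device, RoiAlignImpl impl) {
    registry()[device] = std::move(impl);
  }
};

// Toy per-device "implementations"; they just report where they ran.
static std::string roi_align_forward_cpu(int h, int w) {
  return "cpu:" + std::to_string(h * w);
}
static std::string roi_align_forward_cuda(int h, int w) {
  return "cuda:" + std::to_string(h * w);
}

static RegisterDeviceImpl reg_cpu("cpu", roi_align_forward_cpu);
static RegisterDeviceImpl reg_cuda("cuda", roi_align_forward_cuda);

// Dispatcher: pick the implementation registered for the input's device type,
// mimicking a DISPATCH_DEVICE_IMPL-style macro.
static std::string roi_align_forward_impl(const std::string& device, int h, int w) {
  auto it = registry().find(device);
  if (it == registry().end())
    throw std::runtime_error("no implementation registered for " + device);
  return it->second(h, w);
}
```

With this layout, adding support for a new backend only requires registering one more implementation; the dispatcher and the op's public entry point stay untouched, which is exactly what removes the scattered `#ifdef`/device checks shown earlier.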
## v1.3.11
In order to flexibly support more backends and hardware, such as NVIDIA GPUs and AMD GPUs, the directory `mmcv/ops/csrc` was refactored. Note that this refactoring does not affect the API usage. For related information, please refer to PR #1206.
The original directory was organized as follows.
```
.
├── common_cuda_helper.hpp
├── ops_cuda_kernel.cuh
├── pytorch_cpp_helper.hpp
├── pytorch_cuda_helper.hpp
├── parrots_cpp_helper.hpp
├── parrots_cuda_helper.hpp
├── parrots_cudawarpfunction.cuh
├── onnxruntime
│   ├── onnxruntime_register.h
│   ├── onnxruntime_session_options_config_keys.h
│   ├── ort_mmcv_utils.h
│   ├── ...
│   ├── onnx_ops.h
│   └── cpu
│       ├── onnxruntime_register.cpp
│       ├── ...
│       └── onnx_ops_impl.cpp
├── parrots
│   ├── ...
│   ├── ops.cpp
│   ├── ops_cuda.cu
│   ├── ops_parrots.cpp
│   └── ops_pytorch.h
├── pytorch
│   ├── ...
│   ├── ops.cpp
│   ├── ops_cuda.cu
│   └── pybind.cpp
└── tensorrt
    ├── trt_cuda_helper.cuh
    ├── trt_plugin_helper.hpp
    ├── trt_plugin.hpp
    ├── trt_serialize.hpp
    ├── ...
    ├── trt_ops.hpp
    └── plugins
        ├── trt_cuda_helper.cu
        ├── trt_plugin.cpp
        ├── ...
        ├── trt_ops.cpp
        └── trt_ops_kernel.cu
```
After the refactoring, it is organized as follows.
```
.
├── common
│   ├── box_iou_rotated_utils.hpp
│   ├── parrots_cpp_helper.hpp
│   ├── parrots_cuda_helper.hpp
│   ├── pytorch_cpp_helper.hpp
│   ├── pytorch_cuda_helper.hpp
│   └── cuda
│       ├── common_cuda_helper.hpp
│       ├── parrots_cudawarpfunction.cuh
│       ├── ...
│       └── ops_cuda_kernel.cuh
├── onnxruntime
│   ├── onnxruntime_register.h
│   ├── onnxruntime_session_options_config_keys.h
│   ├── ort_mmcv_utils.h
│   ├── ...
│   ├── onnx_ops.h
│   └── cpu
│       ├── onnxruntime_register.cpp
│       ├── ...
│       └── onnx_ops_impl.cpp
├── parrots
│   ├── ...
│   ├── ops.cpp
│   ├── ops_parrots.cpp
│   └── ops_pytorch.h
├── pytorch
│   ├── info.cpp
│   ├── pybind.cpp
│   ├── ...
│   ├── ops.cpp
│   └── cuda
│       ├── ...
│       └── ops_cuda.cu
└── tensorrt
    ├── trt_cuda_helper.cuh
    ├── trt_plugin_helper.hpp
    ├── trt_plugin.hpp
    ├── trt_serialize.hpp
    ├── ...
    ├── trt_ops.hpp
    └── plugins
        ├── trt_cuda_helper.cu
        ├── trt_plugin.cpp
        ├── ...
        ├── trt_ops.cpp
        └── trt_ops_kernel.cu
```