[arXiv:2204.12463](https://arxiv.org/abs/2204.12463)
# Focal Sparse Convolutional Networks for 3D Object Detection (CVPR 2022, Oral)

This is the official implementation of ***Focals Conv*** (CVPR 2022), a new sparse convolution design for 3D object detection that is applicable to both lidar-only and multi-modal settings. For more details, please refer to:
**Focal Sparse Convolutional Networks for 3D Object Detection [[Paper](https://arxiv.org/abs/2204.12463)]** <br />
Yukang Chen, Yanwei Li, Xiangyu Zhang, Jian Sun, Jiaya Jia<br />
<p align="center"> <img src="docs/imgs/FocalSparseConv23D.png" width="100%"> </p>
<p align="center"> <img src="docs/imgs/FocalSparseConv_Pipeline.png" width="100%"> </p>
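As a loose illustration of the core idea shown above (not the authors' implementation, which lives in the spconv-based code in this repo): each active voxel gets a predicted importance score, and high-scoring "focal" voxels are allowed to dilate their outputs like a regular sparse convolution, while the rest keep submanifold behavior (output only at the input position). A minimal numpy sketch; the function name, the single-score simplification, and the fixed threshold are our own assumptions:

```python
import numpy as np

def focal_sparse_conv_selection(coords, feats, w_imp, tau=0.5):
    """Toy sketch of focal voxel selection: predict an importance score per
    active voxel; voxels above the threshold dilate their output positions,
    the rest behave like submanifold convolution."""
    # Predicted importance: here a single sigmoid score per voxel.
    scores = 1.0 / (1.0 + np.exp(-(feats @ w_imp)))
    offsets = [np.array(o) for o in
               [(0, 0, 0), (1, 0, 0), (-1, 0, 0),
                (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]]
    out_coords = []
    for c, s in zip(coords, scores):
        out_coords.append(tuple(c))          # submanifold output position
        if s > tau:                          # focal voxel: dilate outputs
            for off in offsets[1:]:
                out_coords.append(tuple(np.array(c) + off))
    return sorted(set(out_coords)), scores
```

In the real model the importance is a learned "cubic" map over the kernel neighborhood and is trained end-to-end; this sketch only conveys the dynamic output-shape behavior.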
### Experimental results
#### KITTI dataset
| | Car@R11 | Car@R40 | download |
|---------------------------------------------|-------:|:-------:|:---------:|
| [PV-RCNN + Focals Conv](OpenPCDet/tools/cfgs/kitti_models/pv_rcnn_focal_lidar.yaml) | 83.91 | 85.20 | [Google](https://drive.google.com/file/d/1XOpIzHKtkEj9BNrQR6VYADO_T5yaOiJq/view?usp=sharing) \| [Baidu](https://pan.baidu.com/s/1t1Gk8bDv8Q_Dd5vB4VtChA) (key: m15b) |
| [PV-RCNN + Focals Conv (multimodal)](OpenPCDet/tools/cfgs/kitti_models/pv_rcnn_focal_multimodal.yaml) | 84.58 | 85.34 | [Google](https://drive.google.com/file/d/183araPcEmYSlruife2nszKeJv1KH2spg/view?usp=sharing) \| [Baidu](https://pan.baidu.com/s/10XodrSazMFDFnTRdKIfbKA) (key: ie6n) |
| [Voxel R-CNN (Car) + Focals Conv (multimodal)](OpenPCDet/tools/cfgs/kitti_models/voxel_rcnn_car_focal_multimodal.yaml) | 85.68 | 86.00 | [Google](https://drive.google.com/file/d/1M7IUosz4q4qHKEZeRLIIBQ6Wj1-0Wjdg/view?usp=sharing) \| [Baidu](https://pan.baidu.com/s/1bIN3zDmPXrURMOPg7pukzA) (key: tnw9) |
#### nuScenes dataset
| | mAP | NDS | download |
|---------------------------------------------|----------:|:-------:|:---------:|
| [CenterPoint + Focals Conv (multi-modal)](CenterPoint/configs/nusc/voxelnet/nusc_centerpoint_voxelnet_0075voxel_fix_bn_z_focal_multimodal.py) | 63.86 | 69.41 | [Google](https://drive.google.com/file/d/12VXMl6RQcz87OWPxXJsB_Nb0MdimsTiG/view?usp=sharing) \| [Baidu](https://pan.baidu.com/s/1ZXn-fhmeL6AsveV2G3n5Jg) (key: 01jh) |
| [CenterPoint + Focals Conv (multi-modal) - 1/4 data](CenterPoint/configs/nusc/voxelnet/nusc_centerpoint_voxelnet_0075voxel_fix_bn_z_focal_multimodal_1_4_data.py) | 62.15 | 67.45 | [Google](https://drive.google.com/file/d/1HC3nTEE8GVhInquwRd9hRJPSsZZylR58/view?usp=sharing) \| [Baidu](https://pan.baidu.com/s/1tKlO4GgzjXojzjzpoJY_Ng) (key: 6qsc) |
Visualization of the voxel distribution of Focals Conv on the KITTI val set:
<p align="center"> <img src="docs/imgs/Sparsity_comparison_3pairs.png" width="100%"> </p>
## Getting Started
### Installation
#### a. Clone this repository
```shell
git clone https://github.com/dvlab-research/FocalsConv && cd FocalsConv
```
#### b. Install the environment

Follow the installation documents for the [OpenPCDet](OpenPCDet/docs/INSTALL.md) and [CenterPoint](CenterPoint/docs/INSTALL.md) codebases respectively, depending on which one you use.

*spconv 2.x is highly recommended instead of spconv 1.x.
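If you want to fail fast in an spconv 1.x environment, a small check like the following works, since only spconv 2.x exposes the `spconv.pytorch` module (the helper name is our own):

```python
import importlib.util

def has_spconv2():
    """True if spconv 2.x is importable (it provides `spconv.pytorch`)."""
    try:
        return importlib.util.find_spec("spconv.pytorch") is not None
    except ModuleNotFoundError:
        # No `spconv` package at all.
        return False
```

spconv 1.x installs only the top-level `spconv` module, so this returns False there as well as when spconv is missing entirely.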
#### c. Prepare the datasets.

Download and organize the official [KITTI](OpenPCDet/docs/GETTING_STARTED.md) and [Waymo](OpenPCDet/docs/GETTING_STARTED.md) datasets following the documents in OpenPCDet, and the [nuScenes](CenterPoint/docs/NUSC.md) dataset following the CenterPoint codebase.
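For reference, the KITTI layout expected by OpenPCDet (per its GETTING_STARTED document) is roughly the following; treat the linked document as authoritative:

```
OpenPCDet
├── data
│   ├── kitti
│   │   ├── ImageSets
│   │   ├── training
│   │   │   ├── calib & velodyne & label_2 & image_2
│   │   ├── testing
│   │   │   ├── calib & velodyne & image_2
├── pcdet
├── tools
```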

*Note that for the nuScenes dataset, we use image-level gt-sampling (copy-paste) in the multi-modal training.
Please download this [dbinfos_train_10sweeps_withvelo.pkl](https://drive.google.com/file/d/1ypJKpZifM-NsGdUSLMFpBo-KaXlfpplR/view?usp=sharing) to replace the original one. ([Google](https://drive.google.com/file/d/1ypJKpZifM-NsGdUSLMFpBo-KaXlfpplR/view?usp=sharing) \| [Baidu](https://pan.baidu.com/s/1iz1KWthc1XhXG3du3QG__w) (key: b466))
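For intuition, gt-sampling (copy-paste) augments a scene by pasting object points drawn from such a database. A toy sketch, with our own simplifications (the real pipeline also performs collision checks and, in the multi-modal case, pastes the corresponding image patches):

```python
import numpy as np

def gt_sample_paste(scene_points, db_objects, num_paste=2, seed=0):
    """Toy gt-sampling: concatenate the points of randomly sampled database
    objects onto the scene point cloud (collision checks omitted)."""
    rng = np.random.default_rng(seed)
    k = min(num_paste, len(db_objects))
    idx = rng.choice(len(db_objects), size=k, replace=False)
    return np.concatenate([scene_points] + [db_objects[i] for i in idx], axis=0)
```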

*Note that for the nuScenes dataset, we conduct ablation studies on a 1/4 training-data split.
Please download [infos_train_mini_1_4_10sweeps_withvelo_filter_True.pkl](https://drive.google.com/file/d/19-Zo8o0OWZYed0UpnOfDqTY5oLXKJV9Q/view?usp=sharing) if you need it for training. ([Google](https://drive.google.com/file/d/19-Zo8o0OWZYed0UpnOfDqTY5oLXKJV9Q/view?usp=sharing) \| [Baidu](https://pan.baidu.com/s/1VbkNBs155JyJLhNtSlbEGQ) (key: 769e))
#### d. Download pre-trained models.
If you want to directly evaluate the trained models we provide, please download them first.

If you want to train by yourself, for multi-modal settings please first download this ResNet pre-trained model,
[torchvision-res50-deeplabv3](https://download.pytorch.org/models/deeplabv3_resnet50_coco-cd0a2569.pth).
### Evaluation

We provide trained weight files so you can run evaluation directly with them; you can also use models you trained yourself.

For models in OpenPCDet,
```shell
NUM_GPUS=8
cd tools
bash scripts/dist_test.sh ${NUM_GPUS} --cfg_file cfgs/kitti_models/voxel_rcnn_car_focal_multimodal.yaml --ckpt path/to/voxelrcnn_focal_multimodal.pth

bash scripts/dist_test.sh ${NUM_GPUS} --cfg_file cfgs/kitti_models/pv_rcnn_focal_multimodal.yaml --ckpt path/to/pvrcnn_focal_multimodal.pth

bash scripts/dist_test.sh ${NUM_GPUS} --cfg_file cfgs/kitti_models/pv_rcnn_focal_lidar.yaml --ckpt path/to/pvrcnn_focal_lidar.pth
```
For models in CenterPoint,
```shell
CONFIG="nusc_centerpoint_voxelnet_0075voxel_fix_bn_z_focal_multimodal"
python -m torch.distributed.launch --nproc_per_node=${NUM_GPUS} ./tools/dist_test.py configs/nusc/voxelnet/$CONFIG.py --work_dir ./work_dirs/$CONFIG --checkpoint centerpoint_focal_multimodal.pth
```
### Training
For configs in OpenPCDet,
```shell
bash scripts/dist_train.sh ${NUM_GPUS} --cfg_file cfgs/kitti_models/CONFIG.yaml
```
For configs in CenterPoint,
```shell
python -m torch.distributed.launch --nproc_per_node=${NUM_GPUS} ./tools/train.py configs/nusc/voxelnet/$CONFIG.py --work_dir ./work_dirs/$CONFIG
```
* Note that we use 8 GPUs to train OpenPCDet models and 4 GPUs to train CenterPoint models.
## TODO List
- [ ] Config files and trained models on the full Waymo dataset.
- [ ] Config files and scripts for test-time augmentations (double-flip and rotation) in the nuScenes test submission.
- [ ] Results and models of Focals Conv networks on 3D segmentation datasets.
## Citation
If you find this project useful in your research, please consider citing:
```
@inproceedings{focalsconv-chen,
  title={Focal Sparse Convolutional Networks for 3D Object Detection},
  author={Chen, Yukang and Li, Yanwei and Zhang, Xiangyu and Sun, Jian and Jia, Jiaya},
  booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition},
  year={2022}
}
```
## Acknowledgement
- This work is built upon `OpenPCDet` and `CenterPoint`. Please refer to the official GitHub repositories, [OpenPCDet](https://github.com/open-mmlab/OpenPCDet) and [CenterPoint](https://github.com/tianweiy/CenterPoint), for more information.
- This README follows the style of [IA-SSD](https://github.com/yifanzhang713/IA-SSD).
## License
This project is released under the [Apache 2.0 license](LICENSE).
## Related Repos
1. [spconv](https://github.com/traveller59/spconv)
2. [Deformable Conv](https://github.com/msracver/Deformable-ConvNets)
3. [Submanifold Sparse Conv](https://github.com/facebookresearch/SparseConvNet)
|