SRPose / README.md
FrickYinn's picture
Update README.md
ea7acab verified
---
license: apache-2.0
language:
- en
---
# SRPose: Two-view Relative Pose Estimation with Sparse Keypoints
**SRPose**: A **S**parse keypoint-based framework for **R**elative **Pose** estimation between two views in both camera-to-world and object-to-camera scenarios.
| Reference | Query | Ground Truth |
|:--------:|:---------:|:--------:|
| ![](assets/figures/scene5_vis_0.png) | ![](assets/figures/scene5_vis_1.png) | ![](assets/figures/scene5_vis_gt.png) |
| ![](assets/figures/obj_vis_reference_labeled.png) | ![](assets/figures/obj_vis_query.png) |![](assets/figures/obj_vis_gt.png)|
## [Project page](https://frickyinn.github.io/srpose/) | [arXiv](https://arxiv.org/abs/2407.08199)
## Setup
Please first intall PyTorch according to [here](https://pytorch.org/get-started/locally/), then install other dependencies using pip:
```
cd SRPose
pip install -r requirements.txt
```
## Evaluation
1. Download pretrained models [here](https://drive.google.com/drive/folders/1bBlds3UX7-XDCevbIl4bnnywvWzzP5nN) for evaluation.
2. Create new folders:
```
mkdir checkpoints & mkdir data
```
3. Organize the downloaded checkpoints like this:
```
SRPose
|-- checkpoints
|-- ho3d.ckpt
|-- linemod.ckpt
|-- mapfree.ckpt
|-- matterport.ckpt
|-- megadepth.ckpt
`-- scannet.ckpt
...
```
### Matterport
1. Download Matterport dataset [here](https://github.com/jinlinyi/SparsePlanes/blob/main/docs/data.md), only `mp3d_planercnn_json.zip` and `rgb.zip` are required.
2. Unzip and organize the downloaded files:
```
mkdir data/mp3d
mkdir data/mp3d/mp3d_planercnn_json & mkdir data/mp3d/rgb
unzip <pathto>/mp3d_planercnn_json.zip -d data/mp3d/mp3d_planercnn_json
unzip <pathto>/rgb.zip -d data/mp3d/rgb
```
3. The resulted directory tree should be like this:
```
SRPose
|-- data
|-- mp3d
|-- mp3d_planercnn_json
| |-- cached_set_test.json
| |-- cached_set_train.json
| `-- cached_set_val.json
`-- rgb
|-- 17DRP5sb8fy
...
...
...
```
4. Evaluate with the following command:
```
python eval.py configs/matterport.yaml checkpoints/matterport.ckpt
```
### ScanNet & MegaDepth
1. Download and organize the ScanNet-1500 and MegaDepth-1500 test sets according to the [LoFTR Training Script](https://github.com/zju3dv/LoFTR/blob/master/docs/TRAINING.md). Note that only the test sets and the dataset indices are required.
2. The resulted directory tree should be:
```
SRPose
|-- data
|-- scannet
| |-- index
| |-- test
| `-- train (optional)
|-- megadepth
|-- index
|-- test
`-- train (optional)
...
...
```
3. Evaluate with the following commands:
```
python eval.py configs/scannet.yaml checkpoints/scannet.ckpt
python eval.py configs/megadepth.yaml checkpoints/megedepth.ckpt
```
### HO3D
1. Download HO3D (version 3) dataset [here](https://www.tugraz.at/institute/icg/research/team-lepetit/research-projects/hand-object-3d-pose-annotation/), `HO3D_v3.zip` and `HO3D_v3_segmentations_rendered.zip` are required.
2. Unzip and organize the downloaded files:
```
mkdir data/ho3d
unzip <pathto>/HO3D_v3.zip -d data/ho3d
unzip <pathto>/HO3D_v3_segmentations_rendered.zip -d data/ho3d
```
3. Evaluate with the following commands:
```
python eval.py configs/ho3d.yaml checkpoints/ho3d.ckpt
python eval_add_reproj.py configs/ho3d.yaml checkpoints/ho3d.ckpt
```
### Linemod
1. Download Linemod dataset [here](https://bop.felk.cvut.cz/datasets/) or run the following commands:
```
cd data
export SRC=https://bop.felk.cvut.cz/media/data/bop_datasets
wget $SRC/lm_base.zip # Base archive with dataset info, camera parameters, etc.
wget $SRC/lm_models.zip # 3D object models.
wget $SRC/lm_test_all.zip # All test images ("_bop19" for a subset used in the BOP Challenge 2019/2020).
wget $SRC/lm_train_pbr.zip # PBR training images (rendered with BlenderProc4BOP).
unzip lm_base.zip # Contains folder "lm".
unzip lm_models.zip -d lm # Unpacks to "lm".
unzip lm_test_all.zip -d lm # Unpacks to "lm".
unzip lm_train_pbr.zip -d lm # Unpacks to "lm".
```
2. Evaluate with the following commands:
```
python eval.py configs/linemod.yaml checkpoints/linemod.ckpt
python eval_add_reproj.py configs/linemod.yaml checkpoints/linemod.ckpt
```
### Niantic
1. Download Niantic dataset [here](https://research.nianticlabs.com/mapfree-reloc-benchmark/dataset).
2. Unzip and organize the downloaded files:
```
mkdir data/mapfree
unzip <pathto>/train.zip -d data/mapfree
unzip <pathto>/val.zip -d data/mapfree
unzip <pathto>/test.zip -d data/mapfree
```
3. The ground truth of the test set is not publicly available, but you can run the following command to produce a new submission file and submit it on the [project page](https://research.nianticlabs.com/mapfree-reloc-benchmark/submit) for evaluation:
```
python eval_add_reproj.py configs/mapfree.yaml checkpoints/mapfree.ckpt
```
You should be able to find a `new_submission.zip` in `SRPose/assets/` afterwards, or you can submit the already produced file `SRPose/assets/mapfree_submission.zip` instead.
## Training
Download and organize the datasets following [Evaluation](#evaluation), then run the following command for training:
```
python train.py configs/<dataset>.yaml
```
Please refer to the `.yaml` files in `SRPose/configs/` for detailed configurations.
## Baselines
We also offer two publicly available matcher-based baselines, [LightGlue](https://github.com/cvg/LightGlue) and [LoFTR](https://github.com/zju3dv/LoFTR), for evaluation and comparison.
Just run the following commands:
```
# For Matterport, ScanNet and MegaDepth
python eval_baselines.py configs/<dataset>.yaml lightglue
python eval_baselines.py configs/<dataset>.yaml loftr
# For HO3D and Linemod
python eval_baselines.py configs/<dataset>.yaml lightglue --resize 640 --depth
python eval_baselines.py configs/<dataset>.yaml loftr --resize 640 --depth
```
The `--resize xx` option controls the larger dimension of cropped target object images that will be resized to.
The `--depth` option controls whether the depth maps will be used to obtain scaled pose estimation.
## Acknowledgements
In this repository, we have used codes from the following repositories. We thank all the authors for sharing great codes.
- [LightGlue](https://github.com/cvg/LightGlue)
- [LoFTR](https://github.com/zju3dv/LoFTR)
- [8point](https://github.com/crockwell/rel_pose)
- [SparsePlanes](https://github.com/jinlinyi/SparsePlanes/tree/main)
- [Map-free](https://github.com/nianticlabs/map-free-reloc/tree/main)
## Citation
```
@inproceedings{yin2024srpose,
title={SRPose: Two-view Relative Pose Estimation with Sparse Keypoints},
author={Yin, Rui and Zhang, Yulun and Pan, Zherong and Zhu, Jianjun and Wang, Cheng and Jia, Biao},
booktitle={ECCV},
year={2024}
}
```