Update README.md
README.md
CHANGED
|
@@ -1,3 +1,204 @@
|
|
| 1 |
-
---
|
| 2 |
-
|
| 3 |
-
-
| 1 |
+
---
|
| 2 |
+
language:
|
| 3 |
+
- en
|
| 4 |
+
license: apache-2.0
|
| 5 |
+
tags:
|
| 6 |
+
- computer-vision
|
| 7 |
+
- image-matching
|
| 8 |
+
- overlap-detection
|
| 9 |
+
- feature-extraction
|
| 10 |
+
datasets:
|
| 11 |
+
- SSSSphinx/SCoDe
|
| 12 |
+
---
|
| 13 |
+
|
| 14 |
+
# SCoDe: Scale-aware Co-visible Region Detection for Image Matching
|
| 15 |
+
|
| 16 |
+
<div align="center">
|
| 17 |
+
|
| 18 |
+
[Paper](https://www.sciencedirect.com/science/article/abs/pii/S0924271625003260)
|
| 19 |
+
[DOI](https://doi.org/10.1016/j.isprsjprs.2025.08.015)
|
| 20 |
+
[Project Page](https://xupan.top/Projects/scode)
|
| 21 |
+
[Code](https://github.com/SSSSphinx/SCoDe)
|
| 22 |
+
|
| 23 |
+
</div>
|
| 24 |
+
|
| 25 |
+
## Overview
|
| 26 |
+
|
| 27 |
+
SCoDe is a scale-aware co-visible region detection model designed for robust image matching. It detects overlapping regions between image pairs while being invariant to scale variations, making it particularly effective for structure-from-motion and 3D reconstruction tasks.
|
| 28 |
+
|
| 29 |
+
This model is built upon the CCOE (Co-visible region detection with Overlap Estimation) architecture and has been trained on the MegaDepth dataset.
|
| 30 |
+
|
| 31 |
+
## Model Details
|
| 32 |
+
|
| 33 |
+
- **Architecture**: CCOE-based transformer with multi-scale attention
|
| 34 |
+
- **Backbone**: ResNet-50
|
| 35 |
+
- **Input Size**: 1024×1024 (configurable)
|
| 36 |
+
- **Training Dataset**: MegaDepth
|
| 37 |
+
- **Framework**: PyTorch
|
| 38 |
+
|
| 39 |
+
### Key Features
|
| 40 |
+
|
| 41 |
+
- Scale-aware overlap region detection
|
| 42 |
+
- Rotation-invariant matching capabilities
|
| 43 |
+
- End-to-end trainable pipeline
|
| 44 |
+
- Compatible with various feature extractors (SIFT, SuperPoint, D2-Net, R2D2, DISK)
|
| 45 |
+
|
| 46 |
+
## Usage
|
| 47 |
+
|
| 48 |
+
### Installation
|
| 49 |
+
|
| 50 |
+
```bash
|
| 51 |
+
pip install torch torchvision
|
| 52 |
+
git clone https://github.com/SSSSphinx/SCoDe.git
|
| 53 |
+
cd SCoDe
|
| 54 |
+
pip install -r requirements.txt
|
| 55 |
+
```
|
| 56 |
+
|
| 57 |
+
### Quick Start
|
| 58 |
+
|
| 59 |
+
```python
|
| 60 |
+
import torch
|
| 61 |
+
from src.config.default import get_cfg_defaults
|
| 62 |
+
from src.model import CCOE
|
| 63 |
+
|
| 64 |
+
# Load configuration
|
| 65 |
+
cfg = get_cfg_defaults()
|
| 66 |
+
cfg.merge_from_file('configs/scode_config.py')
|
| 67 |
+
|
| 68 |
+
# Initialize model
|
| 69 |
+
device = 'cuda' if torch.cuda.is_available() else 'cpu'
|
| 70 |
+
model = CCOE(cfg.CCOE).eval().to(device)
|
| 71 |
+
|
| 72 |
+
# Load pre-trained weights
|
| 73 |
+
model.load_state_dict(torch.load('weights/scode.pth', map_location=device))
|
| 74 |
+
|
| 75 |
+
# Model is ready for inference
|
| 76 |
+
with torch.no_grad():
|
| 77 |
+
# Process image pair (example)
|
| 78 |
+
image1 = torch.randn(1, 3, 1024, 1024).to(device)
|
| 79 |
+
image2 = torch.randn(1, 3, 1024, 1024).to(device)
|
| 80 |
+
output = model({'image1': image1, 'image2': image2})
|
| 81 |
+
```
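
The structure of `output` is not documented in this README. As a sketch only, assuming the prediction can be reduced to one co-visible box per image in `(x1, y1, x2, y2)` pixel coordinates, keypoints detected on a cropped and resized region could be mapped back to the full image as shown here. The helper `crop_to_image_coords` and the box format are hypothetical and not part of the SCoDe API.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # hypothetical (x1, y1, x2, y2) format
Point = Tuple[float, float]


def crop_to_image_coords(keypoints: List[Point], box: Box,
                         crop_size: Tuple[int, int]) -> List[Point]:
    """Map keypoints from a resized crop back to original image coordinates.

    crop_size is the (width, height) of the resized crop on which the
    keypoints were detected.
    """
    x1, y1, x2, y2 = box
    crop_w, crop_h = crop_size
    # Scale factors from resized-crop pixels to original-image pixels.
    sx = (x2 - x1) / crop_w
    sy = (y2 - y1) / crop_h
    return [(x1 + kx * sx, y1 + ky * sy) for kx, ky in keypoints]


# A 200x100 box resized to 400x400: the crop centre maps to the box centre.
restored = crop_to_image_coords([(200.0, 200.0)],
                                (100.0, 50.0, 300.0, 150.0),
                                (400, 400))
print(restored)
```

Mapping coordinates back this way lets matches found inside the co-visible regions be used with the original camera intrinsics.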
|
| 82 |
+
|
| 83 |
+
### Training
|
| 84 |
+
|
| 85 |
+
```bash
|
| 86 |
+
# Single GPU training
|
| 87 |
+
python train_scode.py --num_workers 4 --epoch 15 --batch_size 4 --validation --learning_rate 1e-5
|
| 88 |
+
|
| 89 |
+
# Multi-GPU distributed training (4 GPUs)
|
| 90 |
+
python -m torch.distributed.launch --nproc_per_node 4 --master_port=29501 train_scode.py \
|
| 91 |
+
--num_workers 4 --epoch 15 --batch_size 4 --validation --learning_rate 1e-5
|
| 92 |
+
```
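
To compare the two launch commands, it can help to estimate the number of optimizer steps per epoch. The sketch below assumes that `--batch_size` is applied per process and that one epoch covers `cfg.DATASET.TRAIN.PAIRS_LENGTH = 128000` pairs; neither assumption is confirmed by this README, and `iterations_per_epoch` is a hypothetical helper.

```python
import math


def iterations_per_epoch(pairs_length: int, batch_size: int, num_gpus: int = 1) -> int:
    """Steps per epoch, assuming batch_size is the per-process batch size."""
    effective_batch = batch_size * num_gpus
    return math.ceil(pairs_length / effective_batch)


single_gpu = iterations_per_epoch(128000, batch_size=4)             # one process
four_gpus = iterations_per_epoch(128000, batch_size=4, num_gpus=4)  # four processes
print(single_gpu, four_gpus)
```

Under these assumptions the four-GPU run uses an effective batch of 16 pairs, so the learning rate may need retuning relative to the single-GPU setting.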
|
| 93 |
+
|
| 94 |
+
### Evaluation
|
| 95 |
+
|
| 96 |
+
#### Rotation Invariance Evaluation
|
| 97 |
+
```bash
|
| 98 |
+
python rot_inv_eval.py \
|
| 99 |
+
--extractors superpoint d2net r2d2 disk \
|
| 100 |
+
--image_pairs path/to/image/pairs \
|
| 101 |
+
--output_dir outputs/scode_rot_eval
|
| 102 |
+
```
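
The internals of `rot_inv_eval.py` are not shown here. As a minimal illustration of the geometry that any rotation test relies on, the sketch below rotates a pixel coordinate about an image centre with the standard 2D rotation matrix. The 30° angle step and the helper `rotate_about` are hypothetical choices made for this example.

```python
import math
from typing import Tuple

Point = Tuple[float, float]


def rotate_about(point: Point, centre: Point, angle_deg: float) -> Point:
    """Apply the standard 2D rotation matrix to `point` about `centre`."""
    theta = math.radians(angle_deg)
    dx, dy = point[0] - centre[0], point[1] - centre[1]
    c, s = math.cos(theta), math.sin(theta)
    return (centre[0] + c * dx - s * dy, centre[1] + s * dx + c * dy)


# Hypothetical sweep over the full circle in 30-degree steps.
angles = list(range(0, 360, 30))
centre = (512.0, 512.0)  # centre of a 1024x1024 image
rotated = [rotate_about((612.0, 512.0), centre, a) for a in angles]
print(rotated[3])  # the point after a 90-degree rotation
```

Ground-truth keypoints or co-visible boxes can be transformed the same way so that predictions on rotated inputs are scored against correctly rotated references.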
|
| 103 |
+
|
| 104 |
+
#### Pose Estimation Evaluation
|
| 105 |
+
```bash
|
| 106 |
+
python eval_pose_estimation.py \
|
| 107 |
+
--results_dir outputs/megadepth_results \
|
| 108 |
+
--dataset megadepth
|
| 109 |
+
```
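
Pose estimation on MegaDepth-style benchmarks is commonly summarized as the area under the recall curve (AUC) of the pose error at a few angular thresholds. Whether `eval_pose_estimation.py` reports exactly this metric is not stated in this README, so the following is a generic sketch; `pose_auc` is a hypothetical helper and the error values are made up for illustration.

```python
from bisect import bisect_left
from typing import Dict, Sequence


def pose_auc(errors: Sequence[float], thresholds: Sequence[float]) -> Dict[float, float]:
    """Normalized area under the recall-vs-error curve up to each threshold."""
    n = len(errors)
    # Prepend the origin so the curve starts at (0 error, 0 recall).
    sorted_errors = [0.0] + sorted(errors)
    recall = [0.0] + [(i + 1) / n for i in range(n)]
    aucs = {}
    for t in thresholds:
        last = bisect_left(sorted_errors, t)
        # Clip the curve at the threshold, holding recall flat up to t.
        xs = sorted_errors[:last] + [t]
        ys = recall[:last] + [recall[last - 1]]
        # Trapezoidal integration.
        area = sum((xs[i + 1] - xs[i]) * (ys[i + 1] + ys[i]) / 2
                   for i in range(len(xs) - 1))
        aucs[t] = area / t
    return aucs


# Illustrative pose errors in degrees (not real results).
aucs = pose_auc([1.0, 2.0, 3.0, 4.0], thresholds=[2.0, 5.0])
print(aucs)
```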
|
| 110 |
+
|
| 111 |
+
#### Radar Evaluation
|
| 112 |
+
```bash
|
| 113 |
+
python eval_radar.py \
|
| 114 |
+
--results_dir outputs/radar_results
|
| 115 |
+
```
|
| 116 |
+
|
| 117 |
+
## Configuration
|
| 118 |
+
|
| 119 |
+
Main configuration files:
|
| 120 |
+
- [`configs/scode_config.py`](configs/scode_config.py) - SCoDe model configuration
|
| 121 |
+
- [`src/config/default.py`](src/config/default.py) - Default configuration template
|
| 122 |
+
|
| 123 |
+
### Key Parameters
|
| 124 |
+
|
| 125 |
+
```python
|
| 126 |
+
# Training
|
| 127 |
+
cfg.DATASET.TRAIN.IMAGE_SIZE = [1024, 1024]
|
| 128 |
+
cfg.DATASET.TRAIN.BATCH_SIZE = 4
|
| 129 |
+
cfg.DATASET.TRAIN.PAIRS_LENGTH = 128000
|
| 130 |
+
|
| 131 |
+
# Validation
|
| 132 |
+
cfg.DATASET.VAL.IMAGE_SIZE = [1024, 1024]
|
| 133 |
+
|
| 134 |
+
# Model
|
| 135 |
+
cfg.CCOE.BACKBONE.NUM_LAYERS = 50
|
| 136 |
+
cfg.CCOE.BACKBONE.STRIDE = 32
|
| 137 |
+
cfg.CCOE.CCA.DEPTH = [2, 2, 2, 2]
|
| 138 |
+
cfg.CCOE.CCA.NUM_HEADS = [8, 8, 8, 8]
|
| 139 |
+
```
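
These values determine how many spatial locations the attention layers operate on. The sketch below assumes that `cfg.CCOE.BACKBONE.STRIDE` is the total downsampling factor of the backbone, which this README does not state explicitly; `feature_map_shape` is a hypothetical helper.

```python
from typing import Sequence, Tuple


def feature_map_shape(image_size: Sequence[int], stride: int) -> Tuple[int, int]:
    """Spatial size of the backbone output for an input of image_size (H, W)."""
    h, w = image_size
    if h % stride or w % stride:
        raise ValueError("image size should be divisible by the stride")
    return (h // stride, w // stride)


fh, fw = feature_map_shape([1024, 1024], stride=32)
num_locations = fh * fw
print(fh, fw, num_locations)
```

Under this assumption a 1024×1024 input yields a 32×32 feature map, so changing the input size changes the attention sequence length quadratically with the side length.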
|
| 140 |
+
|
| 141 |
+
## Dataset
|
| 142 |
+
|
| 143 |
+
The model is trained on the [MegaDepth](https://github.com/zhengqili/MegaDepth) dataset with scale-aware pair generation.
|
| 144 |
+
|
| 145 |
+
Dataset preparation:
|
| 146 |
+
```bash
|
| 147 |
+
python dataset_preparation.py \
|
| 148 |
+
--base_path dataset/megadepth/MegaDepth \
|
| 149 |
+
--num_per_scene 5000
|
| 150 |
+
```
|
| 151 |
+
|
| 152 |
+
Validation pairs are automatically generated and evaluated during training.
|
| 153 |
+
|
| 154 |
+
## Model Performance
|
| 155 |
+
|
| 156 |
+
SCoDe demonstrates strong performance on:
|
| 157 |
+
- **Rotation Invariance**: Robust to image rotations across the full 360° range
|
| 158 |
+
- **Scale Invariance**: Effective across multiple image scales
|
| 159 |
+
- **Pose Estimation**: Improved camera pose estimation on MegaDepth benchmark
|
| 160 |
+
- **Feature Matching**: Enhanced matching accuracy with various feature extractors
|
| 161 |
+
|
| 162 |
+
## Supported Feature Extractors
|
| 163 |
+
|
| 164 |
+
SCoDe can be paired with the following feature extractors:
|
| 165 |
+
- SIFT (with brute-force matcher)
|
| 166 |
+
- SuperPoint (with NN matcher)
|
| 167 |
+
- D2-Net
|
| 168 |
+
- R2D2
|
| 169 |
+
- DISK
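
The brute-force and NN matchers mentioned in this list pair descriptors by nearest-neighbour search. As a generic illustration, and not the repository's implementation, the sketch below performs mutual nearest-neighbour matching on small descriptor lists in pure Python; the function names and toy descriptors are hypothetical.

```python
from typing import List, Sequence, Tuple

Descriptor = Sequence[float]


def squared_distance(a: Descriptor, b: Descriptor) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mutual_nearest_neighbours(desc_a: List[Descriptor],
                              desc_b: List[Descriptor]) -> List[Tuple[int, int]]:
    """Return index pairs (i, j) where a[i] and b[j] are each other's nearest neighbour."""
    if not desc_a or not desc_b:
        return []
    # Nearest neighbour in B for every descriptor in A, and vice versa.
    nn_ab = [min(range(len(desc_b)), key=lambda j: squared_distance(a, desc_b[j]))
             for a in desc_a]
    nn_ba = [min(range(len(desc_a)), key=lambda i: squared_distance(b, desc_a[i]))
             for b in desc_b]
    # Keep only pairs that agree in both directions.
    return [(i, j) for i, j in enumerate(nn_ab) if nn_ba[j] == i]


desc_a = [(0.0, 0.0), (1.0, 1.0), (0.3, 0.0)]
desc_b = [(1.1, 0.9), (0.1, 0.0)]
matches = mutual_nearest_neighbours(desc_a, desc_b)
print(matches)
```

The third descriptor in `desc_a` is dropped because its nearest neighbour in `desc_b` prefers a different descriptor. Restricting detection to co-visible regions is intended to reduce such ambiguous candidates before this step.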
|
| 170 |
+
|
| 171 |
+
## Citation
|
| 172 |
+
|
| 173 |
+
If you find this project useful in your research, please cite our paper:
|
| 174 |
+
|
| 175 |
+
```bibtex
|
| 176 |
+
@article{pan2025scale,
|
| 177 |
+
title={Scale-aware co-visible region detection for image matching},
|
| 178 |
+
author={Pan, Xu and Xia, Zimin and Zheng, Xianwei},
|
| 179 |
+
journal={ISPRS Journal of Photogrammetry and Remote Sensing},
|
| 180 |
+
volume={229},
|
| 181 |
+
pages={122--137},
|
| 182 |
+
year={2025},
|
| 183 |
+
publisher={Elsevier}
|
| 184 |
+
}
|
| 185 |
+
```
|
| 186 |
+
|
| 187 |
+
## License
|
| 188 |
+
|
| 189 |
+
This project is licensed under the Apache-2.0 License. See the LICENSE file for details.
|
| 190 |
+
|
| 191 |
+
## Acknowledgments
|
| 192 |
+
|
| 193 |
+
- [MegaDepth](https://github.com/zhengqili/MegaDepth) - Dataset and benchmarks
|
| 194 |
+
- [OETR](https://github.com/TencentYoutuResearch/ImageMatching-OETR) - Model initialization strategies
|
| 195 |
+
- PyTorch team for the excellent framework
|
| 196 |
+
|
| 197 |
+
## Contact
|
| 198 |
+
|
| 199 |
+
For questions or issues, please visit the [GitHub repository](https://github.com/SSSSphinx/SCoDe) or contact the authors.
|
| 200 |
+
|
| 201 |
+
---
|
| 202 |
+
|
| 203 |
+
**Paper**: [Scale-aware Co-visible Region Detection for Image Matching](https://www.sciencedirect.com/science/article/abs/pii/S0924271625003260)
|
| 204 |
+
**Project Page**: [https://xupan.top/Projects/scode](https://xupan.top/Projects/scode)
|