---
language:
- en
license: apache-2.0
tags:
- computer-vision
- image-matching
- overlap-detection
- feature-extraction
datasets:
- SSSSphinx/SCoDe
---

# SCoDe: Scale-aware Co-visible Region Detection for Image Matching

<div align="center">

[![Paper](https://img.shields.io/badge/Paper-ScienceDirect-green)](https://www.sciencedirect.com/science/article/abs/pii/S0924271625003260)
[![DOI](https://img.shields.io/badge/DOI-10.1016%2Fj.isprsjprs.2025.08.015-orange)](https://doi.org/10.1016/j.isprsjprs.2025.08.015)
[![Project Page](https://img.shields.io/badge/Project-Website-blue)](https://xupan.top/Projects/scode)
[![GitHub](https://img.shields.io/badge/Code-GitHub-black)](https://github.com/SSSSphinx/SCoDe)

</div>

## Overview

SCoDe is a scale-aware co-visible region detection model designed for robust image matching. It detects overlapping regions between image pairs while remaining invariant to scale variations, making it particularly effective for structure-from-motion and 3D reconstruction tasks.

This model is built upon the CCOE (Co-visible region detection with Overlap Estimation) architecture and has been trained on the MegaDepth dataset.

## Model Details

- **Architecture**: CCOE-based transformer with multi-scale attention
- **Backbone**: ResNet-50
- **Input Size**: 1024×1024 (configurable)
- **Training Dataset**: MegaDepth
- **Framework**: PyTorch

### Key Features

- Scale-aware overlap region detection
- Rotation-invariant matching capabilities
- End-to-end trainable pipeline
- Compatible with various feature extractors (SIFT, SuperPoint, D2-Net, R2D2, DISK)
+
46
+ ## Usage
47
+
48
+ ### Installation
49
+
50
+ ```bash
51
+ pip install torch torchvision
52
+ git clone https://github.com/SSSSphinx/SCoDe.git
53
+ cd SCoDe
54
+ pip install -r requirements.txt
55
+ ```
56
+
57
+ ### Quick Start
58
+
59
+ ```python
60
+ import torch
61
+ from src.config.default import get_cfg_defaults
62
+ from src.model import CCOE
63
+
64
+ # Load configuration
65
+ cfg = get_cfg_defaults()
66
+ cfg.merge_from_file('configs/scode_config.py')
67
+
68
+ # Initialize model
69
+ device = 'cuda' if torch.cuda.is_available() else 'cpu'
70
+ model = CCOE(cfg.CCOE).eval().to(device)
71
+
72
+ # Load pre-trained weights
73
+ model.load_state_dict(torch.load('weights/scode.pth', map_location=device))
74
+
75
+ # Model is ready for inference
76
+ with torch.no_grad():
77
+ # Process image pair (example)
78
+ image1 = torch.randn(1, 3, 1024, 1024).to(device)
79
+ image2 = torch.randn(1, 3, 1024, 1024).to(device)
80
+ output = model({'image1': image1, 'image2': image2})
81
+ ```
82
+
83
+ ### Training
84
+
85
+ ```bash
86
+ # Single GPU training
87
+ python train_scode.py --num_workers 4 --epoch 15 --batch_size 4 --validation --learning_rate 1e-5
88
+
89
+ # Multi-GPU distributed training (4 GPUs)
90
+ python -m torch.distributed.launch --nproc_per_node 4 --master_port=29501 train_scode.py \
91
+ --num_workers 4 --epoch 15 --batch_size 4 --validation --learning_rate 1e-5
92
+ ```
93
+
94
+ ### Evaluation
95
+
96
+ #### Rotation Invariance Evaluation
97
+ ```bash
98
+ python rot_inv_eval.py \
99
+ --extractors superpoint d2net r2d2 disk \
100
+ --image_pairs path/to/image/pairs \
101
+ --output_dir outputs/scode_rot_eval
102
+ ```
103
+
104
+ #### Pose Estimation Evaluation
105
+ ```bash
106
+ python eval_pose_estimation.py \
107
+ --results_dir outputs/megadepth_results \
108
+ --dataset megadepth
109
+ ```
110
+
111
+ #### Radar Evaluation
112
+ ```bash
113
+ python eval_radar.py \
114
+ --results_dir outputs/radar_results
115
+ ```
116
+
117
+ ## Configuration
118
+
119
+ Main configuration files:
120
+ - [`configs/scode_config.py`](configs/scode_config.py) - SCoDe model configuration
121
+ - [`src/config/default.py`](src/config/default.py) - Default configuration template
122
+
123
+ ### Key Parameters
124
+
125
+ ```python
126
+ # Training
127
+ cfg.DATASET.TRAIN.IMAGE_SIZE = [1024, 1024]
128
+ cfg.DATASET.TRAIN.BATCH_SIZE = 4
129
+ cfg.DATASET.TRAIN.PAIRS_LENGTH = 128000
130
+
131
+ # Validation
132
+ cfg.DATASET.VAL.IMAGE_SIZE = [1024, 1024]
133
+
134
+ # Model
135
+ cfg.CCOE.BACKBONE.NUM_LAYERS = 50
136
+ cfg.CCOE.BACKBONE.STRIDE = 32
137
+ cfg.CCOE.CCA.DEPTH = [2, 2, 2, 2]
138
+ cfg.CCOE.CCA.NUM_HEADS = [8, 8, 8, 8]
139
+ ```
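With a backbone stride of 32 and 1024×1024 inputs, the attention layers operate on a 32×32 feature grid (1024 tokens). This back-of-the-envelope relation is worth keeping in mind when changing `IMAGE_SIZE` or `STRIDE`, since attention cost grows with the square of the token count:

```python
# Sketch: the input-size / stride relationship implied by the parameters above.
# This is simple arithmetic, not a call into the repository's code.

def feature_grid(image_size, stride):
    """Spatial size of the backbone feature map for a given stride."""
    h, w = image_size
    return h // stride, w // stride

print(feature_grid((1024, 1024), 32))  # (32, 32) -> 1024 tokens per image
```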

## Dataset

The model is trained on the [MegaDepth](https://github.com/zhengqili/MegaDepth) dataset with scale-aware pair generation.

Dataset preparation:

```bash
python dataset_preparation.py \
    --base_path dataset/megadepth/MegaDepth \
    --num_per_scene 5000
```

Validation pairs are automatically generated and evaluated during training.
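One plausible reading of "scale-aware pair generation" is bucketing image pairs by the scale ratio between the two views, so training sees a controlled distribution of scale differences. The area-based ratio and the bucket thresholds below are illustrative assumptions, not the repository's actual procedure:

```python
# Sketch of scale-ratio bucketing for training pairs. The ratio definition and
# the bucket edges are hypothetical choices made for this illustration only.
import math

def scale_ratio(area_a, area_b):
    """Relative apparent scale of the same surface in two views (always >= 1)."""
    r = math.sqrt(area_a / area_b)
    return max(r, 1.0 / r)

def scale_bin(ratio, edges=(1.0, 2.0, 4.0)):
    """Assign a pair to a scale-difference bucket based on its ratio."""
    for i, edge in enumerate(edges[1:], start=1):
        if ratio < edge:
            return i - 1
    return len(edges) - 1

r = scale_ratio(400.0, 100.0)  # the same patch covers 4x the area in view A
print(r, scale_bin(r))  # 2.0 1
```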

## Model Performance

SCoDe demonstrates strong performance on:
- **Rotation Invariance**: robust to image rotations up to 360°
- **Scale Invariance**: effective across large scale differences between views
- **Pose Estimation**: improved camera pose estimation on the MegaDepth benchmark
- **Feature Matching**: higher matching accuracy with a range of feature extractors

## Supported Feature Extractors

The model works seamlessly with:
- SIFT (with brute-force matcher)
- SuperPoint (with NN matcher)
- D2-Net
- R2D2
- DISK
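For reference, the "NN matcher" paired with SuperPoint-style descriptors is usually mutual nearest-neighbor matching: a correspondence is kept only if each descriptor is the other's nearest neighbor. A dependency-free sketch (not the repository's implementation):

```python
# Sketch of mutual nearest-neighbor matching over descriptor lists.
# Pure Python for clarity; real pipelines use vectorized distance matrices.

def mutual_nn_matches(desc_a, desc_b):
    """Match by squared L2 distance, keeping only mutual nearest neighbors."""
    def dist2(u, v):
        return sum((x - y) ** 2 for x, y in zip(u, v))

    nn_ab = [min(range(len(desc_b)), key=lambda j: dist2(a, desc_b[j])) for a in desc_a]
    nn_ba = [min(range(len(desc_a)), key=lambda i: dist2(b, desc_a[i])) for b in desc_b]
    return [(i, j) for i, j in enumerate(nn_ab) if nn_ba[j] == i]

a = [[0.0, 1.0], [1.0, 0.0]]
b = [[1.0, 0.1], [0.1, 1.0]]
print(mutual_nn_matches(a, b))  # [(0, 1), (1, 0)]
```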

## Citation

If you find this project useful in your research, please cite our paper:

```bibtex
@article{pan2025scale,
  title={Scale-aware co-visible region detection for image matching},
  author={Pan, Xu and Xia, Zimin and Zheng, Xianwei},
  journal={ISPRS Journal of Photogrammetry and Remote Sensing},
  volume={229},
  pages={122--137},
  year={2025},
  publisher={Elsevier}
}
```

## License

This project is licensed under the Apache-2.0 License. See the LICENSE file for details.

## Acknowledgments

- [MegaDepth](https://github.com/zhengqili/MegaDepth) - dataset and benchmarks
- [OETR](https://github.com/TencentYoutuResearch/ImageMatching-OETR) - model initialization strategies
- The PyTorch team for the excellent framework

## Contact

For questions or issues, please visit the [GitHub repository](https://github.com/SSSSphinx/SCoDe) or contact the authors.

---

**Paper**: [Scale-aware Co-visible Region Detection for Image Matching](https://www.sciencedirect.com/science/article/abs/pii/S0924271625003260)  
**Project Page**: [https://xupan.top/Projects/scode](https://xupan.top/Projects/scode)