tangqh committed
Commit cf696de · verified · 1 Parent(s): 52759ee

Create README.md

Files changed (1)
  1. README.md +113 -0
README.md ADDED
@@ -0,0 +1,113 @@
+ ---
+ license: apache-2.0
+ tags:
+ - object-detection
+ - region-proposal
+ - open-set-detection
+ - zero-shot-detection
+ - mmdetection
+ - pytorch
+ - cvpr2026
+ datasets:
+ - coco
+ - imagenet
+ - cd-fsod
+ - odinw
+ metrics:
+ - average-recall (AR)
+ ---
+
+ # PF-RPN: Prompt-Free Region Proposal Network
+
+ ## 🧠 Model Details
+
+ **PF-RPN** (Prompt-Free Region Proposal Network) is a region proposal model for cross-domain open-set object detection, accepted at **CVPR 2026**.
+
+ Open-vocabulary detectors typically rely on text prompts (class names), which can be unavailable, noisy, or domain-sensitive during deployment. PF-RPN tackles this by revisiting region proposal generation under a strictly **prompt-free** setting: instead of using specific category names, it unifies all categories into a single token (`object`).
+
+ ### Model Architecture Innovations
+ To improve proposal quality without explicit class prompts, PF-RPN introduces three key designs:
+ 1. **Sparse Image-Aware Adapter:** Constructs pseudo-text representations from multi-level visual features.
+ 2. **Cascade Self-Prompt:** Iteratively enhances visual-text alignment via masked pooling.
+ 3. **Centerness-Guided Query Selection:** Selects the top-k decoder queries using joint confidence scores.
+
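To make the third design concrete, here is a minimal sketch of top-k query selection by a joint score. It is illustrative only: the README does not say how the joint confidence is computed, so the product of classification confidence and centerness used here is an assumption, and `select_top_k_queries` is a hypothetical helper rather than a function from the PF-RPN codebase.

```python
def select_top_k_queries(cls_scores, centerness, k):
    """Return indices of the k queries with the highest joint score.

    Assumption (not stated in the README): the joint score is the product
    of classification confidence and predicted centerness, both in [0, 1].
    """
    if len(cls_scores) != len(centerness):
        raise ValueError("cls_scores and centerness must have the same length")
    joint = [c * t for c, t in zip(cls_scores, centerness)]
    # Rank query indices by joint score, highest first.
    ranked = sorted(range(len(joint)), key=lambda i: joint[i], reverse=True)
    return ranked[:k]


if __name__ == "__main__":
    cls_scores = [0.9, 0.8, 0.3, 0.6]
    centerness = [0.2, 0.9, 0.9, 0.7]
    # Joint scores are 0.18, 0.72, 0.27, 0.42, so queries 1 and 3 are kept.
    print(select_top_k_queries(cls_scores, centerness, k=2))  # [1, 3]
```

Note that query 0 has the highest classification confidence but is dropped because its centerness is low. This is the kind of behaviour a joint score is meant to produce.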
+ ### Model Sources
+ - **Repository:** [PF-RPN GitHub Repository](#) *(Insert your GitHub link here)*
+ - **Paper:** PF-RPN: Prompt-Free Region Proposal Network for Cross-Domain Open-Set Object Detection (CVPR 2026)
+ - **Base Framework:** [MMDetection 3.3.0](https://github.com/open-mmlab/mmdetection)
+ - **Backbone:** Swin-Base (`swinb`)
+
+ ## 🎯 Intended Use
+
+ - **Primary Use Case:** Generating high-quality, class-agnostic region proposals ("objects") across diverse, unseen domains without requiring domain-specific text prompts or retraining.
+ - **Protocol:** Strict one-class open-set setup where `custom_classes = ('object',)`.
+
+ ## 🗂️ Training Data
+
+ The provided checkpoint (`pf_rpn_swinb_5p_coco_imagenet.pth`) was trained on a combination of **COCO 2017** and **ImageNet-1k**.
+ - To simulate the open-set proposal generation task, all ground-truth categories are merged into a single class (`object`).
+ - The released checkpoint uses a **5% subset** of the COCO training data merged with ImageNet images.
+
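The two preprocessing steps above (merging all categories into `object` and sampling a subset of images) can be sketched on a COCO-style annotation dictionary. This is not the repository's `tools/merge_classes_and_sample_subset.py`; the function names, the fixed seed, and the choice to sample by image are assumptions made for illustration.

```python
import random


def merge_to_single_class(coco, name="object"):
    """Return a copy of a COCO-style dict with every annotation mapped to one class."""
    merged = dict(coco)
    merged["categories"] = [{"id": 1, "name": name}]
    # Copy each annotation so the input dict is left unmodified.
    merged["annotations"] = [{**ann, "category_id": 1} for ann in coco["annotations"]]
    return merged


def sample_image_subset(coco, fraction, seed=0):
    """Keep a random fraction of images and only the annotations that refer to them."""
    rng = random.Random(seed)  # fixed seed (an assumption) for reproducible subsets
    images = coco["images"]
    n_keep = max(1, round(len(images) * fraction))
    kept = rng.sample(images, n_keep)
    kept_ids = {img["id"] for img in kept}
    subset = dict(coco)
    subset["images"] = kept
    subset["annotations"] = [a for a in coco["annotations"] if a["image_id"] in kept_ids]
    return subset


if __name__ == "__main__":
    # Toy dataset: 40 images, one box each, alternating between two categories.
    toy = {
        "images": [{"id": i} for i in range(40)],
        "categories": [{"id": 1, "name": "cat"}, {"id": 2, "name": "dog"}],
        "annotations": [
            {"id": i, "image_id": i, "category_id": 1 + i % 2, "bbox": [0, 0, 10, 10]}
            for i in range(40)
        ],
    }
    subset = sample_image_subset(merge_to_single_class(toy), fraction=0.05)
    print(len(subset["images"]), subset["categories"])
```

For real training data, follow the repository's own data preparation commands. This sketch only shows the shape of the transformation.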
+ ## 📊 Evaluation Data and Performance
+
+ PF-RPN achieves the highest Average Recall (AR) among the compared methods under prompt-free evaluation on the benchmarks below.
+
+ ### Cross-Domain Few-Shot Object Detection (CD-FSOD)
+ Evaluated across 6 target domains (ArTaxOr, clipart1k, DIOR, FISH, NEUDET, UODD).
+
+ | Method | Prompt-Free | AR100 | AR300 | AR900 | ARs | ARm | ARl |
+ |---|:---:|---:|---:|---:|---:|---:|---:|
+ | GDINO‡ | ✓ | 54.7 | 57.8 | 61.6 | 34.1 | 49.3 | 67.0 |
+ | GenerateU | ✓ | 47.7 | 54.1 | 55.7 | 28.1 | 48.3 | 69.4 |
+ | Cascade RPN | ✓ | 45.8 | 52.0 | 56.9 | 31.1 | 50.5 | 66.0 |
+ | **PF-RPN (Ours)** | **✓** | **60.7** | **65.3** | **68.2** | **38.5** | **61.9** | **80.3** |
+
+ ### Object Detection in the Wild (ODinW13)
+ Evaluated across 13 diverse target domains.
+
+ | Method | Prompt-Free | AR100 | AR300 | AR900 | ARs | ARm | ARl |
+ |---|:---:|---:|---:|---:|---:|---:|---:|
+ | GDINO‡ | ✓ | 69.1 | 70.9 | 72.4 | 40.8 | 64.6 | 78.4 |
+ | GenerateU | ✓ | 67.3 | 71.5 | 72.2 | 32.8 | 63.1 | 80.0 |
+ | Cascade RPN | ✓ | 60.9 | 65.5 | 70.2 | 40.3 | 65.5 | 75.0 |
+ | **PF-RPN (Ours)** | **✓** | **76.5** | **78.6** | **79.8** | **45.4** | **71.9** | **85.8** |
+
+ *(‡ indicates models whose original class names were replaced with `object` to simulate a prompt-free setting.)*
+
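For readers unfamiliar with the AR columns, the sketch below computes a simplified Average Recall for a single image at a fixed proposal budget (for example 100 for AR100). It assumes the COCO-style convention of averaging recall over IoU thresholds 0.50 to 0.95 in steps of 0.05. The README does not spell out the exact protocol, and this version credits each ground-truth box with its best-overlapping proposal instead of performing one-to-one matching, so treat it as an approximation, not the evaluation code behind the tables.

```python
def box_iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def average_recall(gt_boxes, proposals, max_proposals, iou_thresholds=None):
    """Simplified AR@max_proposals for one image.

    `proposals` must already be sorted by descending score. Each ground-truth
    box is matched to its best-overlapping proposal (no one-to-one matching).
    """
    if iou_thresholds is None:
        # Assumed COCO-style thresholds: 0.50, 0.55, ..., 0.95.
        iou_thresholds = [0.5 + 0.05 * i for i in range(10)]
    if not gt_boxes:
        return 0.0
    top = proposals[:max_proposals]
    best = [max((box_iou(g, p) for p in top), default=0.0) for g in gt_boxes]
    recalls = [sum(b >= t for b in best) / len(gt_boxes) for t in iou_thresholds]
    return sum(recalls) / len(recalls)


if __name__ == "__main__":
    gt = [(0, 0, 10, 10), (20, 20, 30, 30)]
    props = [(0, 0, 10, 10), (100, 100, 110, 110)]  # one exact hit, one miss
    print(average_recall(gt, props, max_proposals=100))  # 0.5
```

Raising the proposal budget can only keep or increase this value, which matches the AR100 ≤ AR300 ≤ AR900 pattern in the tables.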
+ ## ⚙️ How to Use
+
+ ### Installation
+ Ensure you have a working environment with Python 3.10, PyTorch 2.1.0, and CUDA 11.8. Install MMDetection and this repository's codebase as described in the [GitHub README](#).
+
+ ### Quick Start: Evaluation
+
+ 1. **Download the Weights**
+ ```bash
+ mkdir -p checkpoints
+
+ # Download GroundingDINO base weights
+ wget -O checkpoints/groundingdino_swinb_cogcoor_mmdet-55949c9c.pth \
+   https://download.openmmlab.com/mmdetection/v3.0/grounding_dino/groundingdino_swinb_cogcoor_mmdet-55949c9c.pth
+
+ # Download PF-RPN weights
+ wget -O checkpoints/pf_rpn_swinb_5p_coco_imagenet.pth \
+   https://huggingface.co/tangqh/PF-RPN/resolve/main/pf_rpn_swinb_5p_coco_imagenet.pth
+ ```
+
+ 2. **Run Inference / Testing**
+ ```bash
+ python tools/test.py \
+   configs/pf-rpn/pf-rpn_coco-imagenet.py \
+   checkpoints/pf_rpn_swinb_5p_coco_imagenet.pth
+ ```
+ Note: Data preprocessing is required before evaluation. Download the datasets and merge their categories into a single `object` class using the provided `tools/merge_classes_and_sample_subset.py` script. See the repository for detailed data preparation commands.
+
+ ## 📚 Citation
+ If you use PF-RPN in your research, please cite:
+ ```bibtex
+ @inproceedings{tang2026pf,
+   title={PF-RPN: Prompt-Free Region Proposal Network for Cross-Domain Open-Set Object Detection},
+   author={Tang, Qihong and Liu, Changhan and Zhang, Shaofeng and Li, Wenbin and Fan, Qi and Gao, Yang},
+   booktitle={CVPR},
+   year={2026}
+ }
+ ```