Create README.md
README.md
ADDED
@@ -0,0 +1,113 @@
---
license: apache-2.0
tags:
- object-detection
- region-proposal
- open-set-detection
- zero-shot-detection
- mmdetection
- pytorch
- cvpr2026
datasets:
- coco
- imagenet
- cd-fsod
- odinw
metrics:
- average-recall (AR)
---

# PF-RPN: Prompt-Free Region Proposal Network

## 🧠 Model Details

**PF-RPN** (Prompt-Free Region Proposal Network) is a state-of-the-art region proposal model for cross-domain open-set object detection, introduced in a paper accepted at **CVPR 2026**.

Open-vocabulary detectors typically rely on text prompts (class names), which can be unavailable, noisy, or domain-sensitive during deployment. PF-RPN tackles this by revisiting region proposal generation under a strictly **prompt-free** setting: instead of specific category names, all categories are unified into a single token (`object`).

### Model Architecture Innovations
To improve proposal quality without explicit class prompts, PF-RPN introduces three key designs:
1. **Sparse Image-Aware Adapter:** Constructs pseudo-text representations from multi-level visual features.
2. **Cascade Self-Prompt:** Iteratively enhances visual-text alignments via masked pooling.
3. **Centerness-Guided Query Selection:** Selects top-k decoder queries using joint confidence scores.

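As a rough illustration of the third design, the sketch below ranks queries by a joint score and keeps the top k. It is not the PF-RPN implementation: the function name `select_top_k_queries` is hypothetical, and taking the joint score as the product of an objectness confidence and a centerness value is an assumption made here for illustration only.

```python
def select_top_k_queries(objectness, centerness, k):
    """Return (indices, joint_scores) of the k best-scoring queries.

    Assumption (not from the paper): joint score = objectness * centerness.
    """
    if len(objectness) != len(centerness):
        raise ValueError("objectness and centerness must have the same length")
    # Combine the two per-query confidences into one ranking score.
    joint = [o * c for o, c in zip(objectness, centerness)]
    # Sort query indices by joint score, highest first, and keep the first k.
    order = sorted(range(len(joint)), key=lambda i: joint[i], reverse=True)[:k]
    return order, [joint[i] for i in order]


if __name__ == "__main__":
    # Toy values: query 0 is confident but off-centre, query 1 is good on both.
    indices, scores = select_top_k_queries(
        objectness=[0.9, 0.8, 0.3, 0.7],
        centerness=[0.1, 0.9, 0.9, 0.5],
        k=2,
    )
    print(indices, scores)
```

The point of a joint score is that a query with high confidence but a poorly centred box can rank below a query that is moderately good on both counts.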
### Model Sources
- **Repository:** [PF-RPN GitHub Repository](#) *(Insert your GitHub link here)*
- **Paper:** PF-RPN: Prompt-Free Region Proposal Network for Cross-Domain Open-Set Object Detection (CVPR 2026)
- **Base Framework:** [MMDetection 3.3.0](https://github.com/open-mmlab/mmdetection)
- **Backbone:** Swin-Base (`swinb`)

## 🎯 Intended Use

- **Primary Use Case:** Generating high-quality, class-agnostic region proposals ("objects") across diverse, unseen domains without requiring domain-specific text prompts or retraining.
- **Protocol:** Strict one-class open-set setup where `custom_classes = ('object',)`.

## 🗂️ Training Data

The provided checkpoint (`pf_rpn_swinb_5p_coco_imagenet.pth`) was trained on a combined dataset of **COCO 2017** and **ImageNet-1k**.
- To simulate the open-set proposal generation task, all ground-truth categories are merged into a single class (`object`).
- The released checkpoint uses a **5% subset** of the COCO training data merged with ImageNet images.

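The two preparation steps above can be sketched on a COCO-style annotation dictionary as follows. This is a minimal illustration, not the repository's `tools/merge_classes_and_sample_subset.py`: the function names are hypothetical, and sampling the 5% at the image level with a fixed seed is an assumption.

```python
import random


def merge_to_single_class(coco, name="object"):
    """Return a copy of a COCO-style dict with every annotation mapped to one class."""
    merged = dict(coco)
    merged["categories"] = [{"id": 1, "name": name}]
    # Copy each annotation and overwrite its category with the single class id.
    merged["annotations"] = [dict(ann, category_id=1) for ann in coco["annotations"]]
    return merged


def sample_image_subset(coco, fraction, seed=0):
    """Keep a random fraction of images (assumed image-level sampling) and their annotations."""
    rng = random.Random(seed)
    n_keep = max(1, round(len(coco["images"]) * fraction))
    kept_images = rng.sample(coco["images"], n_keep)
    kept_ids = {img["id"] for img in kept_images}
    subset = dict(coco)
    subset["images"] = kept_images
    subset["annotations"] = [a for a in coco["annotations"] if a["image_id"] in kept_ids]
    return subset


if __name__ == "__main__":
    # Toy dataset: 20 images, one box each, alternating between two categories.
    toy = {
        "images": [{"id": i} for i in range(20)],
        "annotations": [
            {"id": i, "image_id": i, "category_id": 1 + i % 2, "bbox": [0, 0, 10, 10]}
            for i in range(20)
        ],
        "categories": [{"id": 1, "name": "cat"}, {"id": 2, "name": "dog"}],
    }
    prepared = sample_image_subset(merge_to_single_class(toy), fraction=0.05)
    print(len(prepared["images"]), prepared["categories"])
```

Merging before sampling keeps the subset class-agnostic by construction; the order of the two steps does not change which images are kept.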
## 📊 Evaluation Data and Performance

PF-RPN achieves state-of-the-art Average Recall (AR) under prompt-free evaluation across multiple benchmarks.

### Cross-Domain Few-Shot Object Detection (CD-FSOD)
Evaluated across 6 target domains (ArTaxOr, clipart1k, DIOR, FISH, NEUDET, UODD).

| Method | Prompt Free | AR100 | AR300 | AR900 | ARs | ARm | ARl |
|---|:---:|---:|---:|---:|---:|---:|---:|
| GDINO‡ | ✓ | 54.7 | 57.8 | 61.6 | 34.1 | 49.3 | 67.0 |
| GenerateU | ✓ | 47.7 | 54.1 | 55.7 | 28.1 | 48.3 | 69.4 |
| Cascade RPN | ✓ | 45.8 | 52.0 | 56.9 | 31.1 | 50.5 | 66.0 |
| **PF-RPN (Ours)** | **✓** | **60.7** | **65.3** | **68.2** | **38.5** | **61.9** | **80.3** |

### Object Detection in the Wild (ODinW13)
Evaluated across 13 diverse target domains.

| Method | Prompt Free | AR100 | AR300 | AR900 | ARs | ARm | ARl |
|---|:---:|---:|---:|---:|---:|---:|---:|
| GDINO‡ | ✓ | 69.1 | 70.9 | 72.4 | 40.8 | 64.6 | 78.4 |
| GenerateU | ✓ | 67.3 | 71.5 | 72.2 | 32.8 | 63.1 | 80.0 |
| Cascade RPN | ✓ | 60.9 | 65.5 | 70.2 | 40.3 | 65.5 | 75.0 |
| **PF-RPN (Ours)** | **✓** | **76.5** | **78.6** | **79.8** | 45.4 | **71.9** | **85.8** |

*(‡ indicates models where original class names were replaced with `object` to simulate a prompt-free setting).*

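For readers unfamiliar with the metric, the sketch below shows one simplified way to compute Average Recall for a single image. It assumes that AR100, AR300 and AR900 denote AR with the top 100, 300 and 900 proposals and that recall is averaged over IoU thresholds 0.50 to 0.95 in steps of 0.05, following the COCO convention. Neither is stated in this card, and the helper names are hypothetical. Unlike a full COCO-style evaluator, it counts a ground-truth box as recalled if any kept proposal overlaps it enough, with no one-to-one matching.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def average_recall(gt_boxes, proposals, k=100):
    """Simplified single-image AR@k.

    gt_boxes:  list of (x1, y1, x2, y2)
    proposals: list of (score, (x1, y1, x2, y2))
    """
    if not gt_boxes:
        return 0.0
    # Keep only the k highest-scoring proposals.
    ranked = sorted(proposals, key=lambda p: p[0], reverse=True)[:k]
    top_boxes = [box for _, box in ranked]
    # Best overlap achieved for each ground-truth box.
    best = [max((iou(g, p) for p in top_boxes), default=0.0) for g in gt_boxes]
    # Assumed COCO-style thresholds: 0.50, 0.55, ..., 0.95.
    thresholds = [0.5 + 0.05 * i for i in range(10)]
    recalls = [sum(b >= t for b in best) / len(best) for t in thresholds]
    return sum(recalls) / len(recalls)


if __name__ == "__main__":
    gts = [(0, 0, 10, 10)]
    props = [(0.9, (0, 0, 10, 10)), (0.2, (50, 50, 60, 60))]
    print(average_recall(gts, props, k=100))
```

Because only the top k proposals are kept, a model that ranks good boxes low is penalised at small k, which is why the tables report AR at several proposal budgets.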
## ⚙️ How to Use

### Installation
Ensure you have a working environment with Python 3.10, PyTorch 2.1.0, and CUDA 11.8. Install MMDetection and this repository's codebase as described in the [GitHub README](#).

### Quick Start: Evaluation

1. **Download the Weights**
```bash
mkdir -p checkpoints

# Download GroundingDINO base weights
wget -O checkpoints/groundingdino_swinb_cogcoor_mmdet-55949c9c.pth \
  https://download.openmmlab.com/mmdetection/v3.0/grounding_dino/groundingdino_swinb_cogcoor_mmdet-55949c9c.pth

# Download PF-RPN weights
wget -O checkpoints/pf_rpn_swinb_5p_coco_imagenet.pth \
  https://huggingface.co/tangqh/PF-RPN/resolve/main/pf_rpn_swinb_5p_coco_imagenet.pth
```

2. **Run Inference / Testing**
```bash
python tools/test.py \
  configs/pf-rpn/pf-rpn_coco-imagenet.py \
  checkpoints/pf_rpn_swinb_5p_coco_imagenet.pth
```

Note: Data preprocessing is required before evaluation. Datasets must be downloaded and their categories merged into a single `object` class using the provided `tools/merge_classes_and_sample_subset.py` script. See the repository for detailed data preparation commands.

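Before launching the test command, it can help to confirm that the files and packages it depends on are in place. The helper below is a hypothetical convenience sketch, not part of the repository. It assumes the commands are run from the repository root and that both downloaded checkpoints are needed, and it checks only that the files exist and that `torch` and `mmdet` are importable.

```python
import importlib.util
from pathlib import Path

# Paths taken from the commands above, relative to the repository root (assumed).
REQUIRED_FILES = [
    "configs/pf-rpn/pf-rpn_coco-imagenet.py",
    "checkpoints/pf_rpn_swinb_5p_coco_imagenet.pth",
    "checkpoints/groundingdino_swinb_cogcoor_mmdet-55949c9c.pth",
]


def preflight(root=".", files=REQUIRED_FILES, modules=("torch", "mmdet")):
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = []
    for rel in files:
        if not (Path(root) / rel).is_file():
            problems.append(f"missing file: {rel}")
    for name in modules:
        # find_spec returns None when a top-level package is not installed.
        if importlib.util.find_spec(name) is None:
            problems.append(f"missing module: {name}")
    return problems


if __name__ == "__main__":
    issues = preflight()
    if issues:
        print("\n".join(issues))
    else:
        print("Environment looks ready for tools/test.py")
```

The check does not verify package versions or dataset preparation; those still need to follow the repository instructions.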
## 📚 Citation
If you use PF-RPN in your research, please cite:
```bibtex
@inproceedings{tang2026pf,
  title={PF-RPN: Prompt-Free Region Proposal Network for Cross-Domain Open-Set Object Detection},
  author={Tang, Qihong and Liu, Changhan and Zhang, Shaofeng and Li, Wenbin and Fan, Qi and Gao, Yang},
  booktitle={CVPR},
  year={2026}
}
```