Update model card: add pipeline tag and fix paper links

#1 opened by nielsr (HF Staff)
Files changed (1)
  1. README.md +13 -122
README.md CHANGED
@@ -1,10 +1,16 @@
 ---
 license: apache-2.0
+pipeline_tag: image-segmentation
+tags:
+- referring-expression-segmentation
+- sam
+- gres
 ---
+
 # SSP-SAM: SAM with Semantic-Spatial Prompt for Referring Expression Segmentation
 
 <div align="center">
-<a href="https://arxiv.org/abs/xxxx.xxxxx"><img src="https://img.shields.io/badge/arXiv-Coming_Soon-b31b1b?style=flat-square" alt="arXiv"></a>
+<a href="https://arxiv.org/abs/2603.18086"><img src="https://img.shields.io/badge/arXiv-2603.18086-b31b1b?style=flat-square" alt="arXiv"></a>
 <a href="https://huggingface.co/wayneicloud/SSP-SAM"><img src="https://img.shields.io/badge/HuggingFace-Checkpoint-yellow?style=flat-square" alt="HF Checkpoint"></a>
 <a href="https://huggingface.co/wayneicloud/SSP-SAM"><img src="https://img.shields.io/badge/HuggingFace-Dataset-orange?style=flat-square" alt="HF Dataset"></a>
 <img src="https://img.shields.io/badge/License-Apache--2.0-green?style=flat-square" alt="License">
@@ -29,7 +35,7 @@ license: apache-2.0
 
 ## Overview
 
-This repository provides the codebase of **SSP-SAM**, a referring expression segmentation framework built on top of SAM with semantic-spatial prompts.
+This repository provides the codebase of **SSP-SAM**, a referring expression segmentation framework built on top of SAM with semantic-spatial prompts. The model is presented in the paper [SSP-SAM: SAM with Semantic-Spatial Prompt for Referring Expression Segmentation](https://arxiv.org/abs/2603.18086).
 
 Current repo status:
 - Training/testing/data processing scripts are available.
@@ -48,7 +54,8 @@ Current repo status:
 
 ## 🔗 Model Zoo & Links
 
-- Paper: `https://arxiv.org/abs/xxxx.xxxxx`
+- Paper: [SSP-SAM (arXiv:2603.18086)](https://arxiv.org/abs/2603.18086)
+- Code: [GitHub - WayneTomas/SSP-SAM](https://github.com/WayneTomas/SSP-SAM)
 - <img src="https://huggingface.co/front/assets/huggingface_logo-noborder.svg" alt="HF" width="16"/> Hugging Face Checkpoints/datasets: `https://huggingface.co/wayneicloud/SSP-SAM`
 
 ## 📁 Project Structure
@@ -108,104 +115,7 @@ You have two options:
 ```
 
 2. **Regenerate annotations/masks by yourself**
-See the collapsible section below.
-
-<details>
-<summary>Generate Annotations/Masks by Yourself (click to expand)</summary>
-
-References:
-- `data_seg/README.md`
-- `data_seg/run.sh`
-- `legacy_data_prep_simrec.md` (legacy reference for raw data preparation and sources)
-
-Required raw annotation folders/files for generation include (examples):
-- `data_seg/refcoco/`
-- `data_seg/refcoco+/`
-- `data_seg/refcocog/`
-- `data_seg/refclef/`
-
-Each folder should contain raw files such as `instances.json` and `refs(...).p`.
-
-Minimal expected layout (example):
-
-```text
-data_seg/
-├── refcoco/
-│   ├── instances.json
-│   ├── refs(unc).p
-│   └── refs(google).p
-├── refcoco+/
-│   ├── instances.json
-│   └── refs(unc).p
-├── refcocog/
-│   ├── instances.json
-│   ├── refs(google).p
-│   └── refs(umd).p
-└── refclef/
-    ├── instances.json
    ├── refs(unc).p
-    └── refs(berkeley).p
-```
-
-Example preprocessing command:
-
-```bash
-python ./data_seg/data_process.py \
-  --data_root ./data_seg \
-  --output_dir ./data_seg \
-  --dataset refcoco \
-  --split unc \
-  --generate_mask
-```
-
-</details>
-
-Detailed dataset path/config settings are defined in the corresponding preprocessing scripts/config files in `data_seg/`.
-Please modify them according to your local environment before running.
-Also check dataset/image path settings in:
-- `datasets/dataset.py`
-
-> Important: in `datasets/dataset.py`, class `VGDataset`, you should update local paths for images/annotations/masks according to your machine.
-
-Example local data organization:
-
-```text
-your_project_root/
-├── data/                  # set --data_root to this folder
-│   ├── coco/
-│   │   └── train2014/     # COCO images (unc/unc+/gref/gref_umd/grefcoco)
-│   ├── referit/
-│   │   └── images/        # ReferIt images
-│   ├── VG/                # Visual Genome images (merge pretrain path)
-│   └── vg/                # Visual Genome images (phrase_cut path, if used)
-└── data_seg/              # same level as data/
-    ├── anns/
-    │   ├── refcoco.json
-    │   ├── refcoco+.json
-    │   ├── refcocog_umd.json
-    │   ├── refclef.json
-    │   └── grefcoco.json
-    └── masks/
-        ├── refcoco/
-        ├── refcoco+/
-        ├── refcocog_umd/
-        ├── refclef/
-        └── grefcoco/
-```
-
-For training/testing, use:
-- `data_seg/anns/*.json` (provided)
-- `data_seg/masks/*` (generated locally via `bash data_seg/run.sh`)
-
-### Required Images and Raw Data Sources
-
-For training/evaluation, you need the corresponding image files locally (COCO/Flickr/ReferIt/VG depending on dataset split and config).
-Common sources:
-- RefCOCO / RefCOCO+ / RefCOCOg / RefClef annotations: http://bvisionweb1.cs.unc.edu/licheng/referit/data/
-- MS COCO 2014 images: https://cocodataset.org/
-- Flickr30k images: http://shannon.cs.illinois.edu/DenotationGraph/
-- ReferItGame images: due to original dataset restrictions, please download by yourself from the official/authorized source.
-- Visual Genome images: https://visualgenome.org/
+See the collapsible section below in the [GitHub repository](https://github.com/WayneTomas/SSP-SAM).
 
 ## 🚀 Training
 
@@ -215,13 +125,6 @@ Default training launcher:
 bash submit_train.sh
 ```
 
-`submit_train.sh` already includes commented examples for multiple datasets, e.g.:
-- `refcoco`
-- `refcoco+`
-- `refcocog_umd`
-- `referit`
-- `grefcoco`
-
 You can also run directly:
 
 ```bash
@@ -233,7 +136,7 @@ torchrun --nproc_per_node=8 train.py \
 ### Resume Modes
 
 `train.py` supports two resume modes:
--  `--resume <ckpt>`: use this for interrupted training and continue from the previous checkpoint (resume from checkpoint).
+- `--resume <ckpt>`: use this for interrupted training and continue from the previous checkpoint.
 - `--resume_from_pretrain <ckpt>`: use this for loading pretrained weights before fine-tuning/training.
 
 ## 📊 Evaluation
@@ -254,26 +157,14 @@ torchrun --nproc_per_node=1 --master_port=29590 test.py \
   --checkpoint output/your_save_folder/checkpoint_best_miou.pth
 ```
 
-## 📝 Notes
-
-- COCO image path in visualization prioritizes `data/coco/train2014`.
-- Current mask prediction/evaluation path uses `512x512` mask space.
-- Config files in `configs/` are set with:
-  - `output_dir='outputs/your_save_folder'`
-  - `batch_size=8`
-  - `freeze_epochs=20`
-
 ## 🌈 Acknowledgements
 
 This repository benefits from ideas and/or codebases of the following projects:
-
 - SimREC: https://github.com/luogen1996/SimREC
 - gRefCOCO: https://github.com/henghuiding/gRefCOCO
 - TransVG: https://github.com/djiajunustc/TransVG
 - Segment Anything (SAM): https://github.com/facebookresearch/segment-anything
 
-Thanks to the authors for their valuable open-source contributions.
-
 ## 📚 Citation
 
 If you find this repository useful, please cite our SSP-SAM paper.
@@ -285,4 +176,4 @@ If you find this repository useful, please cite our SSP-SAM paper.
   journal={IEEE Transactions on Circuits and Systems for Video Technology},
   year={2025}
 }
-```
+```