wayneicloud/SSP-SAM


Update model card: add pipeline tag and fix paper links

by nielsr HF Staff - opened Mar 20

base: refs/heads/main ← from: refs/pr/1

README.md +13 -122

@@ -1,10 +1,16 @@
 ---
 license: apache-2.0
 ---
 # SSP-SAM: SAM with Semantic-Spatial Prompt for Referring Expression Segmentation
 <div align="center">
-  <a href="https://arxiv.org/abs/xxxx.xxxxx"><img src="https://img.shields.io/badge/arXiv-Coming_Soon-b31b1b?style=flat-square" alt="arXiv"></a>
   <a href="https://huggingface.co/wayneicloud/SSP-SAM"><img src="https://img.shields.io/badge/HuggingFace-Checkpoint-yellow?style=flat-square" alt="HF Checkpoint"></a>
   <a href="https://huggingface.co/wayneicloud/SSP-SAM"><img src="https://img.shields.io/badge/HuggingFace-Dataset-orange?style=flat-square" alt="HF Dataset"></a>
   <img src="https://img.shields.io/badge/License-Apache--2.0-green?style=flat-square" alt="License">
@@ -29,7 +35,7 @@ license: apache-2.0
 ## Overview
-This repository provides the codebase of **SSP-SAM**, a referring expression segmentation framework built on top of SAM with semantic-spatial prompts.
 Current repo status:
 - Training/testing/data processing scripts are available.
@@ -48,7 +54,8 @@ Current repo status:
 ## 🔗 Model Zoo & Links
-- Paper: `https://arxiv.org/abs/xxxx.xxxxx`
 - <img src="https://huggingface.co/front/assets/huggingface_logo-noborder.svg" alt="HF" width="16"/> Hugging Face Checkpoints/datasets: `https://huggingface.co/wayneicloud/SSP-SAM`
 ## 📁 Project Structure
@@ -108,104 +115,7 @@ You have two options:
      ```
 2. **Regenerate annotations/masks by yourself**
-   See the collapsible section below.
-<details>
-<summary>Generate Annotations/Masks by Yourself (click to expand)</summary>
-References:
-- `data_seg/README.md`
-- `data_seg/run.sh`
-- `legacy_data_prep_simrec.md` (legacy reference for raw data preparation and sources)
-Required raw annotation folders/files for generation include (examples):
-- `data_seg/refcoco/`
-- `data_seg/refcoco+/`
-- `data_seg/refcocog/`
-- `data_seg/refclef/`
-Each folder should contain raw files such as `instances.json` and `refs(...).p`.
-Minimal expected layout (example):
-```text
-data_seg/
-├── refcoco/
-│   ├── instances.json
-│   ├── refs(unc).p
-│   └── refs(google).p
-├── refcoco+/
-│   ├── instances.json
-│   └── refs(unc).p
-├── refcocog/
-│   ├── instances.json
-│   ├── refs(google).p
-│   └── refs(umd).p
-└── refclef/
-    ├── instances.json
-    ├── refs(unc).p
-    └── refs(berkeley).p
-```
-Example preprocessing command:
-```bash
-python ./data_seg/data_process.py \
-  --data_root ./data_seg \
-  --output_dir ./data_seg \
-  --dataset refcoco \
-  --split unc \
-  --generate_mask
-```
-</details>
-Detailed dataset path/config settings are defined in the corresponding preprocessing scripts/config files in `data_seg/`.
-Please modify them according to your local environment before running.
-Also check dataset/image path settings in:
-- `datasets/dataset.py`
-> Important: in `datasets/dataset.py`, class `VGDataset`, you should update local paths for images/annotations/masks according to your machine.
-Example local data organization:
-```text
-your_project_root/
-├── data/                                        # set --data_root to this folder
-│   ├── coco/
-│   │   └── train2014/                           # COCO images (unc/unc+/gref/gref_umd/grefcoco)
-│   ├── referit/
-│   │   └── images/                              # ReferIt images
-│   ├── VG/                                      # Visual Genome images (merge pretrain path)
-│   └── vg/                                      # Visual Genome images (phrase_cut path, if used)
-└── data_seg/                                    # same level as data/
-    ├── anns/
-    │   ├── refcoco.json
-    │   ├── refcoco+.json
-    │   ├── refcocog_umd.json
-    │   ├── refclef.json
-    │   └── grefcoco.json
-    └── masks/
-        ├── refcoco/
-        ├── refcoco+/
-        ├── refcocog_umd/
-        ├── refclef/
-        └── grefcoco/
-```
-For training/testing, use:
-- `data_seg/anns/*.json` (provided)
-- `data_seg/masks/*` (generated locally via `bash data_seg/run.sh`)
-### Required Images and Raw Data Sources
-For training/evaluation, you need the corresponding image files locally (COCO/Flickr/ReferIt/VG depending on dataset split and config).
-Common sources:
-- RefCOCO / RefCOCO+ / RefCOCOg / RefClef annotations: http://bvisionweb1.cs.unc.edu/licheng/referit/data/
-- MS COCO 2014 images: https://cocodataset.org/
-- Flickr30k images: http://shannon.cs.illinois.edu/DenotationGraph/
-- ReferItGame images: due to original dataset restrictions, please download by yourself from the official/authorized source.
-- Visual Genome images: https://visualgenome.org/
 ## 🚀 Training
@@ -215,13 +125,6 @@ Default training launcher:
 bash submit_train.sh
 ```
-`submit_train.sh` already includes commented examples for multiple datasets, e.g.:
-- `refcoco`
-- `refcoco+`
-- `refcocog_umd`
-- `referit`
-- `grefcoco`
 You can also run directly:
 ```bash
@@ -233,7 +136,7 @@ torchrun --nproc_per_node=8 train.py \
 ### Resume Modes
 `train.py` supports two resume modes:
-- `--resume <ckpt>`: use this for interrupted training and continue from the previous checkpoint (断点续训).
 - `--resume_from_pretrain <ckpt>`: use this for loading pretrained weights before fine-tuning/training.
 ## 📊 Evaluation
@@ -254,26 +157,14 @@ torchrun --nproc_per_node=1 --master_port=29590 test.py \
   --checkpoint output/your_save_folder/checkpoint_best_miou.pth
 ```
-## 📝 Notes
-- COCO image path in visualization prioritizes `data/coco/train2014`.
-- Current mask prediction/evaluation path uses `512x512` mask space.
-- Config files in `configs/` are set with:
-  - `output_dir='outputs/your_save_folder'`
-  - `batch_size=8`
-  - `freeze_epochs=20`
 ## 🌈 Acknowledgements
 This repository benefits from ideas and/or codebases of the following projects:
 - SimREC: https://github.com/luogen1996/SimREC
 - gRefCOCO: https://github.com/henghuiding/gRefCOCO
 - TransVG: https://github.com/djiajunustc/TransVG
 - Segment Anything (SAM): https://github.com/facebookresearch/segment-anything
-Thanks to the authors for their valuable open-source contributions.
 ## 📚 Citation
 If you find this repository useful, please cite our SSP-SAM paper.
@@ -285,4 +176,4 @@ If you find this repository useful, please cite our SSP-SAM paper.
   journal={IEEE Transactions on Circuits and Systems for Video Technology},
   year={2025}
 }
-```

 ---
 license: apache-2.0
+pipeline_tag: image-segmentation
+tags:
+- referring-expression-segmentation
+- sam
+- gres
 ---
 # SSP-SAM: SAM with Semantic-Spatial Prompt for Referring Expression Segmentation
 <div align="center">
+  <a href="https://arxiv.org/abs/2603.18086"><img src="https://img.shields.io/badge/arXiv-2603.18086-b31b1b?style=flat-square" alt="arXiv"></a>
   <a href="https://huggingface.co/wayneicloud/SSP-SAM"><img src="https://img.shields.io/badge/HuggingFace-Checkpoint-yellow?style=flat-square" alt="HF Checkpoint"></a>
   <a href="https://huggingface.co/wayneicloud/SSP-SAM"><img src="https://img.shields.io/badge/HuggingFace-Dataset-orange?style=flat-square" alt="HF Dataset"></a>
   <img src="https://img.shields.io/badge/License-Apache--2.0-green?style=flat-square" alt="License">
 ## Overview
+This repository provides the codebase of **SSP-SAM**, a referring expression segmentation framework built on top of SAM with semantic-spatial prompts. The model is presented in the paper [SSP-SAM: SAM with Semantic-Spatial Prompt for Referring Expression Segmentation](https://arxiv.org/abs/2603.18086).
 Current repo status:
 - Training/testing/data processing scripts are available.
 ## 🔗 Model Zoo & Links
+- Paper: [SSP-SAM (arXiv:2603.18086)](https://arxiv.org/abs/2603.18086)
+- Code: [GitHub - WayneTomas/SSP-SAM](https://github.com/WayneTomas/SSP-SAM)
 - <img src="https://huggingface.co/front/assets/huggingface_logo-noborder.svg" alt="HF" width="16"/> Hugging Face Checkpoints/datasets: `https://huggingface.co/wayneicloud/SSP-SAM`
 ## 📁 Project Structure
      ```
 2. **Regenerate annotations/masks by yourself**
+   See the collapsible section below in the [GitHub repository](https://github.com/WayneTomas/SSP-SAM).
 ## 🚀 Training
 bash submit_train.sh
 ```
 You can also run directly:
 ```bash
 ### Resume Modes
 `train.py` supports two resume modes:
+- `--resume <ckpt>`: use this for interrupted training and continue from the previous checkpoint.
 - `--resume_from_pretrain <ckpt>`: use this for loading pretrained weights before fine-tuning/training.
 ## 📊 Evaluation
   --checkpoint output/your_save_folder/checkpoint_best_miou.pth
 ```
 ## 🌈 Acknowledgements
 This repository benefits from ideas and/or codebases of the following projects:
 - SimREC: https://github.com/luogen1996/SimREC
 - gRefCOCO: https://github.com/henghuiding/gRefCOCO
 - TransVG: https://github.com/djiajunustc/TransVG
 - Segment Anything (SAM): https://github.com/facebookresearch/segment-anything
 ## 📚 Citation
 If you find this repository useful, please cite our SSP-SAM paper.
   journal={IEEE Transactions on Circuits and Systems for Video Technology},
   year={2025}
 }
+```
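
The main functional change in this PR is the YAML front matter at the top of README.md: it adds `pipeline_tag: image-segmentation` and three `tags` entries. As a rough illustration of how those fields can be read back and checked, here is a minimal sketch in plain Python. The helper name `parse_front_matter` and the constant `README_HEAD` are hypothetical and are not part of the SSP-SAM repository or of any Hugging Face library. The parser only handles flat `key: value` pairs and simple `- item` lists, which is enough for the block in this diff; a real YAML parser would be the safer choice for anything more complex.

```python
# Front matter as it appears in README.md after this PR (copied from the diff).
README_HEAD = """---
license: apache-2.0
pipeline_tag: image-segmentation
tags:
- referring-expression-segmentation
- sam
- gres
---
# SSP-SAM: SAM with Semantic-Spatial Prompt for Referring Expression Segmentation
"""


def parse_front_matter(text):
    """Hypothetical helper: parse a flat YAML front matter block into a dict.

    Supports only `key: value` pairs and `- item` lists under a bare `key:`.
    Returns an empty dict if the text has no complete front matter block.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    meta = {}
    current_list = None  # list that subsequent "- item" lines are appended to
    for line in lines[1:]:
        stripped = line.strip()
        if stripped == "---":
            return meta  # closing delimiter reached
        if stripped.startswith("- "):
            if current_list is not None:
                current_list.append(stripped[2:].strip())
            continue
        key, sep, value = stripped.partition(":")
        if not sep:
            continue  # blank or unrecognized line
        value = value.strip()
        if value:
            meta[key.strip()] = value
            current_list = None
        else:
            # A bare "key:" starts a list of "- item" entries.
            current_list = meta.setdefault(key.strip(), [])
    return {}  # no closing delimiter found


meta = parse_front_matter(README_HEAD)
print(meta["pipeline_tag"])
print(meta["tags"])
```

A check like this could be run before merging a model card change to confirm that the metadata block is still well formed and that the expected keys are present.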