license: apache-2.0
pipeline_tag: image-segmentation
tags:
  - referring-expression-segmentation
  - sam
  - gres

SSP-SAM: SAM with Semantic-Spatial Prompt for Referring Expression Segmentation

Wei Tang¹, Xuejing Liu²✉, Yanpeng Sun³, Zechao Li¹✉

¹Nanjing University of Science and Technology; ²Institute of Computing Technology, Chinese Academy of Sciences; ³NExT++ Lab, National University of Singapore
✉ Corresponding authors

Overview

This repository provides the codebase of SSP-SAM, a referring expression segmentation framework built on top of SAM with semantic-spatial prompts. The model is presented in the paper SSP-SAM: SAM with Semantic-Spatial Prompt for Referring Expression Segmentation.

Current repo status:

- Training/testing/data processing scripts are available.
- Multiple dataset configs are provided under configs/.

💥 News

- 17 Mar, 2026: Open-source codebase has been organized and released.
- 4 Dec, 2025: SSP-SAM paper accepted by IEEE TCSVT.

📌 ToDo

- Release final model checkpoints on Hugging Face
- Release processed training/evaluation metadata
- Release arXiv version

🔗 Model Zoo & Links

- Paper: SSP-SAM (arXiv:2603.18086)
- Code: GitHub - WayneTomas/SSP-SAM
- Hugging Face checkpoints/datasets: https://huggingface.co/wayneicloud/SSP-SAM

📁 Project Structure

.
├── configs/                 # training/evaluation configs
├── data_seg/                # data preprocessing scripts and generated anns/masks
├── datasets/                # dataloader and transforms
├── models/                  # SSP_SAM model definitions
├── segment-anything/        # modified SAM dependency (editable install)
├── train.py                 # training entry
├── test.py                  # evaluation entry
├── submit_train.sh          # train launcher (with examples)
└── submit_test.sh           # test launcher (with examples)

⚙️ Environment Setup

Recommended: conda environment on macOS/Linux.

conda create -n ssp_sam python=3.10 -y
conda activate ssp_sam
pip install --upgrade pip

# 1) install PyTorch (CUDA example: cu121)
pip install torch==2.1.0+cu121 torchvision==0.16.0+cu121 torchaudio==2.1.0+cu121 --index-url https://download.pytorch.org/whl/cu121

# 2) install modified segment-anything first
cd segment-anything
pip install -e .
cd ..

# 3) install remaining dependencies
pip install -r requirements.txt

Note: the segment-anything code in this repository is a modified version of the original SAM implementation.
Install the local segment-anything in editable mode (pip install -e .) as shown above.
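
After the three install steps, a quick check that the packages are visible to the interpreter can save a failed training launch. The snippet below is a minimal sketch, not part of the repository. It assumes the modified segment-anything keeps the upstream import name `segment_anything`; the helper name `check_environment` is hypothetical.

```python
import importlib.util


def check_environment(modules=("torch", "torchvision", "torchaudio", "segment_anything")):
    """Return {module_name: bool} telling whether each top-level module can be found.

    Assumption: the local, modified SAM package is importable as `segment_anything`,
    the same name the upstream package uses.
    """
    # find_spec locates a module without importing it, so this stays cheap.
    return {name: importlib.util.find_spec(name) is not None for name in modules}


if __name__ == "__main__":
    for name, found in check_environment().items():
        print(f"{name:20s} {'ok' if found else 'MISSING'}")
```

Run it inside the activated ssp_sam environment. Any module reported as MISSING points back to the corresponding install step.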

🧩 Data Preparation

Please check:

data_seg/README.md
data_seg/run.sh

You have two options:

1. Use our provided annotations and generate masks locally (recommended)
- Download data_seg/anns/*.json and the other prepared data_seg files from Hugging Face:
  https://huggingface.co/wayneicloud/SSP-SAM
- The provided data_seg/anns/*.json files can be used directly.
- Generate the masks locally by running:
```
bash data_seg/run.sh
```
2. Regenerate annotations/masks yourself
- See the corresponding collapsible section in the GitHub repository README.
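
Before launching training, it can help to confirm that the downloaded annotation files are in place and parse as JSON. The following is a minimal sketch under stated assumptions: it only relies on the data_seg/anns/*.json location named above and makes no assumption about the internal schema of those files. The helper name `summarize_annotations` is hypothetical.

```python
import json
from pathlib import Path


def summarize_annotations(anns_dir="data_seg/anns"):
    """Map each *.json file name in anns_dir to its number of top-level entries."""
    summary = {}
    for path in sorted(Path(anns_dir).glob("*.json")):
        with path.open("r", encoding="utf-8") as f:
            data = json.load(f)  # raises json.JSONDecodeError on a corrupt download
        # len() covers lists and dicts; any other JSON value counts as one entry.
        summary[path.name] = len(data) if isinstance(data, (list, dict)) else 1
    return summary


if __name__ == "__main__":
    for name, count in summarize_annotations().items():
        print(f"{name}: {count} top-level entries")
```

An empty result means no annotation files were found under the given directory.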

🚀 Training

Default training launcher:

bash submit_train.sh

You can also run directly:

torchrun --nproc_per_node=8 train.py \
  --config configs/SSP_SAM_CLIP_B_FT_unc.py \
  --clip_pretrained pretrained_checkpoints/CS/CS-ViT-B-16.pt

Resume Modes

train.py supports two resume modes:

- --resume <ckpt>: use this for interrupted training and continue from the previous checkpoint.
- --resume_from_pretrain <ckpt>: use this for loading pretrained weights before fine-tuning/training.
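
The two resume flags can be combined with the direct training command from a small launcher script. The sketch below only assembles the argument list from flags that appear in this README; the helper name `build_train_command` and the checkpoint path are hypothetical. Treating the two modes as mutually exclusive is an assumption of this sketch, not confirmed behavior of train.py.

```python
import shlex


def build_train_command(config, clip_pretrained, nproc_per_node=8,
                        resume=None, resume_from_pretrain=None):
    """Return the torchrun argument list for train.py with an optional resume flag."""
    if resume and resume_from_pretrain:
        # Assumption: only one resume mode is meaningful per run.
        raise ValueError("choose either resume or resume_from_pretrain, not both")
    cmd = ["torchrun", f"--nproc_per_node={nproc_per_node}", "train.py",
           "--config", config, "--clip_pretrained", clip_pretrained]
    if resume:
        cmd += ["--resume", resume]  # continue an interrupted run
    if resume_from_pretrain:
        cmd += ["--resume_from_pretrain", resume_from_pretrain]  # start from pretrained weights
    return cmd


cmd = build_train_command(
    "configs/SSP_SAM_CLIP_B_FT_unc.py",
    "pretrained_checkpoints/CS/CS-ViT-B-16.pt",
    resume="output/your_save_folder/checkpoint.pth",  # hypothetical checkpoint path
)
print(shlex.join(cmd))
```

The printed line can be pasted into a shell, or the list can be passed to `subprocess.run(cmd, check=True)` from the repository root.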

📊 Evaluation

Default testing launcher:

bash submit_test.sh

Example direct command:

torchrun --nproc_per_node=1 --master_port=29590 test.py \
  --config configs/SSP_SAM_CLIP_L_FT_unc.py \
  --test_split testB \
  --clip_pretrained pretrained_checkpoints/CS/CS-ViT-L-14-336px.pt \
  --checkpoint output/your_save_folder/checkpoint_best_miou.pth
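
To evaluate one checkpoint on several splits, the direct command above can be generated in a loop. This is a minimal sketch: only testB is shown in this README, so the split names val and testA are assumptions based on the usual RefCOCO (UNC) splits and may not match what test.py accepts. The helper name `build_test_commands` is hypothetical.

```python
import shlex


def build_test_commands(config, clip_pretrained, checkpoint,
                        splits=("val", "testA", "testB"), master_port=29590):
    """Return one torchrun argument list per evaluation split."""
    commands = []
    for split in splits:
        commands.append([
            "torchrun", "--nproc_per_node=1", f"--master_port={master_port}", "test.py",
            "--config", config,
            "--test_split", split,  # "val" and "testA" are assumed names
            "--clip_pretrained", clip_pretrained,
            "--checkpoint", checkpoint,
        ])
    return commands


commands = build_test_commands(
    "configs/SSP_SAM_CLIP_L_FT_unc.py",
    "pretrained_checkpoints/CS/CS-ViT-L-14-336px.pt",
    "output/your_save_folder/checkpoint_best_miou.pth",
)
for command in commands:
    print(shlex.join(command))
```

Each printed line is a complete command. They are meant to be run one after another, since all of them reuse the same master port.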

🌈 Acknowledgements

This repository benefits from ideas and/or codebases of the following projects:

- SimREC: https://github.com/luogen1996/SimREC
- gRefCOCO: https://github.com/henghuiding/gRefCOCO
- TransVG: https://github.com/djiajunustc/TransVG
- Segment Anything (SAM): https://github.com/facebookresearch/segment-anything

📚 Citation

If you find this repository useful, please cite our SSP-SAM paper.

@article{ssp_sam_tcsvt,
  title={SSP-SAM: SAM with Semantic-Spatial Prompt for Referring Expression Segmentation},
  author={Tang, Wei and Liu, Xuejing and Sun, Yanpeng and Li, Zechao},
  journal={IEEE Transactions on Circuits and Systems for Video Technology},
  year={2025}
}