---
license: apache-2.0
pipeline_tag: image-segmentation
tags:
- referring-expression-segmentation
- sam
- gres
---

# SSP-SAM: SAM with Semantic-Spatial Prompt for Referring Expression Segmentation

Wei Tang<sup>1</sup>, Xuejing Liu<sup>✉,2</sup>, Yanpeng Sun<sup>3</sup>, Zechao Li<sup>✉,1</sup>

<sup>1</sup>Nanjing University of Science and Technology; <sup>2</sup>Institute of Computing Technology, Chinese Academy of Sciences; <sup>3</sup>NExT++ Lab, National University of Singapore

<sup>✉</sup> Corresponding authors

---

## Overview

This repository provides the codebase of **SSP-SAM**, a referring expression segmentation framework built on top of SAM with semantic-spatial prompts. The model is presented in the paper [SSP-SAM: SAM with Semantic-Spatial Prompt for Referring Expression Segmentation](https://arxiv.org/abs/2603.18086).

Current repo status:

- Training, testing, and data processing scripts are available.
- Multiple dataset configs are provided under `configs/`.

## 💥 News

- **17 Mar, 2026**: Open-source codebase has been organized and released.
- **4 Dec, 2025**: SSP-SAM paper accepted by IEEE TCSVT.

## 📌 ToDo

- [X] Release final model checkpoints on Hugging Face
- [X] Release processed training/evaluation metadata
- [X] Release arXiv version

## 🔗 Model Zoo & Links

- Paper: [SSP-SAM (arXiv:2603.18086)](https://arxiv.org/abs/2603.18086)
- Code: [GitHub - WayneTomas/SSP-SAM](https://github.com/WayneTomas/SSP-SAM)
- Hugging Face checkpoints/datasets: `https://huggingface.co/wayneicloud/SSP-SAM`

## 📁 Project Structure

```text
.
├── configs/             # training/evaluation configs
├── data_seg/            # data preprocessing scripts and generated anns/masks
├── datasets/            # dataloader and transforms
├── models/              # SSP_SAM model definitions
├── segment-anything/    # modified SAM dependency (editable install)
├── train.py             # training entry
├── test.py              # evaluation entry
├── submit_train.sh      # train launcher (with examples)
└── submit_test.sh       # test launcher (with examples)
```

## ⚙️ Environment Setup

Recommended: conda environment on macOS/Linux.

```bash
conda create -n ssp_sam python=3.10 -y
conda activate ssp_sam
pip install --upgrade pip

# 1) install PyTorch (CUDA example: cu121)
pip install torch==2.1.0+cu121 torchvision==0.16.0+cu121 torchaudio==2.1.0+cu121 --index-url https://download.pytorch.org/whl/cu121

# 2) install modified segment-anything first
cd segment-anything
pip install -e .
cd ..

# 3) install remaining dependencies
pip install -r requirements.txt
```

> Note: the `segment-anything` code in this repository has been modified from the original SAM implementation.
> Please install the local `segment-anything` in editable mode (`pip install -e .`) as shown above.

## 🧩 Data Preparation

Please check:

- `data_seg/README.md`
- `data_seg/run.sh`

You have two options:

1. **Use our provided annotations + generate masks locally (recommended)**
   - Download `data_seg/anns/*.json` and other prepared `data_seg` files from Hugging Face: `https://huggingface.co/wayneicloud/SSP-SAM`
   - You can directly use our `data_seg/anns/*.json`.
   - `masks` should be generated on your side by running:

   ```bash
   bash data_seg/run.sh
   ```

2. **Regenerate annotations/masks by yourself**

   See the collapsible section in the README of the [GitHub repository](https://github.com/WayneTomas/SSP-SAM).

## 🚀 Training

Default training launcher:

```bash
bash submit_train.sh
```

You can also run directly:

```bash
torchrun --nproc_per_node=8 train.py \
  --config configs/SSP_SAM_CLIP_B_FT_unc.py \
  --clip_pretrained pretrained_checkpoints/CS/CS-ViT-B-16.pt
```

### Resume Modes

`train.py` supports two resume modes:

- `--resume`: use this to continue interrupted training from the previous checkpoint.
- `--resume_from_pretrain`: use this to load pretrained weights before fine-tuning/training.

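The difference between the two resume modes can be illustrated with a small, framework-free sketch. It is not taken from `train.py`: the function name `restore_state`, the mode strings, and the checkpoint keys `model`, `optimizer`, and `epoch` are all hypothetical and only show the usual convention, in which a full resume restores weights, optimizer state, and the epoch counter, while loading pretrained weights carries over the weights only.

```python
from typing import Any, Dict, Optional, Tuple

Checkpoint = Dict[str, Any]


def restore_state(
    checkpoint: Checkpoint, mode: str
) -> Tuple[Dict[str, Any], Optional[Dict[str, Any]], int]:
    """Return (model_weights, optimizer_state, start_epoch) for a given mode.

    mode="resume":   continue an interrupted run, so everything is restored.
    mode="pretrain": start a new run from pretrained weights only.
    The key names "model", "optimizer", "epoch" are assumptions.
    """
    weights = checkpoint["model"]
    if mode == "resume":
        # Continue where training stopped: the epoch after the saved one.
        return weights, checkpoint.get("optimizer"), checkpoint.get("epoch", -1) + 1
    if mode == "pretrain":
        # Fresh optimizer and epoch counter; only the weights carry over.
        return weights, None, 0
    raise ValueError(f"unknown mode: {mode!r}")


# Toy checkpoint standing in for a real saved file (hypothetical contents).
ckpt = {"model": {"w": [0.1, 0.2]}, "optimizer": {"lr": 1e-4}, "epoch": 11}

print(restore_state(ckpt, "resume"))
print(restore_state(ckpt, "pretrain"))
```

In practice this means an interrupted run should be restarted with `--resume` so that the schedule continues, whereas `--resume_from_pretrain` is the option to reach for when a new training or fine-tuning run should start from existing weights.
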
## 📊 Evaluation

Default testing launcher:

```bash
bash submit_test.sh
```

Example direct command:

```bash
torchrun --nproc_per_node=1 --master_port=29590 test.py \
  --config configs/SSP_SAM_CLIP_L_FT_unc.py \
  --test_split testB \
  --clip_pretrained pretrained_checkpoints/CS/CS-ViT-L-14-336px.pt \
  --checkpoint output/your_save_folder/checkpoint_best_miou.pth
```

## 🌈 Acknowledgements

This repository benefits from ideas and/or codebases of the following projects:

- SimREC: https://github.com/luogen1996/SimREC
- gRefCOCO: https://github.com/henghuiding/gRefCOCO
- TransVG: https://github.com/djiajunustc/TransVG
- Segment Anything (SAM): https://github.com/facebookresearch/segment-anything

## 📚 Citation

If you find this repository useful, please cite our SSP-SAM paper.

```bibtex
@article{ssp_sam_tcsvt,
  title={SSP-SAM: SAM with Semantic-Spatial Prompt for Referring Expression Segmentation},
  author={Tang, Wei and Liu, Xuejing and Sun, Yanpeng and Li, Zechao},
  journal={IEEE Transactions on Circuits and Systems for Video Technology},
  year={2025}
}
```