Add comprehensive model card for SDGPA

#1
by nielsr HF Staff - opened
Files changed (1)
  1. README.md +104 -3
README.md CHANGED
---
license: mit
pipeline_tag: image-segmentation
library_name: diffusers
---

# SDGPA: Zero Shot Domain Adaptive Semantic Segmentation by Synthetic Data Generation and Progressive Adaptation

Official implementation of the paper [**Zero Shot Domain Adaptive Semantic Segmentation by Synthetic Data Generation and Progressive Adaptation**](https://huggingface.co/papers/2508.03300) (IROS '25).

Code: [https://github.com/roujin/SDGPA](https://github.com/roujin/SDGPA)

<div align="center">
<img src="https://github.com/roujin/SDGPA/raw/main/poster_cvpr%20001.png" alt="SDGPA Overview" width="100%"/>
</div>

## Abstract

Deep learning-based semantic segmentation models achieve impressive results yet remain limited in handling distribution shifts between training and test data. In this paper, we present SDGPA (Synthetic Data Generation and Progressive Adaptation), a novel method that tackles zero-shot domain adaptive semantic segmentation, in which no target images are available and only a text description of the target domain's style is provided. To compensate for the lack of target-domain training data, we utilize a pretrained off-the-shelf text-to-image diffusion model, which generates training images by transferring source-domain images to the target style. Directly editing whole source images introduces noise that harms segmentation, because the layout of the source image cannot be precisely maintained. To address inaccurate layouts in synthetic data, we crop the source image, edit small patches individually, and then merge them back together, which improves spatial precision. Recognizing the large domain gap, SDGPA constructs an augmented intermediate domain, leveraging easier adaptation subtasks to enable more stable adaptation to the target domain. Additionally, to mitigate the impact of noise in synthetic data, we design a progressive adaptation strategy that ensures robust learning throughout training. Extensive experiments demonstrate that our method achieves state-of-the-art performance in zero-shot semantic segmentation.
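
The crop-edit-merge idea from the abstract can be illustrated with a minimal sketch. This is not the repository's implementation: `edit_patch` is a hypothetical stand-in for the diffusion-based style edit, and the patch size is illustrative, not the paper's value.

```python
import numpy as np

def edit_patch(patch: np.ndarray) -> np.ndarray:
    """Hypothetical stand-in for the diffusion-based style edit.

    In SDGPA this would be an image-to-image edit applied to a small crop,
    where layout drift is easier to contain than on the full image.
    """
    return patch  # identity here, for illustration only

def crop_edit_merge(image: np.ndarray, patch: int = 256) -> np.ndarray:
    """Crop the image into patches, edit each individually, merge back."""
    h, w = image.shape[:2]
    out = np.empty_like(image)
    for y in range(0, h, patch):
        for x in range(0, w, patch):
            crop = image[y:y + patch, x:x + patch]
            out[y:y + patch, x:x + patch] = edit_patch(crop)
    return out

# A 512x1024 RGB image split into 256-pixel patches: 2 x 4 = 8 edits.
img = np.zeros((512, 1024, 3), dtype=np.uint8)
styled = crop_edit_merge(img)
assert styled.shape == img.shape
```

Because each patch is edited independently, an edit that distorts one crop's layout cannot corrupt the rest of the image.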

## Installation

All of our experiments were conducted on an NVIDIA RTX 3090 with CUDA 11.8. Set up the environment with:

```bash
source env.sh
```

## Running

You can find all the training scripts in the `scripts/` folder.

We use the day $\to$ snow setting as an example.

First, decide where you want to put the datasets; we denote this location as `<data_root>` (for example, `/data3/roujin`). By default, the experimental logs are also stored in `<data_root>`.

Then, organize the folder as follows:
```
<data_root>
└─ ACDC
   └─ gt
   └─ rgb_anon
└─ cityscapes
   └─ gtFine
   └─ leftImg8bit
└─ GTA5
   └─ images
   └─ labels
```
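
If the datasets are already downloaded elsewhere on disk, one way to produce this layout without copying is to symlink them into `<data_root>`. A minimal sketch; the default paths below are hypothetical placeholders you should replace with your own:

```shell
# Replace these defaults with wherever your dataset copies actually live.
DATA_ROOT=${DATA_ROOT:-$HOME/sdgpa_data}
ACDC_SRC=${ACDC_SRC:-$HOME/downloads/ACDC}
CITYSCAPES_SRC=${CITYSCAPES_SRC:-$HOME/downloads/cityscapes}
GTA5_SRC=${GTA5_SRC:-$HOME/downloads/GTA5}

mkdir -p "$DATA_ROOT"
# -sfn: create a symbolic link, replacing any existing link at the target
ln -sfn "$ACDC_SRC" "$DATA_ROOT/ACDC"
ln -sfn "$CITYSCAPES_SRC" "$DATA_ROOT/cityscapes"
ln -sfn "$GTA5_SRC" "$DATA_ROOT/GTA5"
```

The per-dataset subfolders (`gt`, `leftImg8bit`, and so on) then come from the datasets' own directory structure.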

You can refer to the official Cityscapes and ACDC websites for those datasets. For GTA5, since we only use a subset of it, we provide that subset for download at [https://huggingface.co/datasets/roujin/GTA5subset](https://huggingface.co/datasets/roujin/GTA5subset).

For synthetic data generation:
```bash
source img_gen/run.sh <data_root> snow
```

For progressive model adaptation:
```bash
source scripts/snow.sh <data_root>
```

Evaluation:
```bash
source eval.sh <data_root> <setting>
```
`<setting>` can be `day`, `fog`, `rain`, `snow`, `night`, or `game`.
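
To evaluate every setting in one go, the command above can be wrapped in a loop. A small sketch, assuming it is run from the repository root (the guard only prints a message when `eval.sh` is absent):

```shell
DATA_ROOT=${DATA_ROOT:-/data3/roujin}
for setting in day fog rain snow night game; do
  if [ -f eval.sh ]; then
    # Same invocation as above, once per setting
    source eval.sh "$DATA_ROOT" "$setting"
  else
    echo "eval.sh not found (run from the SDGPA repo root); skipping $setting"
  fi
done
```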

## Evaluation Results

We release the following results. All logs and checkpoints saved during training are available at [https://huggingface.co/roujin/SDGPA/tree/main](https://huggingface.co/roujin/SDGPA/tree/main).

| Setting | Day→Night | Clear→Snow | Clear→Rain | Clear→Fog | Real→Game |
| :--- | :--- | :--- | :--- | :--- | :--- |
| Results in paper | 26.9±0.8 | 47.4±0.7 | 48.6±0.8 | 58.8±0.7 | 43.4±0.4 |
| Our released results | 27.6 | 46.8 | 49.0 | 59.8 | 43.1 |
| Checkpoint | [link](https://huggingface.co/roujin/SDGPA/blob/main/night2/weights/weights_65.pth.tar) | [link](https://huggingface.co/roujin/SDGPA/blob/main/snow2/weights/weights_65.pth.tar) | [link](https://huggingface.co/roujin/SDGPA/blob/main/rain2/weights/weights_65.pth.tar) | [link](https://huggingface.co/roujin/SDGPA/blob/main/fog2/weights/weights_65.pth.tar) | [link](https://huggingface.co/roujin/SDGPA/blob/main/game2/weights/weights_65.pth.tar) |

We recommend reading the scripts and the paper for more details.

For hyperparameter selection of InstructPix2Pix, we recommend reading [https://huggingface.co/spaces/timbrooks/instruct-pix2pix/blob/main/README.md](https://huggingface.co/spaces/timbrooks/instruct-pix2pix/blob/main/README.md).

## Acknowledgements

This code is built upon the following repositories:

* [https://github.com/azuma164/ZoDi](https://github.com/azuma164/ZoDi)
* [https://huggingface.co/timbrooks/instruct-pix2pix](https://huggingface.co/timbrooks/instruct-pix2pix)

We thank them for their excellent work!

## Citation

```bibtex
@misc{luo2025sdgpa,
  title={Zero Shot Domain Adaptive Semantic Segmentation by Synthetic Data Generation and Progressive Adaptation},
  author={Jun Luo and Zijing Zhao and Yang Liu},
  year={2025},
  eprint={2508.03300},
  archivePrefix={arXiv},
  primaryClass={cs.CV},
  url={https://arxiv.org/abs/2508.03300},
}
```