Improve model card: Add pipeline_tag, library_name, update paper link, and clean up content
This PR enhances the model card for CoMPaSS-SD1.5, based on the paper [CoMPaSS: Enhancing Spatial Understanding in Text-to-Image Diffusion Models](https://huggingface.co/papers/2412.13195).
Specifically, it addresses the following points:
- Adds `pipeline_tag: text-to-image` to the metadata, improving model discoverability via the Hugging Face Hub filters.
- Adds `library_name: diffusers` to the metadata, enabling direct integration and usage snippets within the Hub UI.
- Updates the paper link within the model card content to the official Hugging Face paper page.
- Reformats inline links for the project page, code, and paper for better readability and prominence.
- Removes a redundant title heading (`# CoMPaSS-SD1.5`) under the model description for cleaner presentation.
- Updates the BibTeX citation to include the URL to the Hugging Face paper page.
- Updates the license link to be an absolute URL for robustness.
These changes provide clearer and more structured information for users, facilitating easier discovery and understanding of the model's capabilities and usage.
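The metadata points above can be checked mechanically. The following is a minimal sketch, not the Hub's own validation logic: it reads the top-level `key: value` pairs from a model card's YAML front matter and reports which of the two fields added by this PR are missing. The names `REQUIRED_KEYS`, `front_matter_keys`, and `missing_metadata` are hypothetical, the parser deliberately handles only flat scalar keys (it is not a full YAML parser), and both sample cards are shortened from the diff below.

```python
# Hypothetical helper names; a deliberately tiny parser, not a full YAML implementation.
REQUIRED_KEYS = ("pipeline_tag", "library_name")


def front_matter_keys(card_text: str) -> dict:
    """Return top-level scalar `key: value` pairs from a card's YAML front matter."""
    lines = card_text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}  # no front matter block at the top of the file
    pairs = {}
    for line in lines[1:]:
        if line.strip() == "---":
            break  # closing delimiter of the front matter
        # Skip list items and nested (indented) entries; keep only flat keys.
        if line.startswith((" ", "\t", "-")) or ":" not in line:
            continue
        key, _, value = line.partition(":")
        pairs[key.strip()] = value.strip()
    return pairs


def missing_metadata(card_text: str) -> list:
    """List the required keys that are absent or empty in the front matter."""
    keys = front_matter_keys(card_text)
    return [k for k in REQUIRED_KEYS if not keys.get(k)]


# Shortened front matter before this PR (assumption: widget entries omitted).
OLD_CARD = """---
tags:
- text-to-image
- diffusers
base_model: runwayml/stable-diffusion-v1-5
license: apache-2.0
---
# CoMPaSS-SD1.5
"""

# Shortened front matter after this PR.
NEW_CARD = """---
base_model: runwayml/stable-diffusion-v1-5
license: apache-2.0
pipeline_tag: text-to-image
library_name: diffusers
tags:
- text-to-image
- diffusers
---
# CoMPaSS-SD1.5
"""

if __name__ == "__main__":
    print("before:", missing_metadata(OLD_CARD))
    print("after:", missing_metadata(NEW_CARD))
```

A check like this could run in CI against `README.md` to catch a card that loses either field in a later edit.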
@@ -1,4 +1,8 @@
 ---
+base_model: runwayml/stable-diffusion-v1-5
+license: apache-2.0
+pipeline_tag: text-to-image
+library_name: diffusers
 tags:
 - text-to-image
 - diffusers
@@ -12,31 +16,26 @@ widget:
 - text: a photo of a sheep below a sink
 output:
 url: images/sheep-below-sink.jpg
-base_model: runwayml/stable-diffusion-v1-5
-license: apache-2.0
 ---
+
 # CoMPaSS-SD1.5

 <Gallery />

 ## Model description

-
-
-\[[Project Page]\]
-\[[code]\]
-\[[arXiv]\]
-
-A UNet that enhances spatial understanding capabilities of the StableDiffusion 1.5 text-to-image
-diffusion model. This model demonstrates significant improvements in generating images with specific
+CoMPaSS-SD1.5 is a UNet that enhances spatial understanding capabilities of the StableDiffusion 1.5 text-to-image
+diffusion model. This model demonstrates significant improvements in generating images with specific
 spatial relationships between objects.

+**[Project Page](https://compass.blurgy.xyz)** | **[Code](https://github.com/blurgyy/CoMPaSS)** | **[Paper](https://huggingface.co/papers/2412.13195)**
+
 ## Model Details

 - **Base Model**: StableDiffusion 1.5
 - **Training Data**: SCOP dataset (curated from COCO)
 - **Framework**: Diffusers
-- **License**: Apache-2.0 (see [./LICENSE])
+- **License**: Apache-2.0 (see [./LICENSE](https://github.com/blurgyy/CoMPaSS/blob/main/LICENSE))

 ## Intended Use

@@ -55,7 +54,7 @@ spatial relationships between objects.

 ## Using the Model

-See our [GitHub repository]
+See our [GitHub repository](https://github.com/blurgyy/CoMPaSS) to get started.

 ### Effective Prompting

@@ -103,7 +102,8 @@ If you use this model in your research, please cite:
 title={CoMPaSS: Enhancing Spatial Understanding in Text-to-Image Diffusion Models},
 author={Zhang, Gaoyang and Fu, Bingtao and Fan, Qingnan and Zhang, Qi and Liu, Runxing and Gu, Hong and Zhang, Huaqi and Liu, Xinguo},
 booktitle={ICCV},
-year={2025}
+year={2025},
+url={https://huggingface.co/papers/2412.13195}
 }
 ```

@@ -113,9 +113,4 @@ For questions about the model, please contact <blurgy@zju.edu.cn>

 ## Download model

-Weights for this model are available in Safetensors format.
-
-[./LICENSE]: <./LICENSE>
-[Project page]: <https://compass.blurgy.xyz>
-[code]: <https://github.com/blurgyy/CoMPaSS>
-[arXiv]: <https://arxiv.org/abs/2412.13195>
+Weights for this model are available in Safetensors format.