---
license: apache-2.0
tags:
- controlnet
- stable-diffusion
- satellite-imagery
- osm
- image-to-image
- diffusers
base_model: stabilityai/stable-diffusion-2-1-base
pipeline_tag: image-to-image
library_name: diffusers
---

# VectorSynth-COSA

**VectorSynth-COSA** is a ControlNet model that generates satellite imagery from OpenStreetMap (OSM) vector data embeddings. It conditions [Stable Diffusion 2.1 Base](https://huggingface.co/stabilityai/stable-diffusion-2-1-base) on rendered OSM text using the COSA (Contrastive OSM-Satellite Alignment) embedding space.

## Model Description

VectorSynth-COSA uses a two-stage pipeline:

1. **RenderEncoder**: Projects 768-dimensional COSA embeddings into 3-channel control images
2. **ControlNet**: Conditions Stable Diffusion 2.1 Base on the rendered control images

This model uses COSA embeddings for improved semantic alignment between OSM text and satellite imagery. For the standard CLIP embedding variant, see [VectorSynth](https://huggingface.co/MVRL/VectorSynth).

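To make the first stage concrete, the sketch below shows only the shape transformation it implies: a grid of 768-dimensional embeddings becomes a 3-channel image. It is a rough illustration, not the released RenderEncoder. The per-pixel embedding grid, the single linear projection, the sigmoid, the random weights, and the name `render_control_image` are all assumptions made for the example; the actual architecture and trained weights are not described in this card.

```python
import numpy as np

EMBED_DIM = 768        # COSA embedding size stated in this card
CONTROL_CHANNELS = 3   # control image channels stated in this card


def render_control_image(embeddings, weight, bias):
    """Project an (H, W, 768) embedding grid to an (H, W, 3) image in (0, 1).

    Hypothetical stand-in for the RenderEncoder: one linear layer applied
    independently at every pixel, followed by a sigmoid.
    """
    if embeddings.shape[-1] != weight.shape[0]:
        raise ValueError("embedding dimension does not match the projection weight")
    logits = embeddings @ weight + bias      # (H, W, 3)
    return 1.0 / (1.0 + np.exp(-logits))     # squash to the (0, 1) range


# Random placeholders stand in for real COSA embeddings and trained weights.
rng = np.random.default_rng(0)
height, width = 8, 8
embeddings = rng.standard_normal((height, width, EMBED_DIM)).astype(np.float32)
weight = (rng.standard_normal((EMBED_DIM, CONTROL_CHANNELS)) * 0.02).astype(np.float32)
bias = np.zeros(CONTROL_CHANNELS, dtype=np.float32)

control = render_control_image(embeddings, weight, bias)
print(control.shape)  # (8, 8, 3)
```

In the second stage, an image like `control` would be passed to the ControlNet as its conditioning input, in place of the edge maps or depth maps used by other ControlNet variants.
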
## Citation

```bibtex
@inproceedings{cher2025vectorsynth,
  title={VectorSynth: Fine-Grained Satellite Image Synthesis with Structured Semantics},
  author={Cher, Daniel and Wei, Brian and Sastry, Srikumar and Jacobs, Nathan},
  year={2025},
  eprint={arXiv:2511.07744},
  note={arXiv preprint}
}
```

## Related Models

- [VectorSynth](https://huggingface.co/MVRL/VectorSynth) - Standard CLIP embedding variant
- [GeoSynth](https://huggingface.co/MVRL/GeoSynth) - Text-to-satellite image generation