---
license: apache-2.0
tags:
- controlnet
- stable-diffusion
- satellite-imagery
- osm
- image-to-image
- diffusers
base_model: stabilityai/stable-diffusion-2-1-base
pipeline_tag: image-to-image
library_name: diffusers
---
# VectorSynth
**VectorSynth** is a ControlNet model that generates satellite imagery from OpenStreetMap (OSM) vector data embeddings. It conditions [Stable Diffusion 2.1 Base](https://huggingface.co/stabilityai/stable-diffusion-2-1-base) on rendered OSM text to synthesize realistic aerial imagery.
## Model Description
VectorSynth uses a two-stage pipeline:
1. **RenderEncoder**: Projects 768-dim CLIP text embeddings of OSM text to 3-channel control images
2. **ControlNet**: Conditions Stable Diffusion 2.1 Base on the rendered control images
This model uses standard CLIP embeddings. For the COSA embedding variant, see [VectorSynth-COSA](https://huggingface.co/MVRL/VectorSynth-COSA).
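The first stage can be pictured as a shape transformation from embeddings to a control image. The snippet below is only a minimal sketch under stated assumptions: it assumes each pixel carries one 768-dim embedding, and it uses a hypothetical per-pixel linear projection with random weights (`project_to_control` is an illustrative name). The actual `RenderEncoder` architecture is defined in `render.py` and may differ.

```python
import numpy as np

EMBED_DIM = 768        # CLIP text embedding size stated above
CONTROL_CHANNELS = 3   # control image channels stated above


def project_to_control(embedding_map, weight, bias):
    """Hypothetical stand-in for stage 1 (not the real RenderEncoder).

    embedding_map: (H, W, 768) array, one embedding per pixel (assumption)
    weight:        (768, 3) projection matrix
    bias:          (3,) offset
    Returns an (H, W, 3) array squashed into (0, 1) by a sigmoid.
    """
    logits = embedding_map @ weight + bias
    return 1.0 / (1.0 + np.exp(-logits))


# Random placeholders; real inputs would be CLIP embeddings of OSM text.
rng = np.random.default_rng(0)
embedding_map = rng.standard_normal((64, 64, EMBED_DIM)).astype(np.float32)
weight = (0.02 * rng.standard_normal((EMBED_DIM, CONTROL_CHANNELS))).astype(np.float32)
bias = np.zeros(CONTROL_CHANNELS, dtype=np.float32)

control = project_to_control(embedding_map, weight, bias)
print(control.shape)  # (64, 64, 3)
```

In the second stage, an image of this shape is what the ControlNet would receive as its conditioning input.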
## Files
- `config.json` - ControlNet configuration
- `diffusion_pytorch_model.safetensors` - ControlNet weights
- `render_encoder/clip-render_encoder.pth` - RenderEncoder weights
- `render.py` - RenderEncoder class definition
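As a rough sketch, a local copy of the repository could be checked against the list above before loading the ControlNet weights. The helper names `missing_files` and `load_controlnet` are hypothetical. The only library call used is `ControlNetModel.from_pretrained` from `diffusers`, pointed at a local directory. Loading the RenderEncoder is left out because its constructor lives in `render.py` and is not described here.

```python
from pathlib import Path

# Files listed in this model card, relative to the repository root.
EXPECTED_FILES = (
    "config.json",
    "diffusion_pytorch_model.safetensors",
    "render_encoder/clip-render_encoder.pth",
    "render.py",
)


def missing_files(local_dir):
    """Return the expected files that are absent from local_dir."""
    root = Path(local_dir)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]


def load_controlnet(local_dir):
    """Load the ControlNet from a complete local copy of the repository."""
    missing = missing_files(local_dir)
    if missing:
        raise FileNotFoundError(f"Missing files in {local_dir}: {missing}")
    # Imported lazily so the file check works without diffusers installed.
    from diffusers import ControlNetModel

    return ControlNetModel.from_pretrained(local_dir)
```

A local copy can be obtained, for example, with `huggingface_hub.snapshot_download`; the returned model can then be passed as the `controlnet` argument of a `StableDiffusionControlNetPipeline` built on the base model.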
## Citation
```bibtex
@inproceedings{cher2025vectorsynth,
title={VectorSynth: Fine-Grained Satellite Image Synthesis with Structured Semantics},
author={Cher, Daniel and Wei, Brian and Sastry, Srikumar and Jacobs, Nathan},
year={2025},
eprint={2511.07744},
archivePrefix={arXiv},
note={arXiv preprint}
}
```
## Related Models
- [VectorSynth-COSA](https://huggingface.co/MVRL/VectorSynth-COSA) - COSA embedding variant
- [GeoSynth](https://huggingface.co/MVRL/GeoSynth) - Text-to-satellite image generation