---
license: apache-2.0
tags:
- controlnet
- stable-diffusion
- satellite-imagery
- osm
- image-to-image
- diffusers
base_model: stabilityai/stable-diffusion-2-1-base
pipeline_tag: image-to-image
library_name: diffusers
---

# VectorSynth

**VectorSynth** is a ControlNet model that generates satellite imagery from OpenStreetMap (OSM) vector data embeddings. It conditions [Stable Diffusion 2.1 Base](https://huggingface.co/stabilityai/stable-diffusion-2-1-base) on rendered OSM text to synthesize realistic aerial imagery.

## Model Description

VectorSynth uses a two-stage pipeline:

1. **RenderEncoder**: projects 768-dim CLIP text embeddings of OSM text to 3-channel control images
2. **ControlNet**: conditions Stable Diffusion 2.1 on the rendered control images

This model uses standard CLIP embeddings. For the COSA embedding variant, see [VectorSynth-COSA](https://huggingface.co/MVRL/VectorSynth-COSA).

## Files

- `config.json` - ControlNet configuration
- `diffusion_pytorch_model.safetensors` - ControlNet weights
- `render_encoder/clip-render_encoder.pth` - RenderEncoder weights
- `render.py` - RenderEncoder class definition

## Citation

```bibtex
@misc{cher2025vectorsynth,
  title={VectorSynth: Fine-Grained Satellite Image Synthesis with Structured Semantics},
  author={Cher, Daniel and Wei, Brian and Sastry, Srikumar and Jacobs, Nathan},
  year={2025},
  eprint={arXiv:2511.07744},
  note={arXiv preprint}
}
```

## Related Models

- [VectorSynth-COSA](https://huggingface.co/MVRL/VectorSynth-COSA) - COSA embedding variant
- [GeoSynth](https://huggingface.co/MVRL/GeoSynth) - Text-to-satellite image generation
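## Example: RenderEncoder Sketch

To illustrate the first stage of the pipeline (projecting a 768-dim CLIP text embedding to a 3-channel control image), here is a minimal, hedged PyTorch sketch. The class name `ToyRenderEncoder` and its architecture (a linear projection reshaped to a low-resolution map, then bilinearly upsampled) are illustrative assumptions, not the released implementation; consult `render.py` and `render_encoder/clip-render_encoder.pth` in this repo for the actual RenderEncoder.

```python
import torch
import torch.nn as nn


class ToyRenderEncoder(nn.Module):
    """Hypothetical stand-in for the RenderEncoder in render.py.

    Maps a (B, 768) CLIP text embedding to a (B, 3, out_size, out_size)
    control image suitable as ControlNet conditioning.
    """

    def __init__(self, embed_dim=768, out_size=512, base=8):
        super().__init__()
        self.base = base
        # Project the embedding to a small 3-channel feature map.
        self.proj = nn.Linear(embed_dim, 3 * base * base)
        # Upsample to the control-image resolution expected by ControlNet.
        self.up = nn.Upsample(size=(out_size, out_size),
                              mode="bilinear", align_corners=False)

    def forward(self, clip_emb):
        x = self.proj(clip_emb)                    # (B, 3*base*base)
        x = x.view(-1, 3, self.base, self.base)    # (B, 3, base, base)
        return self.up(x)                          # (B, 3, 512, 512)


enc = ToyRenderEncoder()
control = enc(torch.randn(2, 768))
print(tuple(control.shape))  # (2, 3, 512, 512)
```

The resulting control image would then be passed, alongside a text prompt, to a `StableDiffusionControlNetPipeline` built from `stabilityai/stable-diffusion-2-1-base` and this repo's ControlNet weights.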