---
license: apache-2.0
tags:
- controlnet
- stable-diffusion
- satellite-imagery
- osm
- image-to-image
- diffusers
base_model: stabilityai/stable-diffusion-2-1-base
pipeline_tag: image-to-image
library_name: diffusers
---
# VectorSynth

**VectorSynth** is a ControlNet model that generates satellite imagery from OpenStreetMap (OSM) vector data embeddings. It conditions [Stable Diffusion 2.1 Base](https://huggingface.co/stabilityai/stable-diffusion-2-1-base) on control images rendered from OSM text embeddings to synthesize realistic aerial imagery.

## Model Description

VectorSynth uses a two-stage pipeline:

1. **RenderEncoder**: Projects 768-dim CLIP text embeddings of OSM text to 3-channel control images
2. **ControlNet**: Conditions Stable Diffusion 2.1 on the rendered control images

This model uses standard CLIP embeddings. For the COSA embedding variant, see [VectorSynth-COSA](https://huggingface.co/MVRL/VectorSynth-COSA).

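The first stage can be illustrated with a minimal PyTorch sketch. The actual architecture ships in `render.py`; the class name, the single linear projection, and the 64×64 output resolution below are illustrative assumptions, not the released implementation:

```python
import torch
import torch.nn as nn


class RenderEncoderSketch(nn.Module):
    """Illustrative stand-in for the RenderEncoder: maps a 768-dim
    CLIP text embedding to a 3-channel control image."""

    def __init__(self, embed_dim=768, channels=3, size=64):
        super().__init__()
        self.channels = channels
        self.size = size
        # A single linear projection for illustration; the real model
        # in render.py may use a deeper decoder.
        self.proj = nn.Linear(embed_dim, channels * size * size)

    def forward(self, clip_embedding):
        # clip_embedding: (batch, 768) -> control image (batch, 3, size, size)
        x = self.proj(clip_embedding)
        # Squash to [0, 1] so the output is a valid image tensor.
        return torch.sigmoid(x).view(-1, self.channels, self.size, self.size)


emb = torch.randn(1, 768)
control = RenderEncoderSketch()(emb)
print(control.shape)  # torch.Size([1, 3, 64, 64])
```

The resulting control image is what the second stage (the ControlNet) consumes as conditioning.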
## Files

- `config.json` - ControlNet configuration
- `diffusion_pytorch_model.safetensors` - ControlNet weights
- `render_encoder/clip-render_encoder.pth` - RenderEncoder weights
- `render.py` - RenderEncoder class definition

## Citation

```bibtex
@misc{cher2025vectorsynth,
  title={VectorSynth: Fine-Grained Satellite Image Synthesis with Structured Semantics},
  author={Cher, Daniel and Wei, Brian and Sastry, Srikumar and Jacobs, Nathan},
  year={2025},
  eprint={2511.07744},
  archivePrefix={arXiv},
  note={arXiv preprint}
}
```

## Related Models

- [VectorSynth-COSA](https://huggingface.co/MVRL/VectorSynth-COSA) - COSA embedding variant
- [GeoSynth](https://huggingface.co/MVRL/GeoSynth) - Text-to-satellite image generation