	---
	license: apache-2.0
	tags:
	- controlnet
	- stable-diffusion
	- satellite-imagery
	- osm
	- image-to-image
	- diffusers
	base_model: stabilityai/stable-diffusion-2-1-base
	pipeline_tag: image-to-image
	library_name: diffusers
	---

	# VectorSynth-COSA

	VectorSynth-COSA is a ControlNet model that generates satellite imagery from embeddings of OpenStreetMap (OSM) vector data. It conditions [Stable Diffusion 2.1 Base](https://huggingface.co/stabilityai/stable-diffusion-2-1-base) on control images rendered from OSM text embedded in the COSA (Contrastive OSM-Satellite Alignment) space.
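A minimal loading sketch with `diffusers`, assuming the ControlNet weights in this repository load directly with `ControlNetModel.from_pretrained` (the repository layout is not described in this card, so a `subfolder` argument may be needed). The helper names `load_pipeline` and `generate` are hypothetical, and the prompt is left to the caller because the card does not state what text prompt, if any, the model expects.

```python
CONTROLNET_ID = "MVRL/VectorSynth-COSA"                  # this repository
BASE_MODEL_ID = "stabilityai/stable-diffusion-2-1-base"  # base_model from the card metadata


def load_pipeline(device="cuda"):
    """Build a ControlNet pipeline on top of Stable Diffusion 2.1 Base.

    Assumption: the ControlNet weights sit at the root of CONTROLNET_ID.
    """
    # Imported lazily so this module can be imported without torch/diffusers.
    import torch
    from diffusers import ControlNetModel, StableDiffusionControlNetPipeline

    # Half precision on GPU, full precision elsewhere.
    dtype = torch.float16 if device.startswith("cuda") else torch.float32
    controlnet = ControlNetModel.from_pretrained(CONTROLNET_ID, torch_dtype=dtype)
    pipe = StableDiffusionControlNetPipeline.from_pretrained(
        BASE_MODEL_ID, controlnet=controlnet, torch_dtype=dtype
    )
    return pipe.to(device)


def generate(pipe, control_image, prompt, steps=30):
    """Run one generation.

    control_image is the 3-channel image produced by the RenderEncoder stage
    (for example a PIL image); steps=30 is an arbitrary illustrative default.
    """
    result = pipe(prompt, image=control_image, num_inference_steps=steps)
    return result.images[0]
```

With a control image in hand, usage would look like `image = generate(load_pipeline(), control_image, prompt)`.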

	## Model Description

	VectorSynth-COSA uses a two-stage pipeline:
	1. RenderEncoder: projects 768-dimensional COSA embeddings to 3-channel control images.
	2. ControlNet: conditions Stable Diffusion 2.1 Base on the rendered control images.

	This model uses COSA embeddings in place of standard CLIP embeddings to improve semantic alignment between OSM text and satellite imagery. For the CLIP embedding variant, see [VectorSynth](https://huggingface.co/MVRL/VectorSynth).
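As a rough picture of the first stage, the sketch below projects a grid of 768-dimensional embeddings down to a 3-channel image with NumPy. Only the input and output sizes (768 and 3) come from this card. The per-pixel embedding grid, the single linear layer, the sigmoid, the 64x64 size, and the random weights are all assumptions made for illustration; the actual RenderEncoder architecture and its trained weights are not described here.

```python
import numpy as np

EMBED_DIM = 768        # COSA embedding size stated in the card
CONTROL_CHANNELS = 3   # control image channels stated in the card


def render_control_image(embedding_map, weight, bias):
    """Project an (H, W, 768) embedding map to an (H, W, 3) uint8 image.

    Hypothetical stand-in for RenderEncoder: one linear layer applied at
    every pixel, followed by a sigmoid. The real model may differ.
    """
    logits = embedding_map @ weight + bias      # shape (H, W, 3)
    scaled = 1.0 / (1.0 + np.exp(-logits))      # squash to (0, 1)
    return (scaled * 255.0).round().astype(np.uint8)


# Random placeholders standing in for real COSA embeddings and trained weights.
rng = np.random.default_rng(0)
embedding_map = rng.standard_normal((64, 64, EMBED_DIM)).astype(np.float32)
weight = 0.02 * rng.standard_normal((EMBED_DIM, CONTROL_CHANNELS)).astype(np.float32)
bias = np.zeros(CONTROL_CHANNELS, dtype=np.float32)

control = render_control_image(embedding_map, weight, bias)
print(control.shape, control.dtype)  # (64, 64, 3) uint8
```

The resulting array has the shape a ControlNet conditioning image needs; in the second stage it would be passed to the pipeline as the control image.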

	## Citation

	```bibtex
	@inproceedings{cher2025vectorsynth,
	  title={VectorSynth: Fine-Grained Satellite Image Synthesis with Structured Semantics},
	  author={Cher, Daniel and Wei, Brian and Sastry, Srikumar and Jacobs, Nathan},
	  year={2025},
	  eprint={2511.07744},
	  archivePrefix={arXiv},
	  note={arXiv preprint}
	}
	```

	## Related Models

	- [VectorSynth](https://huggingface.co/MVRL/VectorSynth) - Standard CLIP embedding variant
	- [GeoSynth](https://huggingface.co/MVRL/GeoSynth) - Text-to-satellite image generation