MVRL/VectorSynth-COSA

dcher95 committed on Feb 13

Commit a8b5f22 (verified) · 1 parent: b844257

Update README.md

Files changed (1)

README.md +0 -49

@@ -24,55 +24,6 @@ VectorSynth-COSA uses a two-stage pipeline:
 This model uses COSA embeddings for improved semantic alignment between OSM text and satellite imagery. For the standard CLIP embedding variant, see [VectorSynth](https://huggingface.co/MVRL/VectorSynth).
-## Usage
-```python
-import torch
-from diffusers import StableDiffusionControlNetPipeline, ControlNetModel, DDIMScheduler
-from huggingface_hub import hf_hub_download
-device = "cuda"
-# Load ControlNet
-controlnet = ControlNetModel.from_pretrained("MVRL/VectorSynth-COSA", torch_dtype=torch.float16)
-# Load pipeline
-pipe = StableDiffusionControlNetPipeline.from_pretrained(
-    "stabilityai/stable-diffusion-2-1-base",
-    controlnet=controlnet,
-    torch_dtype=torch.float16
-)
-pipe.scheduler = DDIMScheduler.from_config(pipe.scheduler.config)
-pipe = pipe.to(device)
-# Load RenderEncoder
-render_path = hf_hub_download("MVRL/VectorSynth-COSA", "render_encoder/cosa-render_encoder.pth")
-checkpoint = torch.load(render_path, map_location=device, weights_only=False)
-render_encoder = checkpoint['model'].to(device).eval()
-# Your hint tensor should be (H, W, 768) - per-pixel OSMClip embeddings
-# hint = torch.load("your_hint.pt").to(device)
-# hint = hint.unsqueeze(0).permute(0, 3, 1, 2)  # (1, 768, H, W)
-# with torch.no_grad():
-#     control_image = render_encoder(hint).sigmoid()
-# Generate
-# output = pipe(
-#     prompt="Satellite image of a city neighborhood",
-#     image=control_image,
-#     num_inference_steps=40,
-#     guidance_scale=7.5
-# ).images[0]
-```
-## Files
-- `config.json` - ControlNet configuration
-- `diffusion_pytorch_model.safetensors` - ControlNet weights
-- `render_encoder/cosa-render_encoder.pth` - RenderEncoder weights
-- `render.py` - RenderEncoder class definition
 ## Citation
 ```bibtex
