Update README.md
README.md CHANGED
@@ -24,48 +24,6 @@ VectorSynth uses a two-stage pipeline:
 
 This model uses standard CLIP embeddings. For the COSA embedding variant, see [VectorSynth-COSA](https://huggingface.co/MVRL/VectorSynth-COSA).
 
-## Usage
-
-```python
-import torch
-from diffusers import StableDiffusionControlNetPipeline, ControlNetModel, DDIMScheduler
-from huggingface_hub import hf_hub_download
-
-device = "cuda"
-
-# Load ControlNet
-controlnet = ControlNetModel.from_pretrained("MVRL/VectorSynth", torch_dtype=torch.float16)
-
-# Load pipeline
-pipe = StableDiffusionControlNetPipeline.from_pretrained(
-    "stabilityai/stable-diffusion-2-1-base",
-    controlnet=controlnet,
-    torch_dtype=torch.float16
-)
-pipe.scheduler = DDIMScheduler.from_config(pipe.scheduler.config)
-pipe = pipe.to(device)
-
-# Load RenderEncoder
-render_path = hf_hub_download("MVRL/VectorSynth", "render_encoder/clip-render_encoder.pth")
-checkpoint = torch.load(render_path, map_location=device, weights_only=False)
-render_encoder = checkpoint['model'].to(device).eval()
-
-# Your hint tensor should be (H, W, 768) - per-pixel CLIP embeddings of OSM text
-# hint = torch.load("your_hint.pt").to(device)
-# hint = hint.unsqueeze(0).permute(0, 3, 1, 2)  # (1, 768, H, W)
-
-# with torch.no_grad():
-#     control_image = render_encoder(hint).sigmoid()
-
-# Generate
-# output = pipe(
-#     prompt="Satellite image of a city neighborhood",
-#     image=control_image,
-#     num_inference_steps=40,
-#     guidance_scale=7.5
-# ).images[0]
-```
-
 ## Files
 
 - `config.json` - ControlNet configuration
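
The removed snippet expects a precomputed hint tensor of per-pixel CLIP text embeddings but does not show how to build one. Below is a minimal sketch of one way such an (H, W, 768) hint could be assembled by rasterizing text embeddings over a class-id map; the CLIP checkpoint, the `class_texts` tag list, and the `class_map` raster are all illustrative assumptions, not the repository's actual preprocessing.

```python
# Hypothetical sketch: build an (H, W, 768) hint from a per-pixel OSM class map.
# `class_texts` and `class_map` are stand-ins; VectorSynth's real pipeline may differ.
import torch
from transformers import CLIPTokenizer, CLIPTextModelWithProjection

device = "cuda"

# CLIP ViT-L/14 has a 768-dim text projection, matching the expected hint depth
tokenizer = CLIPTokenizer.from_pretrained("openai/clip-vit-large-patch14")
text_encoder = CLIPTextModelWithProjection.from_pretrained(
    "openai/clip-vit-large-patch14"
).to(device).eval()

class_texts = ["building", "road", "water", "grass"]  # hypothetical OSM tags
class_map = torch.zeros(512, 512, dtype=torch.long)   # hypothetical (H, W) id raster

with torch.no_grad():
    tokens = tokenizer(class_texts, padding=True, return_tensors="pt").to(device)
    text_embeds = text_encoder(**tokens).text_embeds  # (num_classes, 768)

# Look up one embedding per pixel: (H, W) ids -> (H, W, 768) hint
hint = text_embeds[class_map.to(device)]
```

From a tensor like this, the commented lines in the removed README snippet take over: permute to (1, 768, H, W), pass it through `render_encoder`, and feed the result to the pipeline as the control image.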