Commit a8b5f22 (verified), committed by dcher95
1 Parent(s): b844257

Update README.md

  1. README.md +0 -49
README.md CHANGED
@@ -24,55 +24,6 @@ VectorSynth-COSA uses a two-stage pipeline:
 
  This model uses COSA embeddings for improved semantic alignment between OSM text and satellite imagery. For the standard CLIP embedding variant, see [VectorSynth](https://huggingface.co/MVRL/VectorSynth).
 
- ## Usage
-
- ```python
- import torch
- from diffusers import StableDiffusionControlNetPipeline, ControlNetModel, DDIMScheduler
- from huggingface_hub import hf_hub_download
-
- device = "cuda"
-
- # Load ControlNet
- controlnet = ControlNetModel.from_pretrained("MVRL/VectorSynth-COSA", torch_dtype=torch.float16)
-
- # Load pipeline
- pipe = StableDiffusionControlNetPipeline.from_pretrained(
-     "stabilityai/stable-diffusion-2-1-base",
-     controlnet=controlnet,
-     torch_dtype=torch.float16
- )
- pipe.scheduler = DDIMScheduler.from_config(pipe.scheduler.config)
- pipe = pipe.to(device)
-
- # Load RenderEncoder
- render_path = hf_hub_download("MVRL/VectorSynth-COSA", "render_encoder/cosa-render_encoder.pth")
- checkpoint = torch.load(render_path, map_location=device, weights_only=False)
- render_encoder = checkpoint['model'].to(device).eval()
-
- # Your hint tensor should be (H, W, 768) - per-pixel OSMClip embeddings
- # hint = torch.load("your_hint.pt").to(device)
- # hint = hint.unsqueeze(0).permute(0, 3, 1, 2)  # (1, 768, H, W)
-
- # with torch.no_grad():
- #     control_image = render_encoder(hint).sigmoid()
-
- # Generate
- # output = pipe(
- #     prompt="Satellite image of a city neighborhood",
- #     image=control_image,
- #     num_inference_steps=40,
- #     guidance_scale=7.5
- # ).images[0]
- ```
-
- ## Files
-
- - `config.json` - ControlNet configuration
- - `diffusion_pytorch_model.safetensors` - ControlNet weights
- - `render_encoder/cosa-render_encoder.pth` - RenderEncoder weights
- - `render.py` - RenderEncoder class definition
-
  ## Citation
 
  ```bibtex
 