merve HF Staff committed on
Commit 7d99dfd · verified · 1 Parent(s): e9a192d

Add Gemma Diffusion website builder with google/diffusiongemma-26B-A4B-it

.gitattributes CHANGED
@@ -33,3 +33,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
  *.zip filter=lfs diff=lfs merge=lfs -text
  *.zst filter=lfs diff=lfs merge=lfs -text
  *tfevents* filter=lfs diff=lfs merge=lfs -text
+ transformers-5.8.0.dev0+2db78a1296-py3-none-any.whl filter=lfs diff=lfs merge=lfs -text
README.md CHANGED
@@ -1,14 +1,46 @@
  ---
- title: Diffusiongemma Codegen
- emoji: 🦀
- colorFrom: yellow
- colorTo: gray
+ title: Gemma Diffusion Website Builder
+ emoji: 🌐
+ colorFrom: indigo
+ colorTo: purple
  sdk: gradio
  sdk_version: 6.17.3
- python_version: '3.13'
+ python_version: "3.12"
  app_file: app.py
  pinned: false
- license: apache-2.0
+ short_description: Watch a diffusion LLM write a website live, then tweak it
  ---

- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
+ # Gemma Diffusion · Live Website Builder
+
+ A live, side-by-side visualization of **block-diffusion** code generation. Describe a
+ website and the model writes a single self-contained HTML document by denoising a canvas
+ of random tokens — every token position updates *at once* each step.
+
+ The left pane shows the raw HTML *canvas* diffusing token-by-token (the signature look of
+ text diffusion); the right pane renders the page live. Type a follow-up to **tweak in
+ place**: the previous page seeds the diffusion's starting canvas (via the model's native
+ `canvas_ids` API), so the model edits the existing page instead of regenerating from
+ scratch. Changed lines are highlighted, and the preview keeps your scroll position across
+ re-renders.
+
+ ## Stack
+
+ - **Model**: `google/diffusiongemma-26B-A4B-it` (`DiffusionGemmaForBlockDiffusion`).
+ - **Backend**: [`gradio.Server`](https://huggingface.co/blog/introducing-gradio-server)
+   — a FastAPI subclass that provides Gradio's queue + SSE streaming under a custom,
+   hand-written HTML/CSS/JS frontend (`index.html`). The single streaming endpoint
+   `/generate` yields one JSON frame per denoising step.
+ - **Hardware**: ZeroGPU (`xlarge`) — the 26B checkpoint needs the full backing card.
+
+ A custom `transformers` wheel providing the DiffusionGemma architecture is bundled in
+ this repo and installed at runtime by `app.py` (Spaces installs `requirements.txt` before
+ the repo files are copied in, so a local-path wheel can't be referenced there).
+
+ ## Configuration
+
+ - `HF_TOKEN` (secret) — read access to the private model repo.
+ - `GRADIO_SSR_MODE=false` (variable) — required so the custom `/` route serves
+   `index.html` instead of Gradio's SSR shell.
+ - `GDIFF_MODEL_PATH` (optional) — override the model repo id.
+ - `GDIFF_GPU_SIZE` (optional) — ZeroGPU slice, defaults to `xlarge`.
app.py ADDED
@@ -0,0 +1,365 @@
+ """
+ Gemma Diffusion — live website builder (gradio.Server backend + custom frontend).
+
+ ZeroGPU port. `gradio.Server` (a FastAPI subclass) gives us Gradio's queue + SSE
+ streaming while we serve our own hand-written HTML/CSS/JS frontend. The single
+ streaming endpoint `/generate` runs the block-diffusion model and yields JSON frames
+ (one per denoising step) that the frontend renders side-by-side: the raw HTML canvas
+ diffusing on the left, the live rendered page on the right.
+
+ ZeroGPU specifics:
+   - `import spaces` happens before `torch`.
+   - The model is loaded once at module scope with `.to("cuda")` (ZeroGPU registers it).
+   - The actual `model.generate` call lives inside the `@spaces.GPU` function `_gpu_stream`;
+     the `gradio.Server` endpoint only marshals picklable CPU tensors in/out of it.
+
+ Refs:
+   - https://huggingface.co/blog/introducing-gradio-server
+   - https://huggingface.co/docs/hub/spaces-zerogpu
+ """
+
+ import glob
+ import os
+ import subprocess
+ import sys
+
+ # Set before torch is imported (transformers pulls torch in).
+ os.environ.setdefault("PYTORCH_CUDA_ALLOC_CONF", "expandable_segments:True")
+
+ import spaces  # must precede torch so ZeroGPU can patch it
+
+
+ def _ensure_transformers():
+     """Install the bundled custom DiffusionGemma `transformers` wheel at runtime.
+
+     Spaces installs `requirements.txt` *before* copying the repo files into the image,
+     so the wheel can't be referenced by local path there. By the time this app runs the
+     file is present in the working directory, so we install it here (only if a stock /
+     no transformers is importable) before importing torch/transformers below.
+     """
+     try:
+         import transformers  # noqa: F401
+
+         if hasattr(transformers, "DiffusionGemmaForBlockDiffusion") or hasattr(
+             getattr(transformers, "models", object), "diffusion_gemma"
+         ):
+             return
+     except Exception:
+         pass
+     wheels = sorted(glob.glob(os.path.join(os.path.dirname(os.path.abspath(__file__)), "transformers-*.whl")))
+     if not wheels:
+         return
+     print(f"[gdiff] Installing bundled transformers wheel: {os.path.basename(wheels[0])}", flush=True)
+     subprocess.check_call([sys.executable, "-m", "pip", "install", "--no-cache-dir", wheels[0]])
+     import importlib
+
+     importlib.invalidate_caches()
+
+
+ _ensure_transformers()
+
+ import json
+ import queue as queue_lib
+ import re
+ import threading
+ import time as _time
+
+ import torch
+ from fastapi.responses import HTMLResponse
+ from gradio import Server
+ from transformers import AutoTokenizer, DiffusionGemmaForBlockDiffusion
+ from transformers.generation.streamers import BaseStreamer
+
+ HERE = os.path.dirname(os.path.abspath(__file__))
+ MODEL_PATH = os.environ.get("GDIFF_MODEL_PATH", "google/diffusiongemma-26B-A4B-it")
+ HF_TOKEN = os.environ.get("HF_TOKEN")
+ MAX_ITERS_CAP = 120  # hard cap on denoising steps per block
+ # ZeroGPU: the 26B checkpoint (~49 GB bf16) needs the full backing card.
+ GPU_SIZE = os.environ.get("GDIFF_GPU_SIZE", "xlarge")
+
+ SYSTEM_PROMPT = (
+     "You are an expert front-end web developer with great visual taste. When asked to "
+     "build or change a web page, respond with a SINGLE, complete, self-contained HTML5 "
+     "document. Put all CSS in a <style> tag and any JavaScript in a <script> tag inside "
+     "the document. Do not load external assets. When asked to modify an existing page, "
+     "return the FULL updated HTML document with the change applied. Do not include "
+     "explanations or markdown code fences — output only raw HTML, starting with "
+     "<!DOCTYPE html>."
+ )
+
+ _MARKER_RE = re.compile(
+     r"<\|?(?:channel|turn|think|image|audio|video|tool(?:_call|_response)?)\|?>"
+ )
+ _FENCE_RE = re.compile(r"```(?:html)?\s*(.*?)\s*```", re.DOTALL)
+
+
+ # --------------------------------------------------------------------------- #
+ # Model (loaded once at module scope; ZeroGPU registers .to("cuda") tensors)
+ # --------------------------------------------------------------------------- #
+ DEVICE = "cuda" if torch.cuda.is_available() else "cpu"
+ print(f"[gdiff] Loading model from {MODEL_PATH} on {DEVICE} ...", flush=True)
+ tokenizer = AutoTokenizer.from_pretrained(MODEL_PATH, token=HF_TOKEN)
+ model = DiffusionGemmaForBlockDiffusion.from_pretrained(
+     MODEL_PATH,
+     dtype=torch.bfloat16,
+     low_cpu_mem_usage=True,
+     token=HF_TOKEN,
+ ).to(DEVICE)
+ model.eval()
+ CANVAS_LEN = model.config.canvas_length
+ PAD_ID = tokenizer.pad_token_id or 0
+ print(f"[gdiff] Model ready | canvas_length={CANVAS_LEN}", flush=True)
+
+ # Cache of the last *cleaned* page so a follow-up tweak can warm-start in place.
+ model._last_clean_html = None
+
+
+ # --------------------------------------------------------------------------- #
+ # Helpers (CPU-only; safe to run in the gradio.Server main process)
+ # --------------------------------------------------------------------------- #
+ def warm_canvas_from_cache():
+     """Starting canvas (first block) built from the previous *cleaned* page.
+
+     Returns a CPU tensor (it is pickled across the ZeroGPU process boundary and moved
+     to CUDA inside the GPU worker). We re-tokenize the cleaned HTML rather than reuse
+     raw output tokens so a mangled header can't compound across tweaks.
+     """
+     html = getattr(model, "_last_clean_html", None)
+     if not html:
+         return None
+     ids = tokenizer(html, add_special_tokens=False).input_ids[:CANVAS_LEN]
+     if not ids:
+         return None
+     if len(ids) < CANVAS_LEN:
+         ids = ids + [PAD_ID] * (CANVAS_LEN - len(ids))
+     return torch.tensor(ids, dtype=torch.long).unsqueeze(0)
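Reviewer note, not part of the commit: the truncate-then-pad step in `warm_canvas_from_cache` can be looked at in isolation. The sketch below is a minimal pure-Python version that leaves out the tokenizer and `torch`; the name `fit_to_canvas` and the example values are hypothetical.

```python
from typing import Optional


def fit_to_canvas(ids: list[int], canvas_len: int, pad_id: int) -> Optional[list[int]]:
    """Truncate or right-pad `ids` so the result has exactly `canvas_len` entries.

    Returns None for empty input, mirroring the "no warm start" case in the diff.
    """
    ids = ids[:canvas_len]  # keep only what fits in the first block
    if not ids:
        return None
    return ids + [pad_id] * (canvas_len - len(ids))  # no-op when already full


if __name__ == "__main__":
    print(fit_to_canvas([5, 6, 7], canvas_len=5, pad_id=0))
    print(fit_to_canvas(list(range(10)), canvas_len=4, pad_id=0))
```

One consequence visible here: a page longer than one canvas only seeds the first block, so later blocks of a tweak presumably start from noise as usual.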
+
+
+ def last_assistant_html(history_json: str):
+     try:
+         history = json.loads(history_json) if history_json else []
+     except json.JSONDecodeError:
+         return None
+     for turn in reversed(history):
+         if turn.get("role") == "assistant" and turn.get("content"):
+             return turn["content"]
+     return None
+
+
+ def clean_text(text: str) -> str:
+     return _MARKER_RE.sub("", text).lstrip()
+
+
+ def extract_html(text: str) -> str:
+     """Pull a usable HTML document out of the (possibly mangled) model output.
+
+     Anchor on the first intact structural tag and rebuild whatever the diffused tweak ate
+     off the front, so the result is always a valid document (never quirks mode and never a
+     broken ``DOCTYPE>`` / ``html lang=`` header).
+     """
+     text = clean_text(text)
+     fenced = _FENCE_RE.search(text)
+     if fenced:
+         text = fenced.group(1)
+     lower = text.lower()
+     dt = lower.find("<!doctype")
+     if dt != -1:
+         return text[dt:].strip()
+     h = lower.find("<html")
+     if h != -1:
+         return "<!DOCTYPE html>\n" + text[h:].strip()
+     hd = lower.find("<head")
+     if hd != -1:
+         return '<!DOCTYPE html>\n<html lang="en">\n' + text[hd:].strip()
+     bd = lower.find("<body")
+     if bd != -1:
+         return (
+             '<!DOCTYPE html>\n<html lang="en">\n<head><meta charset="UTF-8">'
+             '<meta name="viewport" content="width=device-width, initial-scale=1.0"></head>\n'
+             + text[bd:].strip()
+         )
+     return text.strip()
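Reviewer note, not part of the commit: the anchoring idea in `extract_html` (find the first intact structural tag, then rebuild the header that was lost in front of it) can be written as a small table-driven sketch. This is an illustration under assumptions, not the committed function: `repair_header` and `_REPAIRS` are hypothetical names, and the marker stripping, code-fence handling, and `<body>` fallback from the diff are omitted.

```python
# Anchors are tried in order; the first one found decides which prefix to rebuild.
_REPAIRS = [
    ("<!doctype", ""),                                   # header intact: keep as-is
    ("<html", "<!DOCTYPE html>\n"),                      # doctype was eaten
    ("<head", '<!DOCTYPE html>\n<html lang="en">\n'),    # doctype and <html> were eaten
]


def repair_header(text: str) -> str:
    """Drop debris before the first intact anchor and prepend the missing header."""
    lower = text.lower()
    for anchor, prefix in _REPAIRS:
        pos = lower.find(anchor)
        if pos != -1:
            return prefix + text[pos:].strip()
    return text.strip()  # nothing recognizable: return the text unchanged


if __name__ == "__main__":
    mangled = 'DOCTYPE>\n<html lang="en"><body>hi</body></html>'
    print(repair_header(mangled))
```

The ordering matters: checking `<!doctype` first means an intact document is never given a second header.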
+
+
+ class QueueDiffusionStreamer(BaseStreamer):
+     def __init__(self, tok, q: "queue_lib.Queue"):
+         self.tok = tok
+         self.q = q
+         self.confirmed_ids: list[int] = []
+         self.prompt_skipped = False
+         self.block = 0
+         self.step = 0
+
+     def _decode(self, ids):
+         return self.tok.decode(ids, skip_special_tokens=True)
+
+     def put(self, value):
+         ids = value[0].tolist() if value.dim() > 1 else value.tolist()
+         if not self.prompt_skipped:
+             self.prompt_skipped = True
+             return
+         self.confirmed_ids.extend(ids)
+         self.block += 1
+         self.step = 0
+         self.q.put(("commit", self._decode(self.confirmed_ids), self.block, self.step))
+
+     def put_draft(self, value):
+         self.step += 1
+         ids = value[0].tolist() if value.dim() > 1 else value.tolist()
+         self.q.put(("draft", self._decode(self.confirmed_ids + ids), self.block + 1, self.step))
+
+     def end(self):
+         self.q.put(("end", self._decode(self.confirmed_ids), self.block, self.step))
+
+
+ def build_messages(history_json: str, prompt: str):
+     try:
+         history = json.loads(history_json) if history_json else []
+     except json.JSONDecodeError:
+         history = []
+     messages = [{"role": "system", "content": SYSTEM_PROMPT}]
+     for turn in history:
+         role = turn.get("role")
+         content = turn.get("content", "")
+         if role in ("user", "assistant") and content:
+             messages.append({"role": role, "content": content})
+     messages.append({"role": "user", "content": prompt})
+     return messages
+
+
+ # --------------------------------------------------------------------------- #
+ # GPU work — runs in a forked ZeroGPU worker process.
+ # Inputs/outputs cross the boundary via pickle, so only CPU tensors / plain
+ # Python objects go in and out (no CUDA tensors are returned).
+ # --------------------------------------------------------------------------- #
+ def _estimate_duration(input_ids, max_new_tokens=2048, max_iters=64, full_denoise=False, canvas_ids=None):
+     blocks = max(1, int(max_new_tokens) // max(1, CANVAS_LEN))
+     secs = 30 + blocks * int(max_iters) * 0.3
+     return int(min(120, secs))  # xlarge internally doubles this for the quota check
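Reviewer note, not part of the commit: a worked example of the duration estimate may help when tuning the sliders. The formula is taken from `_estimate_duration`; the canvas length of 256 used below is an assumption for illustration only, since the real value comes from `model.config.canvas_length` and is not shown in this diff.

```python
def estimate_duration(max_new_tokens: int, max_iters: int, canvas_len: int) -> int:
    """Same arithmetic as the diff: 30 s base + 0.3 s per denoising step, capped at 120 s."""
    blocks = max(1, max_new_tokens // max(1, canvas_len))  # floor division, at least one block
    secs = 30 + blocks * max_iters * 0.3
    return int(min(120, secs))


if __name__ == "__main__":
    ASSUMED_CANVAS_LEN = 256  # hypothetical; not confirmed by the commit
    # Default UI settings: 8 blocks * 64 steps * 0.3 s = 153.6 s, plus 30 s, hits the cap.
    print(estimate_duration(2048, 64, ASSUMED_CANVAS_LEN))
    # Minimum iteration setting: 8 blocks * 8 steps * 0.3 s = 19.2 s, plus 30 s.
    print(estimate_duration(2048, 8, ASSUMED_CANVAS_LEN))
```

Under that assumed canvas length the default settings already saturate the 120-second cap, so the estimate only varies at low iteration counts. Because the block count uses floor division, a token budget that is not a multiple of the canvas length is rounded down.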
+
+
+ @spaces.GPU(duration=_estimate_duration, size=GPU_SIZE)
+ def _gpu_stream(input_ids, max_new_tokens, max_iters, full_denoise, canvas_ids):
+     input_ids = input_ids.to(model.device)
+     gen_kwargs = dict(max_new_tokens=int(max_new_tokens), max_denoising_steps=int(max_iters))
+     if full_denoise:
+         gen_kwargs["confidence_threshold"] = 1e-9
+         gen_kwargs["stability_threshold"] = int(max_iters)
+     if canvas_ids is not None:
+         gen_kwargs["canvas_ids"] = canvas_ids.to(model.device)
+
+     q: "queue_lib.Queue" = queue_lib.Queue()
+     streamer = QueueDiffusionStreamer(tokenizer, q)
+     err = {}
+
+     def worker():
+         try:
+             with torch.inference_mode():
+                 model.generate(input_ids, streamer=streamer, **gen_kwargs)
+         except Exception as exc:  # surface to the endpoint
+             err["msg"] = f"{type(exc).__name__}: {exc}"
+             q.put(("error", str(exc), 0, 0))
+         finally:
+             q.put(("end", "", 0, 0))  # always unblock the consumer
+
+     thread = threading.Thread(target=worker)
+     thread.start()
+     try:
+         while True:
+             kind, text, block, step = q.get()
+             if kind == "error":
+                 yield ("error", err.get("msg", text), 0, 0)
+                 return
+             if kind == "end":
+                 return
+             yield (kind, text, block, step)
+     finally:
+         thread.join()
+
+
+ # --------------------------------------------------------------------------- #
+ # Server
+ # --------------------------------------------------------------------------- #
+ app = Server(title="Gemma Diffusion Website Builder")
+
+
+ @app.api(name="generate", concurrency_limit=1, time_limit=600, stream_every=0.05)
+ def generate(
+     prompt: str,
+     history_json: str = "[]",
+     max_new_tokens: int = 2048,
+     max_iters: int = 64,
+     full_denoise: bool = False,
+     anim_delay: float = 0.0,
+     warm_start: bool = True,
+ ) -> str:
+     """Stream the diffusion generation as JSON frames (one per denoising step).
+
+     The model writes a self-contained HTML document; the frontend renders it live.
+     """
+     prompt = (prompt or "").strip()
+     if not prompt:
+         yield json.dumps({"kind": "error", "message": "Empty prompt."})
+         return
+
+     messages = build_messages(history_json, prompt)
+     max_iters = max(1, min(int(max_iters), MAX_ITERS_CAP))
+
+     # Tweak warm-start: seed the diffusion's first canvas with the previous page's own
+     # tokens (native `canvas_ids` API) so the model edits the existing page in place.
+     is_tweak = bool(last_assistant_html(history_json))
+     canvas_ids = warm_canvas_from_cache() if (warm_start and is_tweak) else None
+     warming = canvas_ids is not None
+
+     input_ids = tokenizer.apply_chat_template(
+         messages,
+         tokenize=True,
+         add_generation_prompt=True,
+         return_tensors="pt",
+         return_dict=True,
+     )["input_ids"]
+
+     last_text = ""
+     for kind, text, block, step in _gpu_stream(
+         input_ids, int(max_new_tokens), max_iters, bool(full_denoise), canvas_ids
+     ):
+         if kind == "error":
+             yield json.dumps({"kind": "error", "message": text})
+             return
+         last_text = text
+         yield json.dumps(
+             {
+                 "kind": "draft" if kind == "draft" else "commit",
+                 "source": clean_text(text),
+                 "block": block,
+                 "step": step,
+                 "canvas": CANVAS_LEN,
+                 "max_iters": max_iters,
+                 "warming": warming,
+             }
+         )
+         if anim_delay and kind == "draft":
+             _time.sleep(float(anim_delay))
+
+     final_source = extract_html(last_text)
+     # Cache the *cleaned* output so the next tweak warm-starts from a valid header.
+     if final_source.strip():
+         model._last_clean_html = final_source
+     yield json.dumps({"kind": "done", "source": final_source})
+
+
+ @app.get("/", response_class=HTMLResponse)
+ async def homepage():
+     with open(os.path.join(HERE, "index.html"), "r", encoding="utf-8") as f:
+         return f.read()
+
+
+ # HF Spaces' gradio runtime looks for a top-level `demo` (or `app`) to launch.
+ demo = app
+
+ if __name__ == "__main__":
+     app.launch(
+         server_name=os.environ.get("GDIFF_HOST", "0.0.0.0"),
+         server_port=int(os.environ.get("GDIFF_PORT", "7860")),
+         show_error=True,
+     )
index.html ADDED
@@ -0,0 +1,343 @@
+ <!DOCTYPE html>
+ <html lang="en">
+ <head>
+   <meta charset="UTF-8" />
+   <meta name="viewport" content="width=device-width, initial-scale=1.0" />
+   <title>DiffusionGemma · Live Website Builder</title>
+   <style>
+     :root {
+       --bg: #0b0e14;
+       --panel: #121722;
+       --panel-2: #0f141d;
+       --border: #222a38;
+       --text: #e6edf3;
+       --muted: #8b97a7;
+       --accent: #7c5cff;
+       --accent-2: #18c29c;
+       --amber: #f5c451;
+       --green: #2ea043;
+     }
+     * { box-sizing: border-box; }
+     html, body { margin: 0; height: 100%; }
+     body {
+       background: radial-gradient(1200px 600px at 70% -10%, #1a2030 0%, var(--bg) 55%);
+       color: var(--text);
+       font-family: ui-sans-serif, system-ui, -apple-system, "Segoe UI", Roboto, sans-serif;
+       display: flex; flex-direction: column; height: 100vh; overflow: hidden;
+     }
+     header {
+       padding: 14px 22px; border-bottom: 1px solid var(--border);
+       display: flex; align-items: center; gap: 14px; flex: 0 0 auto;
+     }
+     header .logo { font-size: 22px; }
+     header h1 { font-size: 16px; margin: 0; font-weight: 650; letter-spacing: .2px; }
+     header p { margin: 0; color: var(--muted); font-size: 12.5px; }
+     .pill {
+       margin-left: auto; font-size: 12px; color: var(--muted);
+       border: 1px solid var(--border); border-radius: 999px; padding: 5px 12px;
+       display: flex; align-items: center; gap: 8px; white-space: nowrap;
+     }
+     .dot { width: 8px; height: 8px; border-radius: 50%; background: var(--muted); }
+     .dot.live { background: var(--amber); box-shadow: 0 0 8px var(--amber); }
+     .dot.done { background: var(--green); box-shadow: 0 0 8px var(--green); }
+     .dot.err { background: #f85149; box-shadow: 0 0 8px #f85149; }
+
+     main { flex: 1 1 auto; display: grid; grid-template-columns: 1fr 1fr; gap: 14px; padding: 14px 18px; min-height: 0; }
+     .panel { background: var(--panel); border: 1px solid var(--border); border-radius: 12px; display: flex; flex-direction: column; min-height: 0; overflow: hidden; }
+     .panel .cap { padding: 9px 14px; border-bottom: 1px solid var(--border); font-size: 12.5px; font-weight: 600; color: var(--muted); display: flex; align-items: center; gap: 8px; }
+     .panel .cap .sub { font-weight: 400; color: #5d6b7a; }
+
+     /* Code / diffusion view */
+     #code { flex: 1 1 auto; overflow: auto; margin: 0; padding: 12px 0; font-family: "SF Mono", ui-monospace, "JetBrains Mono", Menlo, Consolas, monospace; font-size: 12px; line-height: 1.55; background: var(--panel-2); }
+     #code .ln { padding: 0 14px; white-space: pre-wrap; word-break: break-word; min-height: 1.55em; border-left: 3px solid transparent; }
+     #code .ln.live { animation: flash .6s ease-out; background: rgba(245,196,81,.05); }
+     #code .ln.diff { background: rgba(46,160,67,.13); border-left-color: var(--green); }
+     @keyframes flash {
+       0% { background: rgba(245,196,81,.42); }
+       100% { background: rgba(245,196,81,.05); }
+     }
+     #code::-webkit-scrollbar, .scroll::-webkit-scrollbar { width: 9px; height: 9px; }
+     #code::-webkit-scrollbar-thumb, .scroll::-webkit-scrollbar-thumb { background: #2a3344; border-radius: 8px; }
+
+     /* Website preview */
+     #preview { flex: 1 1 auto; border: 0; width: 100%; background: #fff; }
+
+     /* Bottom dock */
+     footer { flex: 0 0 auto; border-top: 1px solid var(--border); padding: 12px 18px; background: var(--panel-2); }
+     .row { display: flex; gap: 12px; align-items: flex-start; }
+     textarea#prompt {
+       flex: 1 1 auto; resize: none; height: 72px; background: var(--panel); color: var(--text);
+       border: 1px solid var(--border); border-radius: 10px; padding: 11px 13px; font-size: 14px; font-family: inherit;
+     }
+     textarea#prompt:focus { outline: none; border-color: var(--accent); }
+     .btns { display: flex; flex-direction: column; gap: 8px; width: 150px; }
+     button { font-family: inherit; font-size: 13.5px; font-weight: 600; border-radius: 10px; padding: 9px 12px; cursor: pointer; border: 1px solid var(--border); }
+     button.primary { background: linear-gradient(180deg, #8a6bff, #6b48f0); color: #fff; border: 0; }
+     button.primary:disabled { opacity: .5; cursor: not-allowed; }
+     button.ghost { background: transparent; color: var(--muted); }
+     button.ghost:hover:not(:disabled) { color: var(--text); border-color: #36405230; }
+     button.ghost:disabled { opacity: .4; cursor: not-allowed; }
+
+     .meta { display: flex; gap: 16px; align-items: center; flex-wrap: wrap; margin-top: 11px; color: var(--muted); font-size: 12px; }
+     .meta label { display: flex; align-items: center; gap: 7px; white-space: nowrap; }
+     .meta input[type="range"] { accent-color: var(--accent); width: 120px; }
+     .meta input[type="checkbox"] { accent-color: var(--accent); width: 15px; height: 15px; }
+     .meta .val { color: var(--text); font-variant-numeric: tabular-nums; min-width: 30px; }
+     .chips { display: flex; gap: 8px; flex-wrap: wrap; margin-top: 11px; }
+     .chip { font-size: 12px; color: var(--muted); border: 1px solid var(--border); border-radius: 999px; padding: 5px 11px; cursor: pointer; background: transparent; }
+     .chip:hover { color: var(--text); border-color: var(--accent); }
+     .history { margin-top: 10px; display: flex; gap: 8px; flex-wrap: wrap; max-height: 46px; overflow: auto; }
+     .turn { font-size: 11.5px; color: var(--muted); border: 1px solid var(--border); border-radius: 8px; padding: 3px 9px; }
+     .turn b { color: var(--accent-2); }
+   </style>
+ </head>
+ <body>
+   <header>
+     <span class="logo">🌫️→🌐</span>
+     <div>
+       <h1>DiffusionGemma · Live Website Builder</h1>
+       <p>Describe a website and a block-diffusion LLM writes the HTML by denoising — every token updates at once. Watch the raw canvas take shape (left) while it renders into a live page (right), then send follow-up prompts to tweak it.</p>
+     </div>
+     <span class="pill"><span class="dot" id="statusDot"></span><span id="statusText">idle</span></span>
+   </header>
+
+   <main>
+     <section class="panel">
+       <div class="cap">🧠 Model's view — diffusion canvas <span class="sub" id="capInfo"></span></div>
+       <div id="code" class="scroll"></div>
+     </section>
+     <section class="panel">
+       <div class="cap">🌐 Live website</div>
+       <iframe id="preview" sandbox="allow-scripts"></iframe>
+     </section>
+   </main>
+
+   <footer>
+     <div class="row">
+       <textarea id="prompt" placeholder="Describe a website… e.g. 'a landing page for a coffee shop with a hero, menu, and footer' then tweak: 'make the header dark', 'add a contact form'"></textarea>
+       <div class="btns">
+         <button class="primary" id="buildBtn">Build / Tweak</button>
+         <button class="ghost" id="resetBtn">Reset</button>
+       </div>
+     </div>
+     <div class="chips" id="chips"></div>
+     <div class="meta">
+       <label>tokens <input type="range" id="maxTokens" min="2048" max="4096" step="256" value="2048"><span class="val" id="maxTokensV">2048</span></label>
+       <label>iterations/block <input type="range" id="maxIters" min="8" max="120" step="8" value="64"><span class="val" id="maxItersV">64</span></label>
+       <label>anim delay <input type="range" id="delay" min="0" max="0.3" step="0.02" value="0"><span class="val" id="delayV">0.0s</span></label>
+       <label><input type="checkbox" id="fullDenoise"> run all denoising steps (no early stop)</label>
+       <label><input type="checkbox" id="warmStart" checked> tweak in place (diffuse from current page, not noise)</label>
+       <span class="history" id="history"></span>
+     </div>
+   </footer>
+
+   <script type="module">
+     import { Client } from "https://cdn.jsdelivr.net/npm/@gradio/client/dist/index.min.js";
+
+     const $ = (id) => document.getElementById(id);
+     const codeEl = $("code"), preview = $("preview");
+     const statusDot = $("statusDot"), statusText = $("statusText"), capInfo = $("capInfo");
+     const buildBtn = $("buildBtn"), resetBtn = $("resetBtn"), promptEl = $("prompt");
+
+     let client = null;
+     let busy = false;
+     let messages = [];          // [{role, content}] confirmed conversation
+     let prevFrameLines = [];    // lines shown on the previous streaming frame (live churn diff)
+     let lastFinalLines = [];    // lines of the previous round's final HTML (tweak diff)
+     let previewScrollY = 0;     // remembered scroll position of the rendered page (see renderPreview)
+
+     const EXAMPLES = [
+       "A bold landing page for a startup called 'Nimbus' that sells AI weather forecasting, with a hero section, three feature cards, and a call-to-action button.",
+       "A cozy personal blog homepage with a warm color palette, a header, an about section, and a list of three recent posts.",
+       "A neon synthwave portfolio page for a music producer with an animated gradient background.",
+       "A clean pricing page with three tiers (Free, Pro, Team), a feature comparison, and a FAQ.",
+       "A restaurant landing page with a hero image area, a menu grid, opening hours, and a reservation button.",
+       "A sleek product page for wireless headphones with specs, an image placeholder, and an add-to-cart bar.",
+       "A dark-mode dashboard mockup with a sidebar, four stat cards, and a placeholder chart.",
+       "A playful 'coming soon' page with a countdown vibe, an email signup box, and floating shapes.",
+     ];
+
+     // The rendered page reports its scroll position back to us via postMessage so we can
+     // restore it after each re-render (the iframe reloads every frame as the page diffuses,
+     // which would otherwise snap the user back to the top mid-stream).
+     window.addEventListener("message", (e) => {
+       if (e.data && typeof e.data.__gdiffScrollY === "number") previewScrollY = e.data.__gdiffScrollY;
+     });
+
+     // ---------- helpers ----------
+     function esc(s) { return s.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;"); }
+
+     function setStatus(kind, text) {
+       statusDot.className = "dot" + (kind ? " " + kind : "");
+       statusText.textContent = text;
+     }
+
+     // Longest-common-subsequence over lines -> set of indices in `b` that are unchanged.
+     function unchangedSet(a, b) {
+       const n = a.length, m = b.length;
+       const dp = Array.from({ length: n + 1 }, () => new Int32Array(m + 1));
+       for (let i = n - 1; i >= 0; i--)
+         for (let j = m - 1; j >= 0; j--)
+           dp[i][j] = a[i] === b[j] ? dp[i + 1][j + 1] + 1 : Math.max(dp[i + 1][j], dp[i][j + 1]);
+       const keep = new Set();
+       let i = 0, j = 0;
+       while (i < n && j < m) {
+         if (a[i] === b[j]) { keep.add(j); i++; j++; }
+         else if (dp[i + 1][j] >= dp[i][j + 1]) i++; else j++;
+       }
+       return keep;
+     }
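Reviewer note, not part of the commit: for readers who want to check the line-diff logic outside the browser, here is a Python port of `unchangedSet` as a sketch. It follows the same steps as the JavaScript above: fill a suffix LCS table, then walk forward collecting the indices of `b` that belong to the common subsequence. The name `unchanged_set` is hypothetical.

```python
def unchanged_set(a: list[str], b: list[str]) -> set[int]:
    """Indices of lines in `b` that are part of a longest common subsequence with `a`."""
    n, m = len(a), len(b)
    # dp[i][j] = length of the LCS of a[i:] and b[j:]
    dp = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n - 1, -1, -1):
        for j in range(m - 1, -1, -1):
            if a[i] == b[j]:
                dp[i][j] = dp[i + 1][j + 1] + 1
            else:
                dp[i][j] = max(dp[i + 1][j], dp[i][j + 1])

    keep: set[int] = set()
    i = j = 0
    while i < n and j < m:
        if a[i] == b[j]:
            keep.add(j)  # line j of `b` is unchanged
            i += 1
            j += 1
        elif dp[i + 1][j] >= dp[i][j + 1]:
            i += 1       # skip a line that only exists in `a`
        else:
            j += 1       # skip a line that only exists in `b`
    return keep


if __name__ == "__main__":
    old = ["<h1>Hi</h1>", "<p>one</p>", "<p>two</p>"]
    new = ["<h1>Hi</h1>", "<p>uno</p>", "<p>two</p>"]
    print(sorted(unchanged_set(old, new)))
```

Every index of `b` missing from the returned set is a line the UI would highlight. The table costs O(n·m) time and memory per call, which is worth keeping in mind given that the frontend recomputes the diff on every streamed frame.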
190
+
191
+ // Render the source with per-line highlight classes.
192
+ function renderCode(source, { liveAgainst = null, diffAgainst = null } = {}) {
193
+ const lines = source.split("\n");
194
+ const liveKeep = liveAgainst ? unchangedSet(liveAgainst, lines) : null;
195
+ const diffKeep = diffAgainst ? unchangedSet(diffAgainst, lines) : null;
196
+ const html = lines.map((ln, idx) => {
197
+ let cls = "ln";
198
+ if (diffKeep && !diffKeep.has(idx)) cls += " diff"; // persistent tweak diff
199
+ else if (liveKeep && !liveKeep.has(idx)) cls += " live"; // transient churn flash
200
+ return `<div class="${cls}">${esc(ln) || "&nbsp;"}</div>`;
201
+ }).join("");
202
+ codeEl.innerHTML = html;
203
+ return lines;
204
+ }
205
+
206
+ // Repair a (possibly front-mangled) document so the preview never breaks. Warm-start
207
+ // diffusion often eats the leading chars of the header ("<!DOCTYPE html>" -> "DOCTYPE>",
208
+ // "<html" -> "<head"); rebuild whatever was eaten by anchoring on the first intact
209
+ // structural tag. Mirrors the server's extract_html.
210
+ function normalizeHtml(src) {
211
+ const lower = src.toLowerCase();
212
+ const dt = lower.indexOf("<!doctype");
213
+ if (dt !== -1) return src.slice(dt);
214
+ const h = lower.indexOf("<html");
215
+ if (h !== -1) return "<!DOCTYPE html>\n" + src.slice(h);
216
+ const hd = lower.indexOf("<head");
217
+ if (hd !== -1) return '<!DOCTYPE html>\n<html lang="en">\n' + src.slice(hd);
218
+ const bd = lower.indexOf("<body");
219
+ if (bd !== -1) return '<!DOCTYPE html>\n<html lang="en">\n<head><meta charset="UTF-8"></head>\n' + src.slice(bd);
220
+ return src;
221
+ }
222
+
+ // Render the model's HTML into the live preview, repairing the header and preserving the
+ // user's scroll position across re-renders (each frame reloads the iframe). A tiny script
+ // is injected before </body>: it restores the remembered scroll and reports new positions.
+ function renderPreview(source) {
+   if (!source || !source.trim()) { preview.srcdoc = ""; previewScrollY = 0; return; }
+   source = normalizeHtml(source);
+   const y = Math.round(previewScrollY);
+   const inject =
+     "<script>(function(){var y=" + y + ";" +
+     "function r(){window.scrollTo(0,y);}" +
+     "r();requestAnimationFrame(r);window.addEventListener('load',r);" +
+     "window.addEventListener('scroll',function(){" +
+     "parent.postMessage({__gdiffScrollY:window.scrollY||document.documentElement.scrollTop||0},'*');" +
+     "},{passive:true});})();<\/script>";
+   const idx = source.toLowerCase().lastIndexOf("</body>");
+   preview.srcdoc = idx !== -1 ? source.slice(0, idx) + inject + source.slice(idx) : source + inject;
+ }
+
+ // ---------- generation ----------
+ async function ensureClient() {
+   if (!client) { setStatus("", "connecting…"); client = await Client.connect(window.location.origin); }
+   return client;
+ }
+
+ async function run() {
+   if (busy) return;
+   const prompt = promptEl.value.trim();
+   if (!prompt) { promptEl.focus(); return; }
+
+   busy = true; buildBtn.disabled = true;
+   const isTweak = messages.length > 0;
+   setStatus("live", isTweak ? "tweaking…" : "diffusing…");
+   prevFrameLines = [];
+   if (!isTweak) previewScrollY = 0; // fresh build starts at the top
+
+   try {
+     const c = await ensureClient();
+     const payload = {
+       prompt,
+       history_json: JSON.stringify(messages),
+       max_new_tokens: parseInt($("maxTokens").value, 10),
+       max_iters: parseInt($("maxIters").value, 10),
+       full_denoise: $("fullDenoise").checked,
+       anim_delay: parseFloat($("delay").value),
+       warm_start: $("warmStart").checked,
+     };
+
+     let finalSource = "";
+     const sub = c.submit("/generate", payload);
+     for await (const ev of sub) {
+       if (ev.type === "data") {
+         const frame = JSON.parse(ev.data[0]);
+         if (frame.kind === "error") { setStatus("err", "error"); renderCode("/* " + frame.message + " */"); break; }
+         if (frame.kind === "done") {
+           finalSource = frame.source;
+           // Persistent green highlight of what changed vs the previous round.
+           prevFrameLines = renderCode(finalSource, { diffAgainst: isTweak ? lastFinalLines : null });
+           renderPreview(finalSource);
+           setStatus("done", "done");
+           continue;
+         }
+         // draft / commit frame: both the code panel and the live page churn every frame.
+         const src = frame.source || "";
+         prevFrameLines = renderCode(src, { liveAgainst: prevFrameLines });
+         renderPreview(src);
+         capInfo.textContent = `block ${frame.block} · step ${frame.step}/${frame.max_iters} · ${frame.canvas} tokens update simultaneously`;
+         setStatus("live", `${frame.kind === "draft" ? "diffusing" : "committed"} · block ${frame.block} · step ${frame.step}`);
+       } else if (ev.type === "status" && ev.stage === "error") {
+         setStatus("err", "error"); break;
+       }
+     }
+
+     if (finalSource) {
+       messages.push({ role: "user", content: prompt });
+       messages.push({ role: "assistant", content: finalSource });
+       lastFinalLines = finalSource.split("\n");
+       promptEl.value = "";
+       renderHistory();
+     }
+   } catch (e) {
+     setStatus("err", "error");
+     renderCode("/* connection error: " + (e && e.message ? e.message : e) + " */");
+   } finally {
+     busy = false; buildBtn.disabled = false;
+   }
+ }
+
+ function renderHistory() {
+   const turns = messages.filter((m) => m.role === "user");
+   $("history").innerHTML = turns.map((m, i) =>
+     `<span class="turn"><b>${i === 0 ? "build" : "tweak " + i}</b> · ${esc(m.content.slice(0, 40))}${m.content.length > 40 ? "…" : ""}</span>`
+   ).join("");
+ }
+
+ function reset() {
+   messages = []; prevFrameLines = []; lastFinalLines = []; previewScrollY = 0;
+   codeEl.innerHTML = ""; renderPreview(""); $("history").innerHTML = "";
+   capInfo.textContent = ""; setStatus("", "idle"); promptEl.value = "";
+ }
+
+ // ---------- wiring ----------
+ buildBtn.addEventListener("click", run);
+ resetBtn.addEventListener("click", reset);
+ promptEl.addEventListener("keydown", (e) => {
+   if (e.key === "Enter" && (e.metaKey || e.ctrlKey)) { e.preventDefault(); run(); }
+ });
+ for (const [id, fmt] of [["maxTokens", (v) => v], ["maxIters", (v) => v], ["delay", (v) => (+v).toFixed(1) + "s"]]) {
+   const el = $(id), out = $(id + "V");
+   el.addEventListener("input", () => (out.textContent = fmt(el.value)));
+ }
+ // Append the ellipsis only when the example was actually truncated (matches renderHistory).
+ $("chips").innerHTML = EXAMPLES.map((e, i) => `<button class="chip" data-i="${i}">${esc(e.slice(0, 42))}${e.length > 42 ? "…" : ""}</button>`).join("");
+ $("chips").addEventListener("click", (e) => {
+   const i = e.target.getAttribute("data-i");
+   if (i !== null) { promptEl.value = EXAMPLES[+i]; promptEl.focus(); }
+ });
+
+ renderPreview("");
+ setStatus("", "idle");
+ </script>
+ </body>
+ </html>
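Review note: the header repair and the `</body>` splice used by `renderPreview` are pure string operations, so they can be exercised outside the browser. The sketch below is an illustration, not the committed code: `repairHeader` and `spliceBeforeBodyClose` are hypothetical names, and the `<head(?![a-z])` guard (so that `<header>` is not read as `<head>`) is an assumption of this sketch.

```javascript
// Hypothetical standalone helpers; the names are illustrative, not from the commit.

// Rebuild a truncated document header by anchoring on the first intact structural tag.
function repairHeader(src) {
  const lower = src.toLowerCase();
  const doctype = lower.indexOf("<!doctype");
  if (doctype !== -1) return src.slice(doctype);
  const html = lower.indexOf("<html");
  if (html !== -1) return "<!DOCTYPE html>\n" + src.slice(html);
  // Assumption: require a non-letter after "<head" so "<header>" is not treated as <head>.
  const head = lower.search(/<head(?![a-z])/);
  if (head !== -1) return '<!DOCTYPE html>\n<html lang="en">\n' + src.slice(head);
  const body = lower.indexOf("<body");
  if (body !== -1) {
    return '<!DOCTYPE html>\n<html lang="en">\n<head><meta charset="UTF-8"></head>\n' + src.slice(body);
  }
  return src; // nothing recognizable: leave the input untouched
}

// Insert a snippet just before the last </body>, or append it when there is none.
function spliceBeforeBodyClose(source, snippet) {
  const idx = source.toLowerCase().lastIndexOf("</body>");
  return idx !== -1 ? source.slice(0, idx) + snippet + source.slice(idx) : source + snippet;
}

// A header eaten down to "DOCTYPE>" is rebuilt from the intact <html> tag.
console.log(repairHeader('DOCTYPE>\n<html lang="en"><body>hi</body></html>'));
// With <html> and <head> gone, the <body> tag is the anchor, even though <header> follows it.
console.log(repairHeader("ml>\n<body><header>hi</header></body>"));
// The snippet lands inside the body, immediately before its closing tag.
console.log(spliceBeforeBodyClose("<body><p>x</p></body>", "<!--probe-->"));
```

Keeping both steps as plain functions of a string makes it easy to check odd draft frames (missing `</body>`, half-written headers) without loading an iframe.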
requirements.txt ADDED
@@ -0,0 +1,6 @@
+ # NOTE: the custom DiffusionGemma `transformers` wheel is bundled in this repo and
+ # installed at runtime from app.py: Spaces installs requirements.txt *before* the repo
+ # files are copied in, so a local-path wheel reference here can't be found at build time.
+ accelerate
+ sentencepiece
+ hf_xet
transformers-5.8.0.dev0+2db78a1296-py3-none-any.whl ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:dd08fd82069ff1d3a1b49fbb103564295f20429d2ffb68a11cc4f760bb993f9b
+ size 11875344
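Review note: the three added lines are a Git LFS pointer, not the wheel itself; the pointer is a short list of `key value` lines naming the spec version, the content hash, and the byte size. As a rough illustration, a pointer like this can be read as follows. `parseLfsPointer` is a hypothetical helper written for this note, not part of the commit or of any Git LFS tooling.

```javascript
// Hypothetical helper: parse the text of a Git LFS pointer file into its fields.
function parseLfsPointer(text) {
  const fields = {};
  for (const line of text.split("\n")) {
    const trimmed = line.trim();
    if (!trimmed) continue;
    const space = trimmed.indexOf(" ");
    if (space === -1) continue; // not a "key value" line
    fields[trimmed.slice(0, space)] = trimmed.slice(space + 1);
  }
  // The oid field has the form "<algorithm>:<hex digest>".
  const [algo, digest] = (fields.oid || "").split(":");
  return { version: fields.version, algo, digest, size: Number(fields.size) };
}

// The pointer contents exactly as they appear in this diff.
const pointer = [
  "version https://git-lfs.github.com/spec/v1",
  "oid sha256:dd08fd82069ff1d3a1b49fbb103564295f20429d2ffb68a11cc4f760bb993f9b",
  "size 11875344",
].join("\n");

console.log(parseLfsPointer(pointer));
```

The `size` field is in bytes, so the bundled wheel is about 11.9 MB.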