Commit 9f9873b · blessingmwiti committed · 1 Parent(s): 8ad53dd

Polish Space card and split-brain UI

Files changed (4)
  1. README.md +45 -9
  2. app.py +31 -6
  3. static/style.css +138 -22
  4. static/ui.js +10 -3
README.md CHANGED
@@ -2,7 +2,7 @@
  title: Split-Brain Co-Pilot
  emoji: ⚡
  colorFrom: blue
- colorTo: indigo
+ colorTo: green
  sdk: gradio
  sdk_version: 5.30.0
  app_file: app.py
@@ -11,21 +11,47 @@ license: apache-2.0
  tags:
  - code-generation
  - webgpu
- - speculative-decoding
+ - small-models
  - llama.cpp
+ - modal
  - local-first
+ - transformers.js
  ---

  # Split-Brain Co-Pilot

- A speculative coding assistant for the Build Small Hackathon: a 1.5B code model drafts locally in Chrome with WebGPU, while a 14B Qwen verifier on Modal checks the result in the background. When the verifier catches a problem, the UI flashes, rolls back, and types in the corrected cloud block.
+ A small-model coding assistant for the Build Small Hackathon. A 1.5B code model drafts instantly inside Chrome with WebGPU, while a 14B Qwen verifier on Modal checks the draft in the background. When the verifier catches a problem, the UI flashes, rolls back, and types in the corrected cloud block live.
+
+ The result is a split-brain workflow: fast local generation first, slower cloud verification second, and a sandbox proof step when the final answer is Python.
+
+ Try it in Chrome 113+ on desktop: load the local model, enter a coding prompt, generate, verify, then run the sandbox.
+
+ ## Why it fits Build Small
+
+ This project is built for the **An Adventure in Thousand Token Wood** track: the AI behavior is the experience. The fun part is not just that it writes code, but that two small models disagree, verify, and visibly reconcile their answers.
+
+ - **Small models only:** `Qwen2.5-Coder-1.5B` + `Qwen2.5-Coder-14B-Instruct` = **15.5B total parameters**, under the 32B cap.
+ - **Built on Gradio:** the app is a Gradio Space with a custom HTML/CSS/JS surface.
+ - **Show, don't tell:** token streaming, verifier state, rollback animation, and sandbox output are all visible in the app.
+ - **Modal-powered:** the 14B verifier runs on Modal A10G and the Python sandbox runs as a Modal endpoint.

  ## Architecture

- - Local brain: `onnx-community/Qwen2.5-Coder-1.5B-Instruct` through transformers.js `3.5.x`, WebGPU, Q4 weights.
+ - Local brain: `onnx-community/Qwen2.5-Coder-1.5B-Instruct` through transformers.js `3.5.x`, WebGPU, quantized browser weights.
  - Cloud brain: `bartowski/Qwen2.5-Coder-14B-Instruct-GGUF` (`Qwen2.5-Coder-14B-Instruct-Q4_K_M.gguf`) served on Modal A10G through llama.cpp.
  - Shell: Gradio 5 Space with a custom HTML/CSS/JS streaming surface.
- - Optional proof step: Modal sandbox execution endpoint for generated Python code.
+ - Proof step: Modal sandbox execution endpoint for generated Python code.
+
+ ```mermaid
+ flowchart LR
+ Prompt["User prompt"] --> Local["1.5B browser model<br/>WebGPU + transformers.js"]
+ Local --> Draft["Streaming draft code"]
+ Draft --> Verify["14B Modal verifier<br/>llama.cpp on A10G"]
+ Verify -->|PASS| Final["Verified code"]
+ Verify -->|FIX / REWRITE| Rollback["Rollback animation<br/>corrected block"]
+ Rollback --> Final
+ Final --> Sandbox["Python sandbox proof<br/>Modal Sandbox"]
+ ```

  ## Requirements

@@ -82,11 +108,21 @@ This project uses `modal==1.4.3`; older `0.73.x` clients are now rejected by Mod

  Prompt idea: "Write a Python function that finds all prime numbers up to n using a segmented sieve, handling edge cases."

- Show the model loading bar, token streaming, verifier status, rollback animation on a FIX/REWRITE verdict, and the final verified state.
+ Show the model loading bar, token streaming, verifier status, rollback animation on a FIX/REWRITE verdict, and the final verified state. Then click **Run Python Sandbox** so the demo ends with executable proof, not just generated text.

  ## Badge Targets

- - Off the Grid: local 1.5B browser inference.
- - Llama Champion: 14B llama.cpp verifier on Modal.
- - Off-Brand: custom UI, rollback flash, status bar, token counter.
+ - Llama Champion: 14B verifier served through llama.cpp.
+ - Off-Brand: custom split-brain UI, rollback flash, status rail, token counter, and sandbox output.
  - Field Notes: publish a post-build architecture writeup and link it here.
+ - Modal Awards: verifier and sandbox are both Modal-powered.
+
+ The app is **local-first**, but not fully Off the Grid: the draft model runs in-browser, while verification intentionally uses Modal.
+
+ ## Current Status
+
+ - HF Space: live under the `build-small-hackathon` org.
+ - Local model: browser WebGPU loading works with quantized weights.
+ - Verifier: Modal endpoint deployed.
+ - Sandbox: Modal Python execution endpoint deployed.
+ - Remaining submission work: demo video, social post, and Field Notes writeup.
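
Reviewer note: the README flowchart describes a draft being routed on a PASS or FIX / REWRITE verdict. The commit does not show the verifier client, so the following is only a minimal sketch of that routing under stated assumptions. The verdict names come from the README; `Verdict`, `reconcile`, and `fake_verifier` are hypothetical names introduced here, and the stand-in verifier replaces the real Modal call.

```python
from dataclasses import dataclass
from typing import Callable

# Verdict names as they appear in the README flowchart.
PASS, FIX, REWRITE = "PASS", "FIX", "REWRITE"


@dataclass
class Verdict:
    """Hypothetical shape of a verifier response (not taken from the repo)."""
    kind: str            # one of PASS, FIX, REWRITE
    corrected: str = ""  # replacement code for FIX / REWRITE verdicts


def reconcile(draft: str, verify: Callable[[str], Verdict]) -> tuple[str, bool]:
    """Return (final_code, rolled_back) for a single local draft."""
    verdict = verify(draft)
    if verdict.kind == PASS:
        # The draft survives untouched; no rollback is needed.
        return draft, False
    if verdict.kind in (FIX, REWRITE):
        # The UI would roll back the draft and type in the corrected block.
        return verdict.corrected, True
    raise ValueError(f"unknown verdict: {verdict.kind!r}")


def fake_verifier(code: str) -> Verdict:
    """Stand-in for the cloud verifier: flags one specific off-by-one pattern."""
    if "range(n)" in code:
        return Verdict(FIX, code.replace("range(n)", "range(n + 1)"))
    return Verdict(PASS)


if __name__ == "__main__":
    print(reconcile("for i in range(n): pass", fake_verifier))
    print(reconcile("print('ok')", fake_verifier))
```

Keeping the routing in one pure function makes the PASS path and the rollback path easy to test without a browser or a deployed endpoint.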
app.py CHANGED
@@ -43,9 +43,22 @@ def endpoint_url(url: str | None, path: str) -> str | None:

  custom_html = f"""
  <div id="split-brain-root">
-   <div class="split-topline">
-     <span>Local: WebGPU 1.5B</span>
-     <span>Cloud: Modal A10G 14B</span>
+   <div class="brain-rail" aria-label="Split-brain architecture">
+     <div class="brain-node local-node">
+       <span class="brain-label">Local Draft</span>
+       <strong>WebGPU 1.5B</strong>
+       <small>fast browser stream</small>
+     </div>
+     <div class="brain-pulse" aria-hidden="true">
+       <span></span>
+       <span></span>
+       <span></span>
+     </div>
+     <div class="brain-node cloud-node">
+       <span class="brain-label">Cloud Check</span>
+       <strong>Modal A10G 14B</strong>
+       <small>llama.cpp verifier</small>
+     </div>
    </div>
    <div class="webgpu-notice" id="webgpu-warning" hidden>
      WebGPU not detected. Use Chrome 113+ on desktop for local inference.
@@ -55,7 +68,13 @@ custom_html = f"""
      <div class="loading-bar"><div class="loading-bar-fill" id="load-progress"></div></div>
      <span id="load-status" class="load-status">Model not loaded</span>
    </div>
-   <pre id="stream-display" class="code-stream">Waiting for model load...</pre>
+   <div class="stream-shell">
+     <div class="stream-toolbar">
+       <span>Speculative draft</span>
+       <span id="stream-phase">Idle</span>
+     </div>
+     <pre id="stream-display" class="code-stream">Waiting for model load...</pre>
+   </div>
    <div class="status-bar">
      <span id="status-text">Idle</span>
      <span id="token-count">0 tok/s</span>
@@ -342,9 +361,15 @@ with gr.Blocks(
      gr.HTML(
          """
          <section class="app-header">
-           <p class="eyebrow">Build Small Hackathon</p>
+           <p class="eyebrow">Build Small Hackathon · 15.5B parameters total</p>
            <h1>Split-Brain Co-Pilot</h1>
-           <p>Draft locally in Chrome with a 1.5B WebGPU model. Verify in the background with a 14B Modal brain.</p>
+           <p>One small model drafts in your browser. Another small model checks it on Modal. The UI shows the handoff, verdict, rollback, and executable proof.</p>
+           <div class="badge-row" aria-label="Project badges">
+             <span>WebGPU local-first</span>
+             <span>llama.cpp verifier</span>
+             <span>Modal sandbox</span>
+             <span>Custom Gradio UI</span>
+           </div>
          </section>
          <div class="space-init" id="space-init">Space initializing...</div>
          <script>
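
Reviewer note: the new eyebrow text hard-codes "15.5B parameters total", the same figure the README checks against the 32B cap. As a sketch only, the string could be derived from the nominal model sizes so the two places cannot drift apart. `MODEL_SIZES_B`, `PARAM_CAP_B`, and `eyebrow_text` are hypothetical names, not part of `app.py`, and the sizes are the nominal figures quoted in the README rather than measured parameter counts.

```python
# Nominal sizes in billions of parameters, as quoted in the README diff.
MODEL_SIZES_B = {
    "Qwen2.5-Coder-1.5B": 1.5,
    "Qwen2.5-Coder-14B-Instruct": 14.0,
}

# Hackathon cap stated in the README ("under the 32B cap").
PARAM_CAP_B = 32.0


def eyebrow_text(sizes: dict[str, float], cap: float = PARAM_CAP_B) -> str:
    """Build the header eyebrow string, failing loudly if the cap is exceeded."""
    total = sum(sizes.values())
    if total > cap:
        raise ValueError(f"{total:g}B parameters exceeds the {cap:g}B cap")
    # ":g" drops trailing zeros, so 15.5 renders as "15.5".
    return f"Build Small Hackathon · {total:g}B parameters total"


if __name__ == "__main__":
    print(eyebrow_text(MODEL_SIZES_B))
```

The cap check is the useful part: swapping in a larger verifier would raise at import time instead of silently shipping a header that no longer matches the rules.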
static/style.css CHANGED
@@ -1,14 +1,17 @@
  :root {
-   --bg: #0d1117;
-   --surface: #161b22;
-   --surface-2: #0f1720;
-   --border: #30363d;
-   --accent: #58a6ff;
-   --accent-warn: #f0883e;
+   --bg: #090d12;
+   --surface: #151922;
+   --surface-2: #101721;
+   --surface-3: #1b2430;
+   --border: #323b48;
+   --accent: #62a8ff;
+   --accent-2: #48d597;
+   --accent-warn: #e8a04f;
    --text: #e6edf3;
-   --text-muted: #8b949e;
-   --green: #3fb950;
-   --red: #f85149;
+   --text-muted: #9aa6b2;
+   --green: #48d597;
+   --red: #ff6565;
+   --shadow: rgba(0, 0, 0, 0.34);
  }

  body,
@@ -27,8 +30,9 @@ footer {
  }

  .app-header {
-   margin: 0 auto 20px;
-   max-width: 980px;
+   margin: 0 auto 22px;
+   max-width: 1060px;
+   padding: 8px 12px 0;
    text-align: center;
  }

@@ -37,20 +41,40 @@ footer {
    color: var(--text);
    font-size: clamp(32px, 6vw, 56px);
    letter-spacing: 0;
+   line-height: 1.02;
  }

  .app-header p {
-   margin: 0;
+   margin: 0 auto;
+   max-width: 820px;
    color: var(--text-muted);
+   line-height: 1.55;
  }

  .app-header .eyebrow {
-   color: var(--accent);
+   color: var(--accent-2);
    font-size: 12px;
    font-weight: 700;
    text-transform: uppercase;
  }

+ .badge-row {
+   display: flex;
+   flex-wrap: wrap;
+   gap: 8px;
+   justify-content: center;
+   margin-top: 16px;
+ }
+
+ .badge-row span {
+   border: 1px solid var(--border);
+   border-radius: 999px;
+   background: rgba(255, 255, 255, 0.035);
+   color: var(--text);
+   font-size: 12px;
+   padding: 7px 10px;
+ }
+
  .space-init {
    margin: 0 auto 12px;
    max-width: 980px;
@@ -62,17 +86,84 @@ footer {
    color: var(--text);
  }

- .split-topline,
+ .brain-rail,
  .status-bar {
    display: flex;
    gap: 12px;
    justify-content: space-between;
  }

- .split-topline {
-   margin-bottom: 10px;
+ .brain-rail {
+   align-items: stretch;
+   margin-bottom: 14px;
+ }
+
+ .brain-node {
+   flex: 1 1 0;
+   min-width: 0;
+   border: 1px solid var(--border);
+   border-radius: 8px;
+   background: var(--surface);
+   box-shadow: 0 14px 34px var(--shadow);
+   padding: 12px;
+ }
+
+ .brain-node strong,
+ .brain-node small,
+ .brain-label {
+   display: block;
+ }
+
+ .brain-node strong {
+   margin: 4px 0;
+   color: var(--text);
+   font-size: 15px;
+ }
+
+ .brain-node small,
+ .brain-label {
    color: var(--text-muted);
-   font-size: 12px;
+   font-size: 11px;
+ }
+
+ .brain-label {
+   font-weight: 700;
+   text-transform: uppercase;
+ }
+
+ .local-node {
+   border-left: 3px solid var(--accent);
+ }
+
+ .cloud-node {
+   border-left: 3px solid var(--accent-2);
+ }
+
+ .brain-pulse {
+   display: grid;
+   flex: 0 0 72px;
+   grid-template-columns: repeat(3, 1fr);
+   gap: 6px;
+   place-items: center;
+ }
+
+ .brain-pulse span {
+   width: 9px;
+   height: 9px;
+   border-radius: 999px;
+   background: var(--accent-warn);
+   box-shadow: 0 0 18px rgba(232, 160, 79, 0.48);
+   opacity: 0.82;
+ }
+
+ .brain-pulse span:nth-child(2) {
+   background: var(--accent);
+   box-shadow: 0 0 18px rgba(98, 168, 255, 0.5);
+ }
+
+ .brain-pulse span:nth-child(3) {
+   background: var(--accent-2);
+   box-shadow: 0 0 18px rgba(72, 213, 151, 0.5);
  }

  .webgpu-notice {
@@ -88,7 +179,7 @@ footer {
    grid-template-columns: auto 1fr;
    gap: 10px 12px;
    align-items: center;
-   margin-bottom: 12px;
+   margin-bottom: 14px;
  }

  .load-section.loaded {
@@ -121,7 +212,7 @@ footer {
  .loading-bar-fill {
    width: 0%;
    height: 100%;
-   background: var(--accent);
+   background: linear-gradient(90deg, var(--accent), var(--accent-2));
    transition: width 0.3s ease;
  }

@@ -131,14 +222,34 @@ footer {
    font-size: 12px;
  }

+ .stream-shell {
+   overflow: hidden;
+   border: 1px solid var(--border);
+   border-radius: 8px 8px 0 0;
+   background: var(--surface);
+   box-shadow: 0 16px 38px var(--shadow);
+ }
+
+ .stream-toolbar {
+   display: flex;
+   gap: 12px;
+   align-items: center;
+   justify-content: space-between;
+   border-bottom: 1px solid var(--border);
+   background: var(--surface-3);
+   color: var(--text-muted);
+   font-size: 12px;
+   padding: 9px 12px;
+ }
+
  .code-stream {
    box-sizing: border-box;
    min-height: 390px;
    max-height: 58vh;
    margin: 0;
    overflow: auto;
-   border: 1px solid var(--border);
-   border-radius: 8px 8px 0 0;
+   border: 0;
+   border-radius: 0;
    background: var(--surface);
    color: var(--text);
    font-family: "JetBrains Mono", ui-monospace, SFMono-Regular, Menlo, Monaco, Consolas, monospace;
@@ -171,6 +282,7 @@ footer {
    color: var(--text-muted);
    font-size: 12px;
    padding: 10px 12px;
+   box-shadow: 0 16px 38px var(--shadow);
  }

  .status-warning,
@@ -201,12 +313,16 @@ button {
      grid-template-columns: 1fr;
    }

-   .split-topline,
+   .brain-rail,
    .status-bar {
      align-items: flex-start;
      flex-direction: column;
    }

+   .brain-pulse {
+     display: none;
+   }
+
    .code-stream {
      min-height: 320px;
    }
static/ui.js CHANGED
@@ -14,10 +14,17 @@ export function appendToken(token) {

  export function setStatus(text, type = "neutral") {
    const el = document.getElementById("status-text");
-   if (!el) return;
+   const phase = document.getElementById("stream-phase");
+
+   if (el) {
+     el.textContent = text;
+     el.className = `status-${type}`;
+   }

-   el.textContent = text;
-   el.className = `status-${type}`;
+   if (phase) {
+     phase.textContent = text;
+     phase.className = `status-${type}`;
+   }
  }

  export function setVerifierStatus(verdict) {