Space: build-small-hackathon/split-brain-copilot

blessingmwiti committed 10 days ago

Commit 9f9873b (1 parent: 8ad53dd)

Polish Space card and split-brain UI

Files changed (4)

README.md +45 -9
app.py +31 -6
static/style.css +138 -22
static/ui.js +10 -3

README.md CHANGED

@@ -2,7 +2,7 @@
 title: Split-Brain Co-Pilot
 emoji: ⚡
 colorFrom: blue
-colorTo: indigo
 sdk: gradio
 sdk_version: 5.30.0
 app_file: app.py
@@ -11,21 +11,47 @@ license: apache-2.0
 tags:
   - code-generation
   - webgpu
-  - speculative-decoding
   - llama.cpp
   - local-first
 ---
 # Split-Brain Co-Pilot
-A speculative coding assistant for the Build Small Hackathon: a 1.5B code model drafts locally in Chrome with WebGPU, while a 14B Qwen verifier on Modal checks the result in the background. When the verifier catches a problem, the UI flashes, rolls back, and types in the corrected cloud block.
 ## Architecture
-- Local brain: `onnx-community/Qwen2.5-Coder-1.5B-Instruct` through transformers.js `3.5.x`, WebGPU, Q4 weights.
 - Cloud brain: `bartowski/Qwen2.5-Coder-14B-Instruct-GGUF` (`Qwen2.5-Coder-14B-Instruct-Q4_K_M.gguf`) served on Modal A10G through llama.cpp.
 - Shell: Gradio 5 Space with a custom HTML/CSS/JS streaming surface.
-- Optional proof step: Modal sandbox execution endpoint for generated Python code.
 ## Requirements
@@ -82,11 +108,21 @@ This project uses `modal==1.4.3`; older `0.73.x` clients are now rejected by Mod
 Prompt idea: "Write a Python function that finds all prime numbers up to n using a segmented sieve, handling edge cases."
-Show the model loading bar, token streaming, verifier status, rollback animation on a FIX/REWRITE verdict, and the final verified state.
 ## Badge Targets
-- Off the Grid: local 1.5B browser inference.
-- Llama Champion: 14B llama.cpp verifier on Modal.
-- Off-Brand: custom UI, rollback flash, status bar, token counter.
 - Field Notes: publish a post-build architecture writeup and link it here.

 title: Split-Brain Co-Pilot
 emoji: ⚡
 colorFrom: blue
+colorTo: green
 sdk: gradio
 sdk_version: 5.30.0
 app_file: app.py
 tags:
   - code-generation
   - webgpu
+  - small-models
   - llama.cpp
+  - modal
   - local-first
+  - transformers.js
 ---
 # Split-Brain Co-Pilot
+A small-model coding assistant for the Build Small Hackathon. A 1.5B code model drafts instantly inside Chrome with WebGPU, while a 14B Qwen verifier on Modal checks the draft in the background. When the verifier catches a problem, the UI flashes, rolls back, and types in the corrected cloud block live.
+The result is a split-brain workflow: fast local generation first, slower cloud verification second, and a sandbox proof step when the final answer is Python.
+Try it in Chrome 113+ on desktop: load the local model, enter a coding prompt, generate, verify, then run the sandbox.
+## Why it fits Build Small
+This project is built for the **An Adventure in Thousand Token Wood** track: the AI behavior is the experience. The fun part is not just that it writes code, but that two small models disagree, verify, and visibly reconcile their answers.
+- **Small models only:** `Qwen2.5-Coder-1.5B` + `Qwen2.5-Coder-14B-Instruct` = **15.5B total parameters**, under the 32B cap.
+- **Built on Gradio:** the app is a Gradio Space with a custom HTML/CSS/JS surface.
+- **Show, don't tell:** token streaming, verifier state, rollback animation, and sandbox output are all visible in the app.
+- **Modal-powered:** the 14B verifier runs on Modal A10G and the Python sandbox runs as a Modal endpoint.
 ## Architecture
+- Local brain: `onnx-community/Qwen2.5-Coder-1.5B-Instruct` through transformers.js `3.5.x`, WebGPU, quantized browser weights.
 - Cloud brain: `bartowski/Qwen2.5-Coder-14B-Instruct-GGUF` (`Qwen2.5-Coder-14B-Instruct-Q4_K_M.gguf`) served on Modal A10G through llama.cpp.
 - Shell: Gradio 5 Space with a custom HTML/CSS/JS streaming surface.
+- Proof step: Modal sandbox execution endpoint for generated Python code.
+```mermaid
+flowchart LR
+    Prompt["User prompt"] --> Local["1.5B browser model<br/>WebGPU + transformers.js"]
+    Local --> Draft["Streaming draft code"]
+    Draft --> Verify["14B Modal verifier<br/>llama.cpp on A10G"]
+    Verify -->|PASS| Final["Verified code"]
+    Verify -->|FIX / REWRITE| Rollback["Rollback animation<br/>corrected block"]
+    Rollback --> Final
+    Final --> Sandbox["Python sandbox proof<br/>Modal Sandbox"]
+```
 ## Requirements
 Prompt idea: "Write a Python function that finds all prime numbers up to n using a segmented sieve, handling edge cases."
+Show the model loading bar, token streaming, verifier status, rollback animation on a FIX/REWRITE verdict, and the final verified state. Then click **Run Python Sandbox** so the demo ends with executable proof, not just generated text.
 ## Badge Targets
+- Llama Champion: 14B verifier served through llama.cpp.
+- Off-Brand: custom split-brain UI, rollback flash, status rail, token counter, and sandbox output.
 - Field Notes: publish a post-build architecture writeup and link it here.
+- Modal Awards: verifier and sandbox are both Modal-powered.
+The app is **local-first**, but not fully Off the Grid: the draft model runs in-browser, while verification intentionally uses Modal.
+## Current Status
+- HF Space: live under the `build-small-hackathon` org.
+- Local model: browser WebGPU loading works with quantized weights.
+- Verifier: Modal endpoint deployed.
+- Sandbox: Modal Python execution endpoint deployed.
+- Remaining submission work: demo video, social post, and Field Notes writeup.
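
Reviewer sketch (not part of the commit): the README's flowchart routes a draft through the verifier and, on a FIX or REWRITE verdict, swaps in the corrected block. A minimal Python model of that routing follows. The names `Verdict` and `reconcile`, and the idea that the verifier returns a label plus optional corrected code, are assumptions made for illustration; the diff does not show the actual verifier contract.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple

# Verdict labels taken from the README flowchart.
VERDICTS = {"PASS", "FIX", "REWRITE"}


@dataclass
class Verdict:
    """Hypothetical verifier response: a label plus optional corrected code."""
    label: str
    corrected: Optional[str] = None


def reconcile(draft: str, verify: Callable[[str], Verdict]) -> Tuple[str, bool]:
    """Return (final_code, rolled_back) for a local draft.

    PASS keeps the draft as-is. FIX and REWRITE both replace the draft
    with the verifier's corrected block, which is when the UI would
    play its rollback animation.
    """
    verdict = verify(draft)
    if verdict.label not in VERDICTS:
        raise ValueError(f"unknown verdict: {verdict.label!r}")
    if verdict.label == "PASS":
        return draft, False
    if verdict.corrected is None:
        raise ValueError(f"{verdict.label} verdict carries no corrected code")
    return verdict.corrected, True


# Usage with stand-in verifiers (no network calls).
print(reconcile("x = 1", lambda d: Verdict("PASS")))
print(reconcile("x = 1", lambda d: Verdict("FIX", "x = 2")))
```

Treating FIX and REWRITE identically at this level keeps the routing simple; any difference between them (a patch versus a full replacement) would live in how the verifier builds `corrected`.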

app.py CHANGED

@@ -43,9 +43,22 @@ def endpoint_url(url: str | None, path: str) -> str | None:
 custom_html = f"""
 <div id="split-brain-root">
-    <div class="split-topline">
-        <span>Local: WebGPU 1.5B</span>
-        <span>Cloud: Modal A10G 14B</span>
     </div>
     <div class="webgpu-notice" id="webgpu-warning" hidden>
         WebGPU not detected. Use Chrome 113+ on desktop for local inference.
@@ -55,7 +68,13 @@ custom_html = f"""
         <div class="loading-bar"><div class="loading-bar-fill" id="load-progress"></div></div>
         <span id="load-status" class="load-status">Model not loaded</span>
     </div>
-    <pre id="stream-display" class="code-stream">Waiting for model load...</pre>
     <div class="status-bar">
         <span id="status-text">Idle</span>
         <span id="token-count">0 tok/s</span>
@@ -342,9 +361,15 @@ with gr.Blocks(
     gr.HTML(
         """
         <section class="app-header">
-            <p class="eyebrow">Build Small Hackathon</p>
             <h1>Split-Brain Co-Pilot</h1>
-            <p>Draft locally in Chrome with a 1.5B WebGPU model. Verify in the background with a 14B Modal brain.</p>
         </section>
         <div class="space-init" id="space-init">Space initializing...</div>
         <script>

 custom_html = f"""
 <div id="split-brain-root">
+    <div class="brain-rail" aria-label="Split-brain architecture">
+        <div class="brain-node local-node">
+            <span class="brain-label">Local Draft</span>
+            <strong>WebGPU 1.5B</strong>
+            <small>fast browser stream</small>
+        </div>
+        <div class="brain-pulse" aria-hidden="true">
+            <span></span>
+            <span></span>
+            <span></span>
+        </div>
+        <div class="brain-node cloud-node">
+            <span class="brain-label">Cloud Check</span>
+            <strong>Modal A10G 14B</strong>
+            <small>llama.cpp verifier</small>
+        </div>
     </div>
     <div class="webgpu-notice" id="webgpu-warning" hidden>
         WebGPU not detected. Use Chrome 113+ on desktop for local inference.
         <div class="loading-bar"><div class="loading-bar-fill" id="load-progress"></div></div>
         <span id="load-status" class="load-status">Model not loaded</span>
     </div>
+    <div class="stream-shell">
+        <div class="stream-toolbar">
+            <span>Speculative draft</span>
+            <span id="stream-phase">Idle</span>
+        </div>
+        <pre id="stream-display" class="code-stream">Waiting for model load...</pre>
+    </div>
     <div class="status-bar">
         <span id="status-text">Idle</span>
         <span id="token-count">0 tok/s</span>
     gr.HTML(
         """
         <section class="app-header">
+            <p class="eyebrow">Build Small Hackathon · 15.5B parameters total</p>
             <h1>Split-Brain Co-Pilot</h1>
+            <p>One small model drafts in your browser. Another small model checks it on Modal. The UI shows the handoff, verdict, rollback, and executable proof.</p>
+            <div class="badge-row" aria-label="Project badges">
+                <span>WebGPU local-first</span>
+                <span>llama.cpp verifier</span>
+                <span>Modal sandbox</span>
+                <span>Custom Gradio UI</span>
+            </div>
         </section>
         <div class="space-init" id="space-init">Space initializing...</div>
         <script>
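
Reviewer sketch (not part of the commit): the new eyebrow hardcodes "15.5B parameters total", and the README repeats the same figure next to the 32B cap. One way to keep the two from drifting would be to derive the label from the model sizes. The helper name `budget_label` and the constants below are hypothetical; the sizes and the cap are the nominal values quoted in the README.

```python
# Nominal model sizes in billions of parameters, as quoted in the README.
LOCAL_MODEL_B = 1.5   # Qwen2.5-Coder-1.5B (browser draft model)
CLOUD_MODEL_B = 14.0  # Qwen2.5-Coder-14B-Instruct (Modal verifier)
PARAM_CAP_B = 32.0    # hackathon cap quoted in the README


def budget_label(sizes_b, cap_b=PARAM_CAP_B):
    """Sum the model sizes, check them against the cap, and format the label."""
    total = sum(sizes_b)
    if total > cap_b:
        raise ValueError(f"{total:g}B exceeds the {cap_b:g}B cap")
    return f"{total:g}B parameters total"


label = budget_label([LOCAL_MODEL_B, CLOUD_MODEL_B])
print(f"Build Small Hackathon · {label}")
```

The resulting string could then be interpolated into the header HTML instead of being typed by hand in both places.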

static/style.css CHANGED

@@ -1,14 +1,17 @@
 :root {
-    --bg: #0d1117;
-    --surface: #161b22;
-    --surface-2: #0f1720;
-    --border: #30363d;
-    --accent: #58a6ff;
-    --accent-warn: #f0883e;
     --text: #e6edf3;
-    --text-muted: #8b949e;
-    --green: #3fb950;
-    --red: #f85149;
 }
 body,
@@ -27,8 +30,9 @@ footer {
 }
 .app-header {
-    margin: 0 auto 20px;
-    max-width: 980px;
     text-align: center;
 }
@@ -37,20 +41,40 @@ footer {
     color: var(--text);
     font-size: clamp(32px, 6vw, 56px);
     letter-spacing: 0;
 }
 .app-header p {
-    margin: 0;
     color: var(--text-muted);
 }
 .app-header .eyebrow {
-    color: var(--accent);
     font-size: 12px;
     font-weight: 700;
     text-transform: uppercase;
 }
 .space-init {
     margin: 0 auto 12px;
     max-width: 980px;
@@ -62,17 +86,84 @@ footer {
     color: var(--text);
 }
-.split-topline,
 .status-bar {
     display: flex;
     gap: 12px;
     justify-content: space-between;
 }
-.split-topline {
-    margin-bottom: 10px;
     color: var(--text-muted);
-    font-size: 12px;
 }
 .webgpu-notice {
@@ -88,7 +179,7 @@ footer {
     grid-template-columns: auto 1fr;
     gap: 10px 12px;
     align-items: center;
-    margin-bottom: 12px;
 }
 .load-section.loaded {
@@ -121,7 +212,7 @@ footer {
 .loading-bar-fill {
     width: 0%;
     height: 100%;
-    background: var(--accent);
     transition: width 0.3s ease;
 }
@@ -131,14 +222,34 @@ footer {
     font-size: 12px;
 }
 .code-stream {
     box-sizing: border-box;
     min-height: 390px;
     max-height: 58vh;
     margin: 0;
     overflow: auto;
-    border: 1px solid var(--border);
-    border-radius: 8px 8px 0 0;
     background: var(--surface);
     color: var(--text);
     font-family: "JetBrains Mono", ui-monospace, SFMono-Regular, Menlo, Monaco, Consolas, monospace;
@@ -171,6 +282,7 @@ footer {
     color: var(--text-muted);
     font-size: 12px;
     padding: 10px 12px;
 }
 .status-warning,
@@ -201,12 +313,16 @@ button {
         grid-template-columns: 1fr;
     }
-    .split-topline,
     .status-bar {
         align-items: flex-start;
         flex-direction: column;
     }
     .code-stream {
         min-height: 320px;
     }

 :root {
+    --bg: #090d12;
+    --surface: #151922;
+    --surface-2: #101721;
+    --surface-3: #1b2430;
+    --border: #323b48;
+    --accent: #62a8ff;
+    --accent-2: #48d597;
+    --accent-warn: #e8a04f;
     --text: #e6edf3;
+    --text-muted: #9aa6b2;
+    --green: #48d597;
+    --red: #ff6565;
+    --shadow: rgba(0, 0, 0, 0.34);
 }
 body,
 }
 .app-header {
+    margin: 0 auto 22px;
+    max-width: 1060px;
+    padding: 8px 12px 0;
     text-align: center;
 }
     color: var(--text);
     font-size: clamp(32px, 6vw, 56px);
     letter-spacing: 0;
+    line-height: 1.02;
 }
 .app-header p {
+    margin: 0 auto;
+    max-width: 820px;
     color: var(--text-muted);
+    line-height: 1.55;
 }
 .app-header .eyebrow {
+    color: var(--accent-2);
     font-size: 12px;
     font-weight: 700;
     text-transform: uppercase;
 }
+.badge-row {
+    display: flex;
+    flex-wrap: wrap;
+    gap: 8px;
+    justify-content: center;
+    margin-top: 16px;
+}
+.badge-row span {
+    border: 1px solid var(--border);
+    border-radius: 999px;
+    background: rgba(255, 255, 255, 0.035);
+    color: var(--text);
+    font-size: 12px;
+    padding: 7px 10px;
+}
 .space-init {
     margin: 0 auto 12px;
     max-width: 980px;
     color: var(--text);
 }
+.brain-rail,
 .status-bar {
     display: flex;
     gap: 12px;
     justify-content: space-between;
 }
+.brain-rail {
+    align-items: stretch;
+    margin-bottom: 14px;
+}
+.brain-node {
+    flex: 1 1 0;
+    min-width: 0;
+    border: 1px solid var(--border);
+    border-radius: 8px;
+    background: var(--surface);
+    box-shadow: 0 14px 34px var(--shadow);
+    padding: 12px;
+}
+.brain-node strong,
+.brain-node small,
+.brain-label {
+    display: block;
+}
+.brain-node strong {
+    margin: 4px 0;
+    color: var(--text);
+    font-size: 15px;
+}
+.brain-node small,
+.brain-label {
     color: var(--text-muted);
+    font-size: 11px;
+}
+.brain-label {
+    font-weight: 700;
+    text-transform: uppercase;
+}
+.local-node {
+    border-left: 3px solid var(--accent);
+}
+.cloud-node {
+    border-left: 3px solid var(--accent-2);
+}
+.brain-pulse {
+    display: grid;
+    flex: 0 0 72px;
+    grid-template-columns: repeat(3, 1fr);
+    gap: 6px;
+    place-items: center;
+}
+.brain-pulse span {
+    width: 9px;
+    height: 9px;
+    border-radius: 999px;
+    background: var(--accent-warn);
+    box-shadow: 0 0 18px rgba(232, 160, 79, 0.48);
+    opacity: 0.82;
+}
+.brain-pulse span:nth-child(2) {
+    background: var(--accent);
+    box-shadow: 0 0 18px rgba(98, 168, 255, 0.5);
+}
+.brain-pulse span:nth-child(3) {
+    background: var(--accent-2);
+    box-shadow: 0 0 18px rgba(72, 213, 151, 0.5);
 }
 .webgpu-notice {
     grid-template-columns: auto 1fr;
     gap: 10px 12px;
     align-items: center;
+    margin-bottom: 14px;
 }
 .load-section.loaded {
 .loading-bar-fill {
     width: 0%;
     height: 100%;
+    background: linear-gradient(90deg, var(--accent), var(--accent-2));
     transition: width 0.3s ease;
 }
     font-size: 12px;
 }
+.stream-shell {
+    overflow: hidden;
+    border: 1px solid var(--border);
+    border-radius: 8px 8px 0 0;
+    background: var(--surface);
+    box-shadow: 0 16px 38px var(--shadow);
+}
+.stream-toolbar {
+    display: flex;
+    gap: 12px;
+    align-items: center;
+    justify-content: space-between;
+    border-bottom: 1px solid var(--border);
+    background: var(--surface-3);
+    color: var(--text-muted);
+    font-size: 12px;
+    padding: 9px 12px;
+}
 .code-stream {
     box-sizing: border-box;
     min-height: 390px;
     max-height: 58vh;
     margin: 0;
     overflow: auto;
+    border: 0;
+    border-radius: 0;
     background: var(--surface);
     color: var(--text);
     font-family: "JetBrains Mono", ui-monospace, SFMono-Regular, Menlo, Monaco, Consolas, monospace;
     color: var(--text-muted);
     font-size: 12px;
     padding: 10px 12px;
+    box-shadow: 0 16px 38px var(--shadow);
 }
 .status-warning,
         grid-template-columns: 1fr;
     }
+    .brain-rail,
     .status-bar {
         align-items: flex-start;
         flex-direction: column;
     }
+    .brain-pulse {
+        display: none;
+    }
     .code-stream {
         min-height: 320px;
     }

static/ui.js CHANGED

@@ -14,10 +14,17 @@ export function appendToken(token) {
 export function setStatus(text, type = "neutral") {
     const el = document.getElementById("status-text");
-    if (!el) return;
-    el.textContent = text;
-    el.className = `status-${type}`;
 }
 export function setVerifierStatus(verdict) {

 export function setStatus(text, type = "neutral") {
     const el = document.getElementById("status-text");
+    const phase = document.getElementById("stream-phase");
+    if (el) {
+        el.textContent = text;
+        el.className = `status-${type}`;
+    }
+    if (phase) {
+        phase.textContent = text;
+        phase.className = `status-${type}`;
+    }
 }
 export function setVerifierStatus(verdict) {