Spaces:
Running
Running
File size: 11,890 Bytes
9ac80bb 2d635f4 43c8912 a036b27 9ac80bb 1433b16 9ac80bb 1433b16 9ac80bb 1433b16 6a55f00 626c044 e4a872b 1433b16 cf2d7d2 49b5a13 2a4077f cf2d7d2 3dffa87 1433b16 9ac80bb 1433b16 2a4077f 1433b16 5054925 1433b16 5de5a84 1433b16 42bbd85 1433b16 42bbd85 1433b16 42bbd85 1433b16 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 | ---
title: CodeFlow
emoji: π
colorFrom: indigo
colorTo: blue
sdk: gradio
python_version: '3.13'
sdk_version: 6.16.0
app_file: app.py
pinned: true
license: mit
short_description: Turn code into a readable Mermaid.js flowchart π!
tags:
- track:backyard
- achievement:offgrid
- achievement:sharing
- achievement:offbrand
- achievement:llama
- achievement:fieldnotes
- build-small-hackathon
- backyard-ai
- llama-cpp
- field-notes
- sharing-is-caring
- off-brand
- off-the-grid
- code
- mermaid.js
- flowchart
- small-models
- seq2seq
- gradio
- agentic
---
# π CodeFlow
**Paste code β read its logic as a flowchart.** A 30B coder model runs entirely on **CPU via llama.cpp** to translate source code into a clean, animated [Mermaid.js](https://mermaid.js.org/) control-flow diagram β with each node wired back to the exact lines it came from.
### π Links
[π **Live Space**][space] Β· [βΆοΈ **Demo Video**][video] Β· [π¦ **Social Post**][social] Β· [π **Field Notes (blog)**][blog] Β· [π **Agent Traces**][traces]
<!-- βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
β FILL THESE IN β replace each REPLACE_ME with your real URL. β
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ -->
[space]: https://huggingface.co/spaces/build-small-hackathon/CodeFlow "Hugging Face Space"
[video]: https://youtu.be/R5GbpN9FVxo "Demo video"
[social]: https://www.linkedin.com/feed/update/urn:li:share:7471327684539785217/ "Social post"
[blog]: https://huggingface.co/blog/build-small-hackathon/codeflow-field-notes "Field notes / blog post"
[traces]: https://huggingface.co/datasets/build-small-hackathon/codeflow-agent-traces "Agent traces dataset"
---
## β The Problem
Reading unfamiliar code means simulating its control flow in your head β chasing branches, loops, and early returns line by line. That's slow, error-prone, and gets worse the deeper the nesting. Existing "code β diagram" tools are usually rigid AST parsers (brittle, language-locked) or cloud LLM APIs (your code leaves the building).
**CodeFlow** turns any snippet into a scannable flowchart you can audit at a glance β generated by a real language model that runs **100% locally**, so nothing is sent to an external API.
## βοΈ How It Works
```
Paste code βββΆ Generate βββΆ POST /generate_flowchart (Gradio API)
β
number the source lines + structured system prompt
β
Qwen3-Coder-30B-A3B (llama.cpp Β· CPU)
β
<thinking> β¦reasoningβ¦ </thinking>
graph TD β¦ nodes & edges β¦
<linemap> A:1 B:2 C:3-4 </linemap>
β
strip reasoning Β· parse + validate the line-map Β· sanitize labels
β
{ mermaid, linemap } βββΆ append agent_traces.jsonl
β
Mermaid render + "trace-the-path" reveal + node β code linking
```
1. You paste code (or pick a pre-rendered example) into the **CodeMirror** editor and hit **Generate**.
2. The backend numbers the source lines and sends them with a strict system prompt to **Qwen3-Coder** running on **llama.cpp**.
3. The model returns hidden `<thinking>`, the Mermaid `graph`, and a `<linemap>` mapping every node to its source line(s).
4. The server strips the reasoning, **validates** the line-map against the source, sanitizes labels for Mermaid, and returns `{ mermaid, linemap }`.
5. The frontend renders the diagram with a **trace-the-path reveal** that flows out of a persistent Start node while the canvas scrolls along in real time.
6. **Node β code linking:** hover a node to highlight its source lines, click a node to jump-and-edit them, or move your cursor over a line to light up the matching node.
7. Every generation is captured as a structured **agent trace** (`/traces`).
## π§° Tech Stack
| Layer | What it is | Used for |
|---|---|---|
| **Model** | [Qwen3-Coder-30B-A3B-Instruct](https://huggingface.co/Qwen) (Mixture-of-Experts) | Code β Mermaid + line-map generation |
| **Quantization** | [Unsloth](https://huggingface.co/unsloth) Dynamic **UD-Q3_K_XL** GGUF (~3-bit) | Shrinks the 30B model to run on CPU |
| **Inference** | [`llama-cpp-python`](https://github.com/abetlen/llama-cpp-python) (llama.cpp) | Local CPU inference (`n_ctx=4096`) |
| **Model fetch** | `huggingface_hub` | Downloads the GGUF on first run |
| **Server** | [Gradio](https://www.gradio.app/) `gr.Server` + FastAPI | `/generate_flowchart` API, `/` UI, `/traces` |
| **Frontend** | A single self-contained `frontend.html` (vanilla JS + CSS custom properties) | Editor, diagram, animation, theming |
| **Editor** | [CodeMirror 6](https://codemirror.net/) β **vendored** bundle (`static/cm.bundle.js`) | Syntax-highlighted code input |
| **Diagrams** | [Mermaid.js 10](https://mermaid.js.org/) β **vendored** UMD (`static/mermaid.min.js`) | Flowchart rendering |
| **Animation** | Web Animations API | Trace-the-path reveal + theme crossfade |
| **Type** | Fraunces Β· Hanken Grotesk Β· JetBrains Mono β **vendored** woff2 (`static/fonts/`) | Custom, non-default look |
| **Assets** | All JS/CSS/fonts bundled into `static/` (no CDN at runtime) | True offline operation |
| **Observability** | Hand-rolled JSONL agent traces | One trace per generation, served at `/traces` |
| **Tests** | `smoke-test.sh` (headless Chrome) | 13 build/render checks |
| **Deploy** | Hugging Face Spaces | Hosting |
## π’ Total Parameters
CodeFlow is driven by **Qwen3-Coder-30B-A3B-Instruct** β a **Mixture-of-Experts** model with:
- **β 30.5 billion total parameters**
- **β 3.3 billion active parameters per token** (128 experts, 8 activated)
It's served as an **Unsloth Dynamic ~3-bit (UD-Q3_K_XL) GGUF**, which compresses those 30B weights to a CPU-runnable footprint (~13 GB on disk) β letting a 30B-class model generate diagrams **off the grid**, with no GPU and no external API.
## π
Badges (5 / 6)
These map to the Space tags above.
| Badge | How CodeFlow earns it |
|---|---|
| π **Off the Grid** | **No external API or CDN at runtime β period.** The model runs fully locally (Qwen3-Coder GGUF on CPU via llama.cpp), and *every* frontend asset (Mermaid, CodeMirror, the Gradio client, all fonts) is vendored into `static/`. The Gradio share tunnel is off (`share=False`). The **only** network call in the whole project is the one-time model download at startup. The UI even runs fully offline from `file://`. |
| π¨ **Off-Brand** | **Zero default-Gradio look.** A bespoke single-file UI: custom "Pine & Sage" palette (one-word rust fallback), Fraunces + Hanken Grotesk type, a hand-drawn decision-node logo, restyled Mermaid nodes, and a trace-the-path reveal animation β deliberately designed *not* to look templated. |
| π **Field Notes** | See the [blog post][blog]. |
| π€ **Sharing is Caring** | Open-source under **MIT**, a public Space, plus a [social post][social] sharing the process and learnings. |
| π€ **Agentic** | Every model generation is captured as a structured agent trace (input code, the model's reasoning, output, token usage, latency), downloadable at [`/traces`][traces]. |
## π₯ Demo
βΆοΈ **[Watch the demo video][video]** β a full walkthrough of CodeFlow in action.
## π» Run It Locally
> First launch downloads the **~13 GB GGUF** from Hugging Face. CPU inference is slow (cold generations can take minutes) β the built-in **examples render instantly** because their diagrams are pre-computed.
```bash
# 1. Clone
git clone https://huggingface.co/spaces/build-small-hackathon/CodeFlow CodeFlow
cd CodeFlow
# 2. Create a virtual env
python -m venv .venv
source .venv/bin/activate # Windows: .venv\Scripts\activate
# 3. Install deps (uses a prebuilt CPU wheel for llama-cpp-python)
pip install -r requirements.txt
# 4. Run β opens a local Gradio URL
python app.py
```
Then open the printed URL. **Preview the UI without the model** by opening `frontend.html` directly in a browser (`file://`) β fully offline, since all assets are vendored in `static/`; the example presets render their diagrams instantly.
> **Rebuilding the vendored bundles** (optional): the CodeMirror + Gradio-client bundles in `static/` are produced by `build/build.sh` (needs Node). Mermaid and the fonts are downloaded into `static/` as well. You never need this to *run* the app β only to regenerate the bundles.
**Endpoints:** `/` (UI) Β· `/generate_flowchart` (API) Β· `/traces` (download all agent traces as JSONL).
## ποΈ Repository Structure
```
CodeFlow/
βββ app.py # Gradio + FastAPI server: loads the model and exposes
β # /generate_flowchart (API), / (UI), /static, /traces
βββ frontend.html # Self-contained UI β CodeMirror editor, Mermaid render,
β # trace-the-path animation, nodeβcode linking, theming
βββ static/ # Vendored frontend assets β NO CDN at runtime
β βββ mermaid.min.js # Mermaid (UMD, ~3.2 MB)
β βββ cm.bundle.js # CodeMirror 6 (single IIFE bundle)
β βββ gradio-client.js # @gradio/client (IIFE bundle)
β βββ fonts.css # @font-face β local woff2
β βββ fonts/ # Fraunces Β· Hanken Grotesk Β· JetBrains Mono (woff2)
βββ build/ # Reproducible bundle build (Node) β build.sh + entry files
βββ requirements.txt # Python deps (CPU llama-cpp-python wheel, gradio, hub)
βββ smoke-test.sh # Headless-Chrome smoke test (13 checks)
βββ notes-for-blog.md # Field Notes β the full build log
βββ README.md # You are here
βββ LICENSE # MIT
```
## β οΈ Limitations
- **CPU inference is slow.** A 30B model on CPU means cold generations can take minutes; the demo leans on pre-rendered examples for instant feedback.
- **3-bit quantization** trades some fidelity for the ability to run a 30B model at all β occasional imperfect diagrams.
- **4096-token context** β very large files won't fit; works best on functions/snippets.
- **Line-map depends on the model.** The `<linemap>` is LLM-generated; the server validates and drops bad entries, so nodeβcode links can be partial on tricky code.
- **Paraphrased labels.** Nodes describe logic in plain words (no raw code), so they read cleanly but aren't verbatim.
- **Mermaid parse failures** on unusual syntax are possible (the raw output is shown so nothing is lost).
- **Ephemeral traces on Spaces.** `agent_traces.jsonl` lives on the runtime filesystem and resets on restart/rebuild β download it before then.
## π Credits
- **Model:** [Qwen3-Coder](https://huggingface.co/Qwen) (Qwen Team, Alibaba) β GGUF quant by [Unsloth](https://huggingface.co/unsloth).
- **Inference:** [llama.cpp](https://github.com/ggml-org/llama.cpp) via [`llama-cpp-python`](https://github.com/abetlen/llama-cpp-python) (Andrei Betlen).
- **App framework:** [Gradio](https://www.gradio.app/) (Hugging Face).
- **Diagrams:** [Mermaid.js](https://mermaid.js.org/) Β· **Editor:** [CodeMirror](https://codemirror.net/).
- **Type:** Fraunces, Hanken Grotesk, JetBrains Mono ([Google Fonts](https://fonts.google.com/), SIL OFL).
- **Built for** the Build Small Hackathon.
## π License
Released under the **MIT License** β see [`LICENSE`](LICENSE). Β© 2026 Rishi Jain.
|