Spaces:

tejasashinde
/

HackathonSpaceRecommender

Running

App Files Files Community

HackathonSpaceRecommender / data /org_readmes /build-small-hackathon__CodeFlow /README.md

tejasashinde

Initial commit

1e34d32 6 days ago

preview code

Raw

History Blame Contribute Delete

12.9 kB

	---
	title: CodeFlow
	emoji: 📊
	colorFrom: indigo
	colorTo: blue
	sdk: gradio
	python_version: '3.13'
	sdk_version: 6.16.0
	app_file: app.py
	pinned: true
	license: mit
	short_description: Turn code into a readable Mermaid.js flowchart 📊!
	tags:
	- track:backyard
	- achievement:offgrid
	- achievement:sharing
	- achievement:offbrand
	- achievement:llama
	- achievement:fieldnotes
	- achievement:welltuned
	- build-small-hackathon
	- backyard-ai
	- llama-cpp
	- field-notes
	- sharing-is-caring
	- off-brand
	- off-the-grid
	- code
	- mermaid.js
	- flowchart
	- small-models
	- seq2seq
	- gradio
	- agentic
	---

	# 📊 CodeFlow

	Paste code → read its logic as a flowchart. A 30B coder model runs entirely on CPU via llama.cpp to translate source code into a clean, animated [Mermaid.js](https://mermaid.js.org/) control-flow diagram — with each node wired back to the exact lines it came from.

	### 🔗 Links

	[🚀 Live Space][space] · [▶️ Demo Video][video] · [🐦 Social Post][social] · [📓 Field Notes (blog)][blog] · [🔍 Agent Traces][traces] · [🎛️ Fine-Tuned Model][model]

	[space]: https://huggingface.co/spaces/build-small-hackathon/CodeFlow "Hugging Face Space"
	[video]: https://youtu.be/R5GbpN9FVxo "Demo video"
	[social]: https://www.linkedin.com/feed/update/urn:li:share:7471327684539785217/ "Social post"
	[blog]: https://huggingface.co/blog/build-small-hackathon/codeflow-field-notes "Field notes / blog post"
	[traces]: https://huggingface.co/datasets/build-small-hackathon/codeflow-agent-traces "Agent traces dataset"
	[model]: https://huggingface.co/build-small-hackathon/codeflow-qwen-3-finetuning "Fine-tuned model"

	---

	## ❓ The Problem

	Reading unfamiliar code means simulating its control flow in your head — chasing branches, loops, and early returns line by line. That's slow, error-prone, and gets worse the deeper the nesting. Existing "code → diagram" tools are usually rigid AST parsers (brittle, language-locked) or cloud LLM APIs (your code leaves the building).

	CodeFlow turns any snippet into a scannable flowchart you can audit at a glance — generated by a real language model that runs 100% locally, so nothing is sent to an external API.

	## ⚙️ How It Works

	```
	Paste code ──▶ Generate ──▶ POST /generate_flowchart (Gradio API)
	│
	number the source lines + structured system prompt
	│
	CodeFlow fine-tune of Qwen3-Coder-30B-A3B (llama.cpp · CPU)
	│
	<thinking> …reasoning… </thinking>
	graph TD … nodes & edges …
	<linemap> A:1 B:2 C:3-4 </linemap>
	│
	strip reasoning · parse + validate the line-map · sanitize labels
	│
	{ mermaid, linemap } ──▶ append agent_traces.jsonl
	│
	Mermaid render + "trace-the-path" reveal + node ↔ code linking
	```

	1. You paste code (or pick a pre-rendered example) into the CodeMirror editor and hit Generate.
	2. The backend numbers the source lines and sends them with a strict system prompt to the CodeFlow fine-tune of Qwen3-Coder running on llama.cpp.
	3. The model returns hidden `<thinking>`, the Mermaid `graph`, and a `<linemap>` mapping every node to its source line(s).
	4. The server strips the reasoning, validates the line-map against the source, sanitizes labels for Mermaid, and returns `{ mermaid, linemap }`.
	5. The frontend renders the diagram with a trace-the-path reveal that flows out of a persistent Start node while the canvas scrolls along in real time.
	6. Node ↔ code linking: hover a node to highlight its source lines, click a node to jump-and-edit them, or move your cursor over a line to light up the matching node.
	7. Every generation is captured as a structured agent trace (`/traces`).

	## 🎛️ Fine-Tuning

	CodeFlow runs a [LoRA fine-tune][model] of Qwen3-Coder-30B-A3B-Instruct (≈30.5B params), specialized for the code → Mermaid + `<linemap>` task rather than relying on the base model's general coding ability.

	- Data: 2,400 synthetic examples (2,208 train / 192 val — 8% holdout), built from 22 control-flow templates across Python, JavaScript, C++, and C.
	- Method: LoRA `r=16, α=32` on the attention + MLP projections, bf16, cosine schedule — then merged and exported to a Q3_K_L GGUF for CPU inference.
	- Validation: the holdout is hard-validated — generated outputs are syntax-checked / compiled, not just eyeballed.

	See the [model card][model] for the full data engine, `finetune.py` options, and dataset preview.

	## 🧰 Tech Stack

	\| Layer \| What it is \| Used for \|
	\|---\|---\|---\|
	\| Model \| [CodeFlow fine-tune][model] of [Qwen3-Coder-30B-A3B-Instruct](https://huggingface.co/Qwen) (Mixture-of-Experts) \| Code → Mermaid + line-map generation \|
	\| Fine-tuning \| LoRA SFT (`r=16, α=32`) on attention + MLP projections, merged to GGUF \| Specializes the base model for the code → Mermaid + line-map task \|
	\| Quantization \| Q3_K_L GGUF (~3-bit) \| Shrinks the 30B model to run on CPU \|
	\| Inference \| [`llama-cpp-python`](https://github.com/abetlen/llama-cpp-python) (llama.cpp) \| Local CPU inference (`n_ctx=4096`) \|
	\| Model fetch \| `huggingface_hub` \| Downloads the GGUF on first run \|
	\| Server \| [Gradio](https://www.gradio.app/) `gr.Server` + FastAPI \| `/generate_flowchart` API, `/` UI, `/traces` \|
	\| Frontend \| A single self-contained `frontend.html` (vanilla JS + CSS custom properties) \| Editor, diagram, animation, theming \|
	\| Editor \| [CodeMirror 6](https://codemirror.net/) — vendored bundle (`static/cm.bundle.js`) \| Syntax-highlighted code input \|
	\| Diagrams \| [Mermaid.js 10](https://mermaid.js.org/) — vendored UMD (`static/mermaid.min.js`) \| Flowchart rendering \|
	\| Animation \| Web Animations API \| Trace-the-path reveal + theme crossfade \|
	\| Type \| Fraunces · Hanken Grotesk · JetBrains Mono — vendored woff2 (`static/fonts/`) \| Custom, non-default look \|
	\| Assets \| All JS/CSS/fonts bundled into `static/` (no CDN at runtime) \| True offline operation \|
	\| Observability \| Hand-rolled JSONL agent traces \| One trace per generation, served at `/traces` \|
	\| Tests \| `smoke-test.sh` (headless Chrome) \| 13 build/render checks \|
	\| Deploy \| Hugging Face Spaces \| Hosting \|

	## 🔢 Total Parameters

	CodeFlow is driven by a [LoRA fine-tune][model] of Qwen3-Coder-30B-A3B-Instruct — a Mixture-of-Experts model with:

	- ≈ 30.5 billion total parameters (well under the 32B cap)
	- ≈ 3.3 billion active parameters per token (128 experts, 8 activated)

	It's served as a ~3-bit (Q3_K_L) GGUF, which compresses those 30B weights to a CPU-runnable footprint (~13 GB on disk) — letting a 30B-class model generate diagrams off the grid, with no GPU and no external API.

	## 🏅 Badges (6 / 6)

	These map to the Space tags above.

	\| Badge \| How CodeFlow earns it \|
	\|---\|---\|
	\| 🔌 Off the Grid \| No external API or CDN at runtime — period. The model runs fully locally (Qwen3-Coder GGUF on CPU via llama.cpp), and every frontend asset (Mermaid, CodeMirror, the Gradio client, all fonts) is vendored into `static/`. The Gradio share tunnel is off (`share=False`). The only network call in the whole project is the one-time model download at startup. The UI even runs fully offline from `file://`. \|
	\| 🎨 Off-Brand \| Zero default-Gradio look. A bespoke single-file UI: custom "Pine & Sage" palette (one-word rust fallback), Fraunces + Hanken Grotesk type, a hand-drawn decision-node logo, restyled Mermaid nodes, and a trace-the-path reveal animation — deliberately designed not to look templated. \|
	\| 📓 Field Notes \| See the [blog post][blog]. \|
	\| 🤝 Sharing is Caring \| Open-source under MIT, a public Space, plus a [social post][social] sharing the process and learnings. \|
	\| 🤖 Agentic \| Every model generation is captured as a structured agent trace (input code, the model's reasoning, output, token usage, latency), downloadable at [`/traces`][traces]. \|
	\| 🎛️ Well-Tuned \| A [LoRA fine-tune][model] of Qwen3-Coder-30B-A3B-Instruct (≈30.5B params — under the 32B cap), specialized for the code → Mermaid + `<linemap>` task and shipped as the GGUF the Space actually runs. \|

	## 🎥 Demo

	▶️ [Watch the demo video][video] — a full walkthrough of CodeFlow in action.

	## 💻 Run It Locally

	> First launch downloads the ~13 GB GGUF from Hugging Face. CPU inference is slow (cold generations can take minutes) — the built-in examples render instantly because their diagrams are pre-computed.

	```bash
	# 1. Clone
	git clone https://huggingface.co/spaces/build-small-hackathon/CodeFlow CodeFlow
	cd CodeFlow

	# 2. Create a virtual env
	python -m venv .venv
	source .venv/bin/activate # Windows: .venv\Scripts\activate

	# 3. Install deps (uses a prebuilt CPU wheel for llama-cpp-python)
	pip install -r requirements.txt

	# 4. Run — opens a local Gradio URL
	python app.py
	```

	Then open the printed URL. Preview the UI without the model by opening `frontend.html` directly in a browser (`file://`) — fully offline, since all assets are vendored in `static/`; the example presets render their diagrams instantly.

	> Rebuilding the vendored bundles (optional): the CodeMirror + Gradio-client bundles in `static/` are produced by `build/build.sh` (needs Node). Mermaid and the fonts are downloaded into `static/` as well. You never need this to run the app — only to regenerate the bundles.

	Endpoints: `/` (UI) · `/generate_flowchart` (API) · `/traces` (download all agent traces as JSONL).

	## 🗂️ Repository Structure

	```
	CodeFlow/
	├── app.py # Gradio + FastAPI server: loads the model and exposes
	│ # /generate_flowchart (API), / (UI), /static, /traces
	├── frontend.html # Self-contained UI — CodeMirror editor, Mermaid render,
	│ # trace-the-path animation, node↔code linking, theming
	├── static/ # Vendored frontend assets — NO CDN at runtime
	│ ├── mermaid.min.js # Mermaid (UMD, ~3.2 MB)
	│ ├── cm.bundle.js # CodeMirror 6 (single IIFE bundle)
	│ ├── gradio-client.js # @gradio/client (IIFE bundle)
	│ ├── fonts.css # @font-face → local woff2
	│ └── fonts/ # Fraunces · Hanken Grotesk · JetBrains Mono (woff2)
	├── build/ # Reproducible bundle build (Node) — build.sh + entry files
	├── requirements.txt # Python deps (CPU llama-cpp-python wheel, gradio, hub)
	├── smoke-test.sh # Headless-Chrome smoke test (13 checks)
	├── notes-for-blog.md # Field Notes — the full build log
	├── README.md # You are here
	└── LICENSE # MIT
	```

	## ⚠️ Limitations

	- CPU inference is slow. A 30B model on CPU means cold generations can take minutes; the demo leans on pre-rendered examples for instant feedback.
	- 3-bit quantization trades some fidelity for the ability to run a 30B model at all — occasional imperfect diagrams.
	- 4096-token context — very large files won't fit; works best on functions/snippets.
	- Line-map depends on the model. The `<linemap>` is LLM-generated; the server validates and drops bad entries, so node↔code links can be partial on tricky code.
	- Paraphrased labels. Nodes describe logic in plain words (no raw code), so they read cleanly but aren't verbatim.
	- Mermaid parse failures on unusual syntax are possible (the raw output is shown so nothing is lost).
	- Ephemeral traces on Spaces. `agent_traces.jsonl` lives on the runtime filesystem and resets on restart/rebuild — download it before then.

	## 🙏 Credits

	- Model: [CodeFlow fine-tune][model] of [Qwen3-Coder](https://huggingface.co/Qwen) (Qwen Team, Alibaba), built with [Unsloth](https://huggingface.co/unsloth).
	- Inference: [llama.cpp](https://github.com/ggml-org/llama.cpp) via [`llama-cpp-python`](https://github.com/abetlen/llama-cpp-python) (Andrei Betlen).
	- App framework: [Gradio](https://www.gradio.app/) (Hugging Face).
	- Diagrams: [Mermaid.js](https://mermaid.js.org/) · Editor: [CodeMirror](https://codemirror.net/).
	- Type: Fraunces, Hanken Grotesk, JetBrains Mono ([Google Fonts](https://fonts.google.com/), SIL OFL).
	- Built for the Build Small Hackathon.

	## 📄 License

	Released under the MIT License — see [`LICENSE`](LICENSE). © 2026 Rishi Jain.