| title: smolcode | |
| emoji: 🤖 | |
| colorFrom: purple | |
| colorTo: indigo | |
| sdk: gradio | |
| sdk_version: 5.50.0 | |
| python_version: "3.12" | |
| app_file: app.py | |
| pinned: false | |
| license: apache-2.0 | |
| short_description: A tiny local model that writes code, runs it, and fixes it. | |
| tags: | |
| - build-small-hackathon | |
| - agent | |
| - code-generation | |
| - gradio | |
| # smolcode 🤖 | |
| **A tiny local model that writes code, runs it, and fixes it until it works.** | |
| smolcode is an *agentic* coding assistant built for **small** language models. Instead of | |
| autocompleting, it runs a **plan → write → execute → repair** loop: it writes a file, runs | |
| it in a sandbox, reads the real error, and iterates until a test passes, all on a model small | |
| enough to run on your own machine (a ≤4B model on a laptop, scaling up to 32B on a | |
| workstation). **No cloud APIs.** | |
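The write, execute, repair cycle described above can be sketched in a few lines of Python. This is a minimal illustration, not smolcode's Rust implementation: `run_python`, `repair_loop`, and `fake_model` are hypothetical names, the model is a stub, and a plain subprocess stands in for the sandbox (it provides no real isolation).

```python
import subprocess
import sys
import tempfile
from pathlib import Path
from typing import Callable, Optional, Tuple


def run_python(path: Path, timeout: float = 10.0) -> Tuple[bool, str]:
    """Run a script in a child process; return (passed, combined output)."""
    try:
        proc = subprocess.run(
            [sys.executable, str(path)],
            capture_output=True,
            text=True,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return False, f"timed out after {timeout}s"
    return proc.returncode == 0, proc.stdout + proc.stderr


def repair_loop(
    task: str,
    generate: Callable[[str, Optional[str]], str],
    max_rounds: int = 3,
) -> Tuple[bool, int]:
    """Write, run, and repair until the script exits cleanly.

    `generate(task, last_error)` returns source code; `last_error` is None
    on the first round and the real interpreter output afterwards.
    Returns (verified, rounds_used).
    """
    error: Optional[str] = None
    with tempfile.TemporaryDirectory() as workdir:
        script = Path(workdir) / "solution.py"
        for round_no in range(1, max_rounds + 1):
            script.write_text(generate(task, error))
            passed, output = run_python(script)
            if passed:
                return True, round_no
            error = output  # feed the real error back to the model
    return False, max_rounds


def fake_model(task: str, last_error: Optional[str]) -> str:
    """Stub model: buggy on the first try, fixed once it sees an error."""
    body = "a - b" if last_error is None else "a + b"
    return f"def add(a, b):\n    return {body}\n\nassert add(2, 3) == 5\n"


if __name__ == "__main__":
    print(repair_loop("write add() and test it", fake_model))
```

The stub fails its own assertion on the first round, so the loop needs a second round before it reports the task as verified.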
| Built for the [Hugging Face × Gradio **Build Small** Hackathon](https://huggingface.co/build-small-hackathon). | |
| ## Why it's a "Build Small" entry | |
| - **Agentic on a 3B model.** The loop, not the model size, does the work. A ≤4B model | |
| drives tool calls reliably enough to write, run, and self-correct code. | |
| - **Local-first & private.** Talks to any OpenAI-compatible endpoint (Ollama, llama.cpp). | |
| Nothing leaves your machine. | |
| - **Specialty routing.** A 2D router classifies tasks into 16 language/function | |
| families and escalates within each family's fine-tuned ladder before falling back | |
| to bigger Granite models. | |
| - **Fine-tuned tiny coder.** We fine-tuned **Qwen2.5-Coder-1.5B** to emit native tool calls | |
| so a ≤2B model can be the cheap entry tier. It is published at | |
| [`seanpoyner/smolcode-coder-1.5b-tools`](https://huggingface.co/seanpoyner/smolcode-coder-1.5b-tools). | |
| - **Rust core.** Agent loop, tool execution, and tracing run through | |
| [**LiteForge**](https://github.com/seanpoyner/liteforge) and **smolcode-core** | |
| (Rust/PyO3). Gradio is the (required) shell; the brain is Rust. | |
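The specialty-routing bullet above describes a ladder: try the family's small fine-tuned tiers first, then fall back to larger general models. Below is a rough Python sketch of that escalation order. Everything named here is an assumption for illustration: the family keys, the model tags, the keyword classifier, and the `solve` callback are hypothetical and are not taken from the smolcode codebase.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical ladders keyed by (language, function); the README does not
# list the real 16 families or their model tags.
FAMILY_LADDERS: Dict[Tuple[str, str], List[str]] = {
    ("python", "testing"): ["py-test-1.5b", "py-test-3b"],
    ("python", "general"): ["py-general-1.5b"],
}
# Placeholder tags for the larger fallback models.
FALLBACK_LADDER: List[str] = ["granite-small", "granite-large"]


def classify(task: str) -> Tuple[str, str]:
    """Toy stand-in for the 2D router: returns (language, function)."""
    function = "testing" if "test" in task.lower() else "general"
    return ("python", function)


def route(
    task: str, solve: Callable[[str, str], bool]
) -> Tuple[Optional[str], bool]:
    """Try each tier in order; stop at the first one that verifies.

    `solve(model, task)` returns True when that model's answer passed
    its check. Returns (winning_model, verified).
    """
    ladder = FAMILY_LADDERS.get(classify(task), []) + FALLBACK_LADDER
    for model in ladder:
        if solve(model, task):
            return model, True
    return None, False


if __name__ == "__main__":
    only_3b = lambda model, task: model == "py-test-3b"
    print(route("write and test a parser", only_3b))
```

Escalating only on a failed verification keeps the cheapest tier in use for every task it can actually solve.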
| ## How to use this Space | |
| 1. Type a coding task, e.g. *"write a function that validates an email and test it."* | |
| 2. Watch the **agent trace** stream live: `write_file → run_python → (error) → fix → pass`. | |
| 3. The **router** badge shows which tier solved it and whether it's **✓ verified**. | |
| 4. Tick **⚡ fan out** and enter several lines to run independent tasks as **parallel subagents**. | |
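The fan-out option in step 4 treats each input line as its own task. A minimal sketch of that idea, assuming a thread pool and a hypothetical `run_agent` callable (the Space's actual scheduling lives in the Rust core and may differ):

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Tuple


def fan_out(
    text: str, run_agent: Callable[[str], str], max_workers: int = 4
) -> List[Tuple[str, str]]:
    """Run one agent per non-empty input line, concurrently.

    Returns (task, result) pairs in the same order as the input lines.
    """
    tasks = [line.strip() for line in text.splitlines() if line.strip()]
    if not tasks:
        return []
    workers = min(max_workers, len(tasks))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves input order even if tasks finish out of order.
        results = list(pool.map(run_agent, tasks))
    return list(zip(tasks, results))


if __name__ == "__main__":
    demo = "reverse a string\n\nparse a date\n"
    for task, result in fan_out(demo, lambda t: f"done: {t}"):
        print(task, "->", result)
```

Threads suit this case because each subagent spends most of its time waiting on the model endpoint rather than computing.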
| ## Benchmark: the loop is the product | |
| The agentic loop is what makes a tiny model useful. On the same HumanEval-style suite | |
| (`bench/tasks.py`, 10 tasks, pass@1): | |
| <!-- BENCH_TABLE_START --> | |
| | System | Model | pass@1 | | |
| |--------|-------|--------| | |
| | single-shot | fine-tuned **1.5B** | 50% | | |
| | **agentic loop** | fine-tuned **1.5B** | **70%** | | |
| | single-shot | granite4.1:3b | 90% | | |
| *The write→run→fix loop lifts the fine-tuned 1.5B from **50% → 70%** (+20 pts, or 5 → 7 of the 10 tasks): the | |
| loop, not raw model size, does the work. A larger model (granite 3B) scores higher | |
| single-shot, which is exactly why the router escalates only when the small tier can't | |
| verify. Measured with `bench/run.py` on the hal backend.* | |
| <!-- BENCH_TABLE_END --> | |
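With one attempt per task, pass@1 is simply the fraction of tasks solved. The sketch below reproduces the arithmetic behind the 1.5B rows. The per-task outcomes are placeholders: only the totals (5 of 10 and 7 of 10) follow from the reported percentages, and `pass_at_1` is an illustrative helper, not part of `bench/run.py`.

```python
from typing import List


def pass_at_1(results: List[bool]) -> float:
    """Fraction of tasks solved when each task gets exactly one attempt."""
    if not results:
        raise ValueError("need at least one task result")
    return sum(results) / len(results)


# Placeholder outcomes: which tasks passed is not stated in the README,
# only the totals implied by 50% and 70% on a 10-task suite.
single_shot = [True] * 5 + [False] * 5
agentic_loop = [True] * 7 + [False] * 3

gain_pts = round((pass_at_1(agentic_loop) - pass_at_1(single_shot)) * 100)

if __name__ == "__main__":
    print(f"single-shot: {pass_at_1(single_shot):.0%}")
    print(f"agentic loop: {pass_at_1(agentic_loop):.0%}")
    print(f"gain: +{gain_pts} pts")
```

On a 10-task suite each task is worth 10 points, so the 20-point gain corresponds to two additional tasks solved.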
| ## Under the hood | |
| ``` | |
| Gradio UI → smolcode-core / LiteForge (Rust/PyO3) → OpenAI-compatible endpoint | |
| specialty router + agent loop | |
| tools: write_file, read_file, run_python, run_tests | |
| served by Ollama / llama.cpp (local, HAL LAN, or public Modal+Ollama) | |
| ``` | |
| The public demo serves the whole specialist matrix + Granite ladder from one | |
| Modal container running Ollama, so the specialty router escalates for real in the | |
| cloud: same engine, just an endpoint change. See | |
| [SPACE_DEPLOY.md](SPACE_DEPLOY.md) option (c). | |
| There's also a full terminal agent (`smolcode-cli`, a Rust ratatui TUI) and a | |
| Replit/Lovable-style app builder (`smolbuilder.py`) on the same engine. | |
| - **Code:** https://github.com/seanpoyner/smolcode | |
| - **Model:** https://huggingface.co/seanpoyner/smolcode-coder-1.5b-tools | |
| - **Engine:** https://github.com/seanpoyner/liteforge | |
| - **App builder companion:** https://huggingface.co/spaces/seanpoyner/smolbuilder | |
| ## Demo video | |
| <video controls src="https://huggingface.co/spaces/seanpoyner/smolcode/resolve/main/demo.mp4"></video> | |
| [▶️ Watch the demo](https://huggingface.co/spaces/seanpoyner/smolcode/resolve/main/demo.mp4): the agent writes code, runs it, fixes the failing test, and shows the router tier that solved it. | |
| ## Share | |
| > Most coding tasks don't need a giant model. **smolcode** is a coding agent that runs entirely on a *small local model*: it writes the code, runs it, reads the real error, and fixes itself until tests pass. Fine-tuned **1.5B** coder; the router escalates a tier only when needed (all ≤32B). Less compute, same result. | |
| > | |
| > Built for the #BuildSmall hackathon with @huggingface + @Gradio. 🦀 Rust core. | |
| > ▶️ https://huggingface.co/spaces/seanpoyner/smolcode | |
| > #SmallModels #LocalAI #Gradio #BuildSmall | |
| 📣 **Posted on LinkedIn:** https://www.linkedin.com/posts/sean-poyner_buildsmall-smallmodels-localai-share-7472421438109650944-bQGy/ | |