title: smolcode
emoji: 🤖
colorFrom: purple
colorTo: indigo
sdk: gradio
sdk_version: 5.50.0
python_version: '3.12'
app_file: app.py
pinned: false
license: apache-2.0
short_description: A tiny local model that writes code, runs it, and fixes it.
tags:
  - build-small-hackathon
  - agent
  - code-generation
  - gradio

smolcode 🤖

A tiny local model that writes code, runs it, and fixes it until it works.

smolcode is an agentic coding assistant built for small language models. Instead of autocompleting, it runs a plan → write → execute → repair loop: it writes a file, runs it in a sandbox, reads the real error, and iterates until a test passes, all on a model small enough to run on your own machine (a ≤4B model on a laptop, scaling up to 32B on a workstation). No cloud APIs.

Built for the Hugging Face × Gradio Build Small Hackathon.

Why it's a "Build Small" entry

  • Agentic on a 3B model. The loop, not the model size, does the work. A ≤4B model drives tool calls reliably enough to write, run, and self-correct code.
  • Local-first & private. Talks to any OpenAI-compatible endpoint (Ollama, llama.cpp). Nothing leaves your machine.
  • Specialty routing. A 2D router classifies tasks into 16 language/function families and escalates within each family's fine-tuned ladder before falling back to bigger Granite models.
  • Fine-tuned tiny coder. We fine-tuned Qwen2.5-Coder-1.5B to emit native tool calls so a ≤2B model can be the cheap entry tier. It is published at seanpoyner/smolcode-coder-1.5b-tools.
  • Rust core. Agent loop, tool execution, and tracing run through LiteForge and smolcode-core (Rust/PyO3). Gradio is the (required) shell; the brain is Rust.
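As a rough illustration of the specialty-routing bullet, here is a minimal Python sketch of "try the family ladder first, then fall back". It is not the actual router: the family names, tier names, the keyword-based `classify`, and the `attempt` callback are all hypothetical placeholders, and the real router is described above as a 2D classifier over 16 families.

```python
from typing import Callable, Optional

# Hypothetical ladders: task family -> specialist tiers, smallest first.
LADDERS: dict[str, list[str]] = {
    "python/function": ["specialist-tiny", "specialist-small"],
    "sql/query": ["sql-tiny"],
}
# Hypothetical general fallback tiers, tried after the family ladder.
FALLBACK: list[str] = ["general-small", "general-large"]


def classify(task: str) -> str:
    """Toy keyword rule standing in for the real router's classifier."""
    return "sql/query" if "select" in task.lower() else "python/function"


def route(task: str, attempt: Callable[[str, str], bool]) -> Optional[str]:
    """Return the first tier whose attempt verifies, or None if all fail."""
    tiers = LADDERS.get(classify(task), []) + FALLBACK
    for model in tiers:
        # attempt(model, task) is assumed to run the task and report
        # whether the result was verified (e.g. its test passed).
        if attempt(model, task):
            return model
    return None
```

The point of the ordering is cost: the cheapest tier that can verify its own output wins, and larger models are only consulted after the smaller ones have failed.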

How to use this Space

  1. Type a coding task, e.g. "write a function that validates an email and test it."
  2. Watch the agent trace stream live: write_file → run_python → (error) → fix → pass.
  3. The router badge shows which tier solved it and whether it's ✓ verified.
  4. Tick ⚡ fan out and enter several lines to run independent tasks as parallel subagents.
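The fan-out option in step 4 can be pictured as "one line, one subagent". The sketch below assumes a thread pool and a hypothetical `run_subagent` stand-in; how the Space actually schedules subagents is not stated here.

```python
from concurrent.futures import ThreadPoolExecutor


def run_subagent(task: str) -> str:
    """Hypothetical stand-in for one full agent run on a single task."""
    return f"done: {task}"


def fan_out(text: str, max_workers: int = 4) -> list[str]:
    """Split the input into non-empty lines and run each as its own task."""
    tasks = [line.strip() for line in text.splitlines() if line.strip()]
    if not tasks:
        return []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map returns results in the same order as the input tasks.
        return list(pool.map(run_subagent, tasks))
```

This only makes sense for tasks that are independent of each other, since no subagent sees another's files or results in this sketch.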

Benchmark: the loop is the product

The agentic loop is what makes a tiny model useful. On the same HumanEval-style suite (bench/tasks.py, 10 tasks, pass@1):

| System | Model | pass@1 |
| --- | --- | --- |
| single-shot | fine-tuned 1.5B | 50% |
| agentic loop | fine-tuned 1.5B | 70% |
| single-shot | granite4.1:3b | 90% |

The write→run→fix loop lifts the fine-tuned 1.5B from 50% to 70% (+20 points): at a fixed model size, the loop does the work. A larger model (granite4.1:3b) scores higher single-shot, which is exactly why the router escalates only when the small tier can't verify. Measured with bench/run.py on the hal backend.
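For readers who want the arithmetic spelled out: with 10 tasks and one attempt each, pass@1 is simply the share of tasks that pass. The pass counts below are inferred from the reported percentages (50% of 10 is 5, 70% of 10 is 7); which individual tasks passed is not given, so the lists are placeholders.

```python
def pass_at_1(results: list[bool]) -> float:
    """Percentage of tasks that passed on their single attempt."""
    return 100 * sum(results) / len(results)


# Placeholder per-task outcomes; only the pass counts come from the table.
single_shot = [True] * 5 + [False] * 5
agentic_loop = [True] * 7 + [False] * 3

gain = pass_at_1(agentic_loop) - pass_at_1(single_shot)
print(f"{pass_at_1(single_shot):.0f}% -> {pass_at_1(agentic_loop):.0f}% (+{gain:.0f} pts)")
```

On a 10-task suite each task is worth 10 points, so the +20 corresponds to two additional tasks solved.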

Under the hood

Gradio UI  →  smolcode-core / LiteForge (Rust/PyO3)  →  OpenAI-compatible endpoint
                  specialty router + agent loop
                  tools: write_file, read_file, run_python, run_tests
                  served by Ollama / llama.cpp (local, HAL LAN, or public Modal+Ollama)
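The agent loop in the diagram can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the Rust implementation: `generate` is a hypothetical callback standing in for the model call, `fake_model` is a stub that returns a buggy draft first, and a plain subprocess is used in place of a real sandbox.

```python
import subprocess
import sys
import tempfile
from pathlib import Path
from typing import Callable, Optional


def run_python(path: Path, timeout: float = 10.0) -> tuple[bool, str]:
    """Run a script in a subprocess; return (passed, stderr text)."""
    try:
        proc = subprocess.run(
            [sys.executable, str(path)],
            capture_output=True, text=True, timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return False, "timeout"
    return proc.returncode == 0, proc.stderr


def agent_loop(task: str, generate: Callable[[str, str], str],
               max_steps: int = 3) -> Optional[str]:
    """Write, run, and repair until the script exits cleanly or steps run out."""
    error = ""
    with tempfile.TemporaryDirectory() as workdir:
        target = Path(workdir) / "solution.py"
        for _ in range(max_steps):
            code = generate(task, error)    # model call (hypothetical callback)
            target.write_text(code)         # the write_file step
            ok, error = run_python(target)  # the run_python step
            if ok:
                return code                 # verified: the script passed
    return None


def fake_model(task: str, error: str) -> str:
    """Stub model: the first draft is buggy; it is repaired once an error is fed back."""
    if not error:
        return "assert add(2, 2) == 4\n"  # fails with NameError: add is undefined
    return "def add(a, b):\n    return a + b\n\nassert add(2, 2) == 4\n"
```

Calling `agent_loop("add two numbers", fake_model)` fails on the first step, feeds the traceback back to the stub, and returns the repaired code on the second. Note that a subprocess isolates crashes and timeouts but does not restrict file or network access.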

The public demo serves the whole specialist matrix + Granite ladder from one Modal container running Ollama, so the specialty router escalates for real in the cloud: same engine, just an endpoint change. See SPACE_DEPLOY.md option (c).

There's also a full terminal agent (smolcode-cli, a Rust ratatui TUI) and a Replit/Lovable-style app builder (smolbuilder.py) on the same engine.

Demo video

▶️ Watch the demo: the agent writes code, runs it, fixes the failing test, and shows the router tier that solved it.

Share

Most coding tasks don't need a giant model. smolcode is a coding agent that runs entirely on a small local model: it writes the code, runs it, reads the real error, and fixes itself until tests pass. Fine-tuned 1.5B coder; the router escalates a tier only when needed (all ≤32B). Less compute, same result.

Built for the #BuildSmall hackathon with @huggingface + @Gradio. 🦀 Rust core. ▶️ https://huggingface.co/spaces/seanpoyner/smolcode #SmallModels #LocalAI #Gradio #BuildSmall

📣 Posted on LinkedIn: https://www.linkedin.com/posts/sean-poyner_buildsmall-smallmodels-localai-share-7472421438109650944-bQGy/