---
title: Router Control Room (ZeroGPU)
emoji: 🛰️
colorFrom: indigo
colorTo: purple
sdk: gradio
sdk_version: 5.49.1
app_file: app.py
pinned: false
license: mit
short_description: ZeroGPU UI for CourseGPT-Pro router checkpoints
---
# 🛰️ Router Control Room – ZeroGPU
This Space exposes the CourseGPT-Pro router checkpoints (Gemma3 27B + Qwen3 32B) with an opinionated Gradio UI. It runs entirely on ZeroGPU hardware using 8-bit loading so you can validate router JSON plans without paying for dedicated GPUs.
## ✨ What's Included
- **Router-specific prompt builder** – inject difficulty, tags, context, acceptance criteria, and additional guidance into the canonical router system prompt.
- **Two curated checkpoints** – `Router-Qwen3-32B-8bit` and `Router-Gemma3-27B-8bit`, both merged and quantized for ZeroGPU.
- **JSON extraction + validation** – output is parsed automatically and checked for the required router fields (`route_plan`, `todo_list`, `metrics`, etc.).
- **Raw output + prompt debug** – inspect the verbatim generation and the exact prompt string sent to the checkpoint.
- **One-click clear** – reset the UI between experiments without reloading models.
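The extraction + validation step can be sketched as follows. This is a minimal illustration, not the app's actual code; the required field names come from the description above, and the real app may check additional fields.

```python
import json
import re

# Required top-level fields per the README; the app may enforce more.
REQUIRED_FIELDS = ("route_plan", "todo_list", "metrics")

def extract_router_plan(raw: str) -> dict:
    """Pull the first JSON object out of raw model output and validate it."""
    match = re.search(r"\{.*\}", raw, re.DOTALL)
    if not match:
        raise ValueError("no JSON object found in model output")
    plan = json.loads(match.group(0))
    missing = [field for field in REQUIRED_FIELDS if field not in plan]
    if missing:
        raise ValueError(f"missing required fields: {missing}")
    return plan
```

When parsing fails, raising (rather than returning `None`) lets the UI surface the exact error in the validation panel.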
## 🚀 Workflow
1. Describe the user task / homework prompt in the main textbox.
2. Optionally provide context, acceptance criteria, and extra guidance.
3. Choose the difficulty tier, tags, model, and decoding parameters.
4. Click **Generate Router Plan**.
5. Review:
   - **Raw Model Output** – plain text returned by the LLM.
   - **Parsed Router Plan** – JSON tree extracted from the output.
   - **Validation Panel** – confirms whether all required fields are present.
   - **Full Prompt** – copy/paste for repro or benchmarking.
If JSON parsing fails, the validation panel will surface the error so you can tweak decoding parameters or the prompt.
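The prompt assembly behind steps 1–3 might look like the sketch below. The section labels and the `ROUTER_SYSTEM_PROMPT` text are illustrative assumptions, not the Space's exact strings.

```python
# Placeholder system prompt; the canonical router prompt lives in the app.
ROUTER_SYSTEM_PROMPT = "You are the CourseGPT-Pro router. Respond with a JSON plan."

def build_prompt(task, difficulty="standard", tags=(), context="",
                 acceptance_criteria="", guidance=""):
    """Assemble the full prompt string from the UI inputs (illustrative)."""
    parts = [ROUTER_SYSTEM_PROMPT, f"Difficulty: {difficulty}"]
    if tags:
        parts.append("Tags: " + ", ".join(tags))
    if context:
        parts.append("Context:\n" + context)
    if acceptance_criteria:
        parts.append("Acceptance criteria:\n" + acceptance_criteria)
    if guidance:
        parts.append("Additional guidance:\n" + guidance)
    parts.append("Task:\n" + task)
    return "\n\n".join(parts)
```

The **Full Prompt** panel shows the real equivalent of this string, so anything you build locally can be diffed against what the Space actually sent.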
## 🧠 Supported Models
| Name | Base | Notes |
|---|---|---|
| `Router-Qwen3-32B-8bit` | Qwen3 32B | Best overall acceptance on CourseGPT-Pro benchmarks. |
| `Router-Gemma3-27B-8bit` | Gemma3 27B | Slightly smaller, tends to favour math-first plans. |
Both checkpoints are merged + quantized in the `Alovestocode` namespace and require an `HF_TOKEN` with read access.
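Loading a checkpoint outside the Space might look like this sketch. It assumes the repo ids are `Alovestocode/<checkpoint name>` (check the namespace on the Hub for the exact names) and mirrors the app's 8-bit-first behaviour.

```python
import os

NAMESPACE = "Alovestocode"  # assumption: checkpoints live under this namespace

def model_repo(name: str) -> str:
    """Map a checkpoint name from the table above to a Hub repo id."""
    return f"{NAMESPACE}/{name}"

def load_router(name: str = "Router-Qwen3-32B-8bit"):
    # Heavy imports kept local so the helper above stays importable anywhere.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

    repo = model_repo(name)
    token = os.environ["HF_TOKEN"]  # read access required
    tokenizer = AutoTokenizer.from_pretrained(repo, token=token)
    try:
        # 8-bit first, as the app does (requires bitsandbytes).
        model = AutoModelForCausalLM.from_pretrained(
            repo,
            quantization_config=BitsAndBytesConfig(load_in_8bit=True),
            device_map="auto",
            token=token,
        )
    except Exception:
        # Fall back through full-precision dtypes, widest-compatible last.
        model = None
        for dtype in (torch.bfloat16, torch.float16, torch.float32):
            try:
                model = AutoModelForCausalLM.from_pretrained(
                    repo, torch_dtype=dtype, device_map="auto", token=token
                )
                break
            except Exception:
                continue
        if model is None:
            raise
    return tokenizer, model
```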
## ⚙️ Local Development
```bash
cd Milestone-6/router-agent/zero-gpu-space
python -m venv .venv && source .venv/bin/activate
pip install -r requirements.txt
export HF_TOKEN=hf_xxx
python app.py
```
## 📝 Notes
- The app always attempts 8-bit loading first (bitsandbytes). If that fails, it falls back to bf16/fp16/fp32.
- The UI enforces single-turn router generations; conversation history and web search are intentionally omitted to match the Milestone 6 deliverable.
- If you need to re-enable web search or more checkpoints, extend `MODELS` and adjust the prompt builder accordingly.
- Benchmarking: run `python Milestone-6/router-agent/tests/run_router_space_benchmark.py --space Alovestocode/ZeroGPU-LLM-Inference --limit 32` (requires `pip install gradio_client`) to call the Space, dump predictions, and evaluate against the Milestone 5 hard suite + thresholds.
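The benchmark script talks to the Space through `gradio_client`; a direct call can be sketched as below. The `api_name` and the input order are assumptions (so is `build_space_args`, a hypothetical helper), so inspect `client.view_api()` for the real signature before relying on this.

```python
def build_space_args(task: str, model: str = "Router-Qwen3-32B-8bit",
                     difficulty: str = "standard") -> dict:
    """Hypothetical helper: collect the UI inputs forwarded to the Space."""
    return {"task": task, "model": model, "difficulty": difficulty}

def generate_plan(task: str, **kwargs):
    from gradio_client import Client  # pip install gradio_client

    client = Client("Alovestocode/ZeroGPU-LLM-Inference")
    args = build_space_args(task, **kwargs)
    # api_name "/generate" is an assumption; check client.view_api().
    return client.predict(*args.values(), api_name="/generate")
```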