# Case Zero - Hackathon Compliance

Built for the **Build Small Hackathon** ("Small models, big adventure").

Case Zero is a **Gradio application**: the whole app is one `gradio.Server` (Gradio 6
"Server mode" - a FastAPI subclass launched through Gradio, with Gradio API endpoints
registered via `@server.api`). It is deployed as a **Hugging Face Space** on **CPU** (no
GPU). It ships via the Docker SDK purely so llama.cpp compiles on a stable base image - the
app itself is Gradio, served end to end by `gradio.Server`.

## Core requirements

| Requirement | Status |
|---|---|
| Total model params <= 32B | ✅ ~1.6B (see budget below) |
| Built in Gradio | ✅ one `gradio.Server`, with `@server.api` endpoints (`new_case`, `interrogate`) |
| Hosted as a Hugging Face Space | ✅ `build-small-hackathon/case0` (Docker SDK, `app_port: 7860`) |
| Demo video | Pending - to record (warmup -> interrogate -> present evidence -> alibi cracks -> accuse -> verdict) |
| Social-media post | Pending - to post |

## Parameter budget (<= 32B total)

Every model is open-weights and self-run. **No third-party AI service is ever called.**

| Component | Model | Open? | Params | Runs |
|---|---|---|---|---|
| Reasoning + dialogue (the whole game) | Qwen2.5-1.5B-Instruct (Q4_K_M GGUF) | Apache-2.0 | **1.5B** | in-process llama.cpp on CPU |
| Suspect voices | Supertonic (ONNX) | open | ~0.1B | local ONNX Runtime (CPU) |
| Portraits / scenes / props | Procedural canvas - no model | n/a | 0B | client-side |
| Music + SFX | Pre-made / procedural audio - no model | n/a | 0B | playback only |
| Embeddings / vector RAG | none | n/a | 0B | - |

**Total runtime parameters: ~1.6B** - far under 32B (and under 4B, eligible for the
**Tiny Titan** special award).
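
The total above is simple addition over the table rows. A minimal sketch of that check, using the figures from the table (the dictionary name and the two cap constants are illustrative names, not part of the project):

```python
# Parameter counts in billions, copied from the budget table.
# Components that use no model contribute 0.
BUDGET_B = {
    "Qwen2.5-1.5B-Instruct (Q4_K_M GGUF)": 1.5,
    "Supertonic (ONNX)": 0.1,  # listed as ~0.1B, so the total is approximate
    "Procedural canvas art": 0.0,
    "Music + SFX": 0.0,
    "Embeddings / vector RAG": 0.0,
}

HACKATHON_CAP_B = 32.0   # core requirement: total params <= 32B
TINY_TITAN_CAP_B = 4.0   # Tiny Titan threshold: under 4B

# Round to one decimal to match the precision of the table.
total_b = round(sum(BUDGET_B.values()), 1)

print(f"total: ~{total_b}B")
print("within hackathon cap:", total_b <= HACKATHON_CAP_B)
print("under Tiny Titan threshold:", total_b < TINY_TITAN_CAP_B)
```

Quantization (Q4_K_M) changes the size of the file on disk, not the parameter count, so the 1.5B figure is used as-is.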

## Merit badges

### Earned by the build (verifiable on the Space)

- **Off the Grid** - *"No cloud APIs. The whole thing runs on the model in front of you."*
  The LLM is in-process llama.cpp; the voices are a local ONNX model; the pixel art is
  rendered client-side on canvas; the music is a bundled CC-BY track. The open weights are
  baked into the Docker image at build time, so the running container makes **no AI network
  calls at all**. Proof: `python scripts/net_audit.py` runs a full playthrough under a
  socket guard and asserts **zero non-loopback connections**. ✅
- **Llama Champion** - *"Your model runs through the llama.cpp runtime."* The LLM runs
  through `llama-cpp-python` (in-process, on the CPU) - no server, no GPU, no remote
  endpoint. ✅
- **Off-Brand** - *"A custom frontend that pushes past the default Gradio look."* The front
  end is **not** stock Gradio. It is a hand-built **pixel-art noir SPA (Preact + Vite,
  TypeScript)** - 12 screens, a custom pixel design system (self-hosted Silkscreen /
  Pixelify Sans fonts, beveled 9-slice panels, inventory-slot evidence cards, a ruled-paper
  dossier with page-flips), a draggable corkboard, a live interrogation stage with a
  voiced suspect, procedural canvas art and rain FX, and a full client audio layer. The
  built bundle is served as static files by the same `gradio.Server` that exposes the
  `/api` routes - one process, no separate frontend host. ✅

### Targeted / in progress

- **Field Notes** - *"Write a blog post or report about your project."* Draft in
  [`docs/FIELD_NOTES.md`](docs/FIELD_NOTES.md) - to be published on the Hub.
- **Sharing is Caring** - *"You shared your agent trace on the Hub for everyone to learn
  from."* A captured interrogation/generation trace to be uploaded to the Hub.
- **Well-Tuned** - *"Your app uses a fine-tuned model you've published on Hugging Face."*
  Not yet - the game runs on stock Qwen2.5-1.5B. Would require fine-tuning and publishing a
  model; out of scope for this submission unless pursued separately.

## Zero cloud AI APIs

- **No OpenAI, Anthropic, Google, ElevenLabs, Higgsfield, Midjourney, or any other hosted
  AI API is ever called** - not for text, not for voice, not for images.
- The LLM is the in-process llama.cpp runtime. The voices are a local ONNX model. The pixel
  art is procedural canvas. The music is a bundled CC-BY track.
- The open Qwen GGUF and Supertonic ONNX are **baked into the Docker image at build time**,
  so the running container makes no AI network calls. `scripts/net_audit.py` proves zero
  non-loopback connections during a full playthrough.
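
The socket-guard idea behind the audit can be sketched in a few lines of standard-library Python. This is not the contents of `scripts/net_audit.py`; it is an assumed, minimal illustration in which `socket_guard` and `is_loopback` are hypothetical names, and only `socket.socket.connect` is patched (a real audit would likely also cover `connect_ex` and DNS lookups):

```python
import socket
from contextlib import contextmanager
from ipaddress import ip_address


def is_loopback(host: str) -> bool:
    """True for 'localhost' and for literal loopback IPs (127.0.0.0/8, ::1)."""
    if host == "localhost":
        return True
    try:
        return ip_address(host).is_loopback
    except ValueError:
        # Any other hostname is treated as non-loopback.
        return False


@contextmanager
def socket_guard():
    """Record and block every non-loopback connect() made inside the block."""
    violations = []
    real_connect = socket.socket.connect

    def guarded_connect(self, address):
        # AF_INET / AF_INET6 addresses are tuples whose first item is the host;
        # Unix-domain socket paths (str/bytes) pass straight through.
        if isinstance(address, tuple) and not is_loopback(str(address[0])):
            violations.append(address)
            raise ConnectionRefusedError(f"blocked non-loopback connect: {address!r}")
        return real_connect(self, address)

    socket.socket.connect = guarded_connect
    try:
        yield violations
    finally:
        socket.socket.connect = real_connect


# Usage: wrap the code under audit, then assert on the recorded violations.
with socket_guard() as seen:
    probe = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        probe.connect(("203.0.113.1", 80))  # documentation-only address, blocked before any I/O
    except ConnectionRefusedError:
        pass
    finally:
        probe.close()

print("non-loopback attempts:", seen)
```

In an audit, the body of the `with` block would be the full playthrough and the final check would be that the violations list is empty.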

## Anti-cheat / fairness (why the game is solvable and the win is earned)

- The sealed solution (killer, true motive, key evidence) is **never sent to the client**
  pre-verdict; it is read only inside `/api/run/{runId}/accuse`. Verified by anti-leak tests.
- Suspicion, evidence reactions, and the verdict are **server-authoritative** - the client
  only displays them.
- Suspects **never confess**: the win is registered only when the player accuses correctly,
  so the outcome does not depend on anything a suspect says (a jailbroken "just tell me who
  did it" earns nothing).
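
One way to picture the sealed-solution rule is a server-side record whose client view is built from an explicit allow-list, with the solution read in exactly one method. The sketch below is an assumption-laden illustration, not the project's code: `SealedSolution`, `CaseRun`, `public_view`, and the post-verdict reveal are all hypothetical.

```python
import json
from dataclasses import dataclass, field


@dataclass(frozen=True)
class SealedSolution:
    killer: str
    motive: str
    key_evidence: str


@dataclass
class CaseRun:
    run_id: str
    suspects: list[str]
    solution: SealedSolution  # stays on the server until the verdict
    suspicion: dict[str, int] = field(default_factory=dict)
    closed: bool = False

    def public_view(self) -> dict:
        """Client-visible state: an allow-list that never touches `solution`."""
        return {
            "run_id": self.run_id,
            "suspects": list(self.suspects),
            "suspicion": dict(self.suspicion),
            "closed": self.closed,
        }

    def accuse(self, suspect: str) -> dict:
        """The only place the solution is read; the verdict is decided here."""
        if self.closed:
            raise ValueError("case already closed")
        self.closed = True
        return {
            "correct": suspect == self.solution.killer,
            # Assumed: the solution is revealed once the verdict is in.
            "killer": self.solution.killer,
            "motive": self.solution.motive,
            "key_evidence": self.solution.key_evidence,
        }


# An anti-leak check in the same spirit as the tests mentioned above.
run = CaseRun("demo", ["Ada", "Bo"], SealedSolution("Bo", "gambling debt", "torn ticket"))
payload = json.dumps(run.public_view())
assert "gambling debt" not in payload and "torn ticket" not in payload
print(run.accuse("Ada")["correct"])
```

Because the win flag is computed from the accusation alone, nothing in the generated dialogue can set it.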

## Submission checklist

- [x] Gradio app on a Hugging Face Space (CPU)
- [x] <= 32B total params (~1.6B)
- [x] Open-weights, self-run models only - zero cloud AI APIs
- [x] Custom (non-default) UI - pixel-art Preact SPA via `gradio.Server`
- [x] Off the Grid proof (`scripts/net_audit.py`)
- [ ] Short demo video
- [ ] Social-media post