---
title: Case Zero
emoji: 🕵️
colorFrom: indigo
colorTo: yellow
sdk: docker
app_port: 7860
pinned: true
license: apache-2.0
models:
- Qwen/Qwen2.5-1.5B-Instruct
tags:
- build-small-hackathon
- llama-cpp
- tiny-titan
- detective-game
- text-generation
- tts
---
# 🕵️ Case Zero: the AI *is* the detective game
**A brand-new murder mystery, written and acted by a 1.5B model, every single time.**
No scripted cases. No content library. A single small local model invents the whole
thing (the victim, the suspects, their secrets and motives, the timeline, the murder
weapon, the evidence, and the one who did it), then **role-plays every suspect live**.
They remember what you asked. They lie to your face. And when you slap down the right
piece of evidence, you watch the lie **crack in real time**.
> Interrogate. Investigate. Accuse. One of them is guilty. Prove it.
## ✨ The moment that sells it
Search the rooms, find a clue that contradicts a suspect's alibi, **present it**, and
their story falls apart on screen: stress spikes, the alibi breaks, the truth leaks.
Then name the killer, cite your proof, and get a scored verdict with a "Director's Cut"
walkthrough of how the crime really went down.
## 🧠 How it works
| Layer | What it does |
|---|---|
| **Model**: Qwen2.5-1.5B-Instruct (GGUF) | The whole game. Runs in-process on the CPU through **llama.cpp** (`llama-cpp-python`): no server, no GPU, no remote endpoint. |
| **Generation** | The model authors every case as JSON; deterministic Python only wires the *structure* (who's guilty, who was where) so the mystery is always solvable. |
| **Solver** | A fairness referee: single culprit, a breakable alibi, every innocent cleared, and a discoverability gate so the key clue is always findable in play. |
| **Director** | Whether a lie gets caught is decided by **ground truth, not the model**, so the win condition is immune to prose (a jailbroken "just tell me who did it" earns nothing). |
| **Voice**: Supertonic | Each suspect gets a distinct, gender-matched on-device voice, synthesized **sentence-by-sentence as the reply streams**. |
| **Art** | Procedural pixel-art portraits, rooms, and evidence, rendered **client-side on canvas** at one integer-scaled density (so the server spends ~0 CPU on visuals). |
| **UI** | A custom **pixel-art noir SPA (Preact)**, 12 screens, served **100% through `gradio.Server`** (Gradio 6 "Server mode"): the built bundle as static files plus the JSON/SSE `/api` routes, all in one process. No separate frontend host. |
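
The Voice row's sentence-by-sentence synthesis needs something that cuts a streaming reply into complete sentences as tokens arrive. The sketch below is one minimal way to do that with the standard library. It is not the project's actual code: `stream_sentences` and the regex are hypothetical, and the naive sentence rule will also split after abbreviations such as "Mr.".

```python
import re
from typing import Iterable, Iterator

# Assumed sentence boundary: ., ! or ? (possibly repeated), optional closing
# quote or bracket, then whitespace. Hypothetical rule, not taken from the project.
_SENTENCE_END = re.compile(r'[.!?]+["\')\]]*\s+')


def stream_sentences(tokens: Iterable[str]) -> Iterator[str]:
    """Yield each complete sentence as soon as the token stream contains it."""
    buffer = ""
    for token in tokens:
        buffer += token
        # A single token may complete more than one sentence, so loop.
        while True:
            match = _SENTENCE_END.search(buffer)
            if match is None:
                break
            sentence = buffer[:match.end()].strip()
            buffer = buffer[match.end():]
            if sentence:
                yield sentence
    # Flush whatever is left when the stream ends.
    tail = buffer.strip()
    if tail:
        yield tail
```

Each yielded sentence could be handed to the TTS engine while the model keeps generating, so audio starts before the full reply is finished.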
The model does all the creative work. Deterministic code is only guardrails and a
reliability layer; it never writes story, character, or dialogue.
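
To make the guardrail idea concrete, here is a minimal sketch of a fairness check and a ground-truth lie check. It is an illustration under simplifying assumptions, not the project's implementation: every name (`Suspect`, `Clue`, `Case`, `fairness_problems`, `lie_is_caught`) is hypothetical, alibis are reduced to a single room, and an innocent counts as "cleared" when their true location is not the crime room.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Suspect:
    name: str
    claimed_room: str   # the alibi the suspect states
    actual_room: str    # ground truth, fixed by deterministic code
    guilty: bool = False


@dataclass
class Clue:
    clue_id: str
    found_in: str       # room the player must search to find the clue
    places: str         # name of the suspect the clue locates
    room: str           # room the clue proves that suspect was in


@dataclass
class Case:
    crime_room: str
    suspects: List[Suspect]
    clues: List[Clue]
    searchable_rooms: Set[str] = field(default_factory=set)


def contradicts(clue: Clue, suspect: Suspect) -> bool:
    """True when the clue proves the suspect's true location and it differs from the alibi."""
    return (clue.places == suspect.name
            and clue.room == suspect.actual_room
            and clue.room != suspect.claimed_room)


def fairness_problems(case: Case) -> List[str]:
    """Return reasons the case is unfair; an empty list means it passes."""
    culprits = [s for s in case.suspects if s.guilty]
    if len(culprits) != 1:
        return [f"expected exactly one culprit, found {len(culprits)}"]
    killer = culprits[0]
    problems = []
    breaking = [c for c in case.clues if contradicts(c, killer)]
    if not breaking:
        problems.append("no clue contradicts the culprit's alibi")
    elif not any(c.found_in in case.searchable_rooms for c in breaking):
        problems.append("the alibi-breaking clue is not in a searchable room")
    for s in case.suspects:
        if not s.guilty and s.actual_room == case.crime_room:
            problems.append(f"innocent {s.name} cannot be cleared")
    return problems


def lie_is_caught(case: Case, suspect_name: str, clue_id: str) -> bool:
    """Decide from ground truth alone whether presenting a clue breaks an alibi."""
    suspect = next(s for s in case.suspects if s.name == suspect_name)
    clue = next(c for c in case.clues if c.clue_id == clue_id)
    return contradicts(clue, suspect)
```

Because `lie_is_caught` never reads model output, nothing a player types into the chat can change its result; in a design like this, the model would only be asked to act out a crack that the deterministic check has already confirmed.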
## 🏆 Built for the Build Small Hackathon
- **Tiny Titan (≤4B):** the entire game runs on **Qwen2.5-1.5B**, about 1.6B total runtime
params (LLM + Supertonic), far under the 32B cap.
- **Llama Champion:** the model runs through the **llama.cpp** runtime, in-process, with no
server and no remote endpoint.
- **Off-Brand:** a fully custom pixel-art frontend, served through `gradio.Server`.
- All models are **open-weights and self-run**. No third-party AI APIs are ever called.
See [COMPLIANCE.md](COMPLIANCE.md) for the full parameter budget and badge details.
## ▶️ Run it locally
```bash
# 1. backend deps + open weights
python -m venv .venv && .venv/Scripts/pip install -r requirements.txt # Windows; on macOS/Linux use .venv/bin/pip
python scripts/fetch_models.py # one-time: fetch the open GGUF + Supertonic
# 2. build the pixel-art frontend bundle (served by gradio.Server from web/dist)
cd web && npm install && npm run build && cd ..
# 3. run, then open http://127.0.0.1:7860
python app.py
```
The game runs entirely on the CPU β€” laptop or Space, same code, no GPU required.
(In the Docker/Space build both steps happen automatically: a Node stage builds the
bundle and the Python stage compiles llama.cpp and bakes the weights.)
## 🙏 Credits
- **LLM:** [Qwen2.5-1.5B-Instruct](https://huggingface.co/Qwen/Qwen2.5-1.5B-Instruct) (Apache-2.0), via llama.cpp.
- **Voices:** Supertonic on-device TTS.
- **Music:** *"Backbay Lounge"* by Kevin MacLeod (incompetech.com), licensed under
[Creative Commons Attribution 4.0](https://creativecommons.org/licenses/by/4.0/).
- **Fonts:** Silkscreen & Pixelify Sans (SIL Open Font License), self-hosted.
- Pixel art and UI sound effects: procedurally generated.