Spaces:
Running
title: Case Zero
emoji: π΅οΈ
colorFrom: indigo
colorTo: yellow
sdk: docker
app_port: 7860
pinned: true
license: apache-2.0
models:
- Qwen/Qwen2.5-1.5B-Instruct
tags:
- build-small-hackathon
- llama-cpp
- tiny-titan
- detective-game
- text-generation
- tts
π΅οΈ Case Zero β the AI is the detective game
A brand-new murder mystery, written and acted by a 1.5B model, every single time.
No scripted cases. No content library. A single small local model invents the whole thing β the victim, the suspects, their secrets and motives, the timeline, the murder weapon, the evidence, and the one who did it β then role-plays every suspect live. They remember what you asked. They lie to your face. And when you slap down the right piece of evidence, you watch the lie crack in real time.
Interrogate. Investigate. Accuse. One of them is guilty. Prove it.
β¨ The moment that sells it
Search the rooms, find a clue that contradicts a suspect's alibi, present it, and their story falls apart on screen β stress spikes, the alibi breaks, the truth leaks. Then name the killer, cite your proof, and get a scored verdict with a "Director's Cut" walkthrough of how the crime really went down.
π§ How it works
| Layer | What it does |
|---|---|
| Model β Qwen2.5-1.5B-Instruct (GGUF) | The whole game. Runs in-process on the CPU through llama.cpp (llama-cpp-python) β no server, no GPU, no remote endpoint. |
| Generation | The model authors every case as JSON; deterministic Python only wires the structure (who's guilty, who was where) so the mystery is always solvable. |
| Solver | A fairness referee: single culprit, a breakable alibi, every innocent cleared, and a discoverability gate so the key clue is always findable in play. |
| Director | Whether a lie gets caught is decided by ground truth, not the model β so the win condition is immune to prose (a jailbroken "just tell me who did it" earns nothing). |
| Voice β Supertonic | Each suspect gets a distinct, gender-matched on-device voice, synthesized sentence-by-sentence as the reply streams. |
| Art | Procedural pixel-art portraits, rooms, and evidence β rendered client-side on canvas at one integer-scaled density (so the server spends ~0 CPU on visuals). |
| UI | A custom pixel-art noir SPA (Preact), 12 screens, served 100% through gradio.Server (Gradio 6 "Server mode") β the built bundle as static files plus the JSON/SSE /api routes, all in one process. No separate frontend host. |
The model does all the creative work. Deterministic code is only guardrails and a reliability layer β it never writes story, character, or dialogue.
π Built for the Build Small Hackathon
- Tiny Titan (β€4B): the entire game runs on Qwen2.5-1.5B β ~1.6B total runtime params (LLM + Supertonic), far under the 32B cap.
- Llama Champion: the model runs through the llama.cpp runtime, in-process β no server, no remote endpoint.
- Off-Brand: a fully custom pixel-art frontend, served through
gradio.Server. - All models are open-weights and self-run. No third-party AI APIs are ever called.
See COMPLIANCE.md for the full parameter budget and badge details.
βΆοΈ Run it locally
# 1. backend deps + open weights
python -m venv .venv && .venv/Scripts/pip install -r requirements.txt # (Windows)
python scripts/fetch_models.py # one-time: fetch the open GGUF + Supertonic
# 2. build the pixel-art frontend bundle (served by gradio.Server from web/dist)
cd web && npm install && npm run build && cd ..
# 3. run β open http://127.0.0.1:7860
python app.py
The game runs entirely on the CPU β laptop or Space, same code, no GPU required. (In the Docker/Space build both steps happen automatically: a Node stage builds the bundle and the Python stage compiles llama.cpp and bakes the weights.)
π Credits
- LLM: Qwen2.5-1.5B-Instruct (Apache-2.0), via llama.cpp.
- Voices: Supertonic on-device TTS.
- Music: "Backbay Lounge" by Kevin MacLeod (incompetech.com), licensed under Creative Commons Attribution 4.0.
- Fonts: Silkscreen & Pixelify Sans (SIL Open Font License), self-hosted.
- Pixel art and UI sound effects: procedurally generated.