---
title: Case Zero
emoji: 🕵️
colorFrom: indigo
colorTo: yellow
sdk: docker
app_port: 7860
pinned: true
license: apache-2.0
models:
  - Qwen/Qwen2.5-1.5B-Instruct
tags:
  - build-small-hackathon
  - llama-cpp
  - tiny-titan
  - detective-game
  - text-generation
  - tts
---

# 🕵️ Case Zero: the AI *is* the detective game

**A brand-new murder mystery, written and acted by a 1.5B model, every single time.**

No scripted cases. No content library. A single small local model invents the whole
thing (the victim, the suspects, their secrets and motives, the timeline, the murder
weapon, the evidence, and the one who did it), then **role-plays every suspect live**.
They remember what you asked. They lie to your face. And when you slap down the right
piece of evidence, you watch the lie **crack in real time**.

> Interrogate. Investigate. Accuse. One of them is guilty. Prove it.

## ✨ The moment that sells it

Search the rooms, find a clue that contradicts a suspect's alibi, **present it**, and
their story falls apart on screen: stress spikes, the alibi breaks, the truth leaks.
Then name the killer, cite your proof, and get a scored verdict with a "Director's Cut"
walkthrough of how the crime really went down.

## 🧠 How it works

| Layer | What it does |
|---|---|
| **Model**: Qwen2.5-1.5B-Instruct (GGUF) | The whole game. Runs in-process on the CPU through **llama.cpp** (`llama-cpp-python`): no server, no GPU, no remote endpoint. |
| **Generation** | The model authors every case as JSON; deterministic Python only wires the *structure* (who's guilty, who was where) so the mystery is always solvable. |
| **Solver** | A fairness referee: single culprit, a breakable alibi, every innocent cleared, and a discoverability gate so the key clue is always findable in play. |
| **Director** | Whether a lie gets caught is decided by **ground truth, not the model**, so the win condition is immune to prose (a jailbroken "just tell me who did it" earns nothing). |
| **Voice**: Supertonic | Each suspect gets a distinct, gender-matched on-device voice, synthesized **sentence-by-sentence as the reply streams**. |
| **Art** | Procedural pixel-art portraits, rooms, and evidence, rendered **client-side on canvas** at one integer-scaled density (so the server spends ~0 CPU on visuals). |
| **UI** | A custom **pixel-art noir SPA (Preact)**, 12 screens, served **100% through `gradio.Server`** (Gradio 6 "Server mode"): the built bundle as static files plus the JSON/SSE `/api` routes, all in one process. No separate frontend host. |
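
The Voice row describes synthesizing speech sentence by sentence while the reply is still streaming. A minimal sketch of that buffering idea follows; it is not the project's actual code. The function name `stream_sentences` and the punctuation-based sentence rule are assumptions for illustration, and the TTS call itself is left out.

```python
import re
from typing import Iterable, Iterator

# Assumed rule: a sentence ends at ., !, or ? followed by whitespace.
# This is deliberately naive (it would also split after "Dr. ").
_SENTENCE_END = re.compile(r"([.!?]+)\s+")


def stream_sentences(chunks: Iterable[str]) -> Iterator[str]:
    """Yield each complete sentence as soon as it appears in a text stream."""
    buffer = ""
    for chunk in chunks:
        buffer += chunk
        while True:
            match = _SENTENCE_END.search(buffer)
            if match is None:
                break  # no finished sentence yet; wait for more text
            yield buffer[: match.end(1)].strip()
            buffer = buffer[match.end():]
    tail = buffer.strip()
    if tail:
        yield tail  # flush whatever is left when the stream ends


if __name__ == "__main__":
    reply = ["I was in the", " library. Ask", " anyone! Why", " do you care?"]
    for sentence in stream_sentences(reply):
        print(sentence)  # each sentence could be handed to a TTS engine here
```

Because sentences are emitted as soon as they close, audio for the first sentence can start while the model is still generating the rest of the reply.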

The model does all the creative work. Deterministic code is only guardrails and a
reliability layer; it never writes story, character, or dialogue.
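
To make the guardrail idea concrete, here is a hedged sketch of what a fairness check and a ground-truth lie check could look like. Every name in it (`Suspect`, `Clue`, `Case`, `is_fair`, `lie_is_caught`) and the room-based alibi model are hypothetical; the real Solver and Director may represent cases quite differently.

```python
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass
class Suspect:
    name: str
    guilty: bool
    claimed_room: str  # the alibi the suspect gives
    actual_room: str   # ground truth, fixed by deterministic code


@dataclass
class Clue:
    id: str
    room: str                  # room the player must search to find it
    contradicts: Optional[str]  # suspect whose alibi this clue breaks, if any


@dataclass
class Case:
    suspects: List[Suspect]
    clues: List[Clue]
    searchable_rooms: Set[str]


def is_fair(case: Case) -> bool:
    """Assumed fairness rules: one culprit, a breakable alibi, innocents cleared,
    and at least one alibi-breaking clue in a room the player can search."""
    culprits = [s for s in case.suspects if s.guilty]
    if len(culprits) != 1:
        return False
    culprit = culprits[0]
    if culprit.claimed_room == culprit.actual_room:
        return False  # the culprit's alibi holds, so it cannot be broken
    if any(s.claimed_room != s.actual_room for s in case.suspects if not s.guilty):
        return False  # an innocent could not be cleared
    return any(
        c.contradicts == culprit.name and c.room in case.searchable_rooms
        for c in case.clues
    )


def lie_is_caught(case: Case, suspect: str, clue_id: str, found: Set[str]) -> bool:
    """Decide from ground truth only; model prose never enters this check."""
    if clue_id not in found:
        return False  # the player has not actually discovered this clue
    clue = next((c for c in case.clues if c.id == clue_id), None)
    return clue is not None and clue.contradicts == suspect
```

Since `lie_is_caught` reads only structured state, a suspect that is talked into "confessing" in dialogue changes nothing about whether the player has actually proven the case.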

## 🏆 Built for the Build Small Hackathon

- **Tiny Titan (≤4B):** the entire game runs on **Qwen2.5-1.5B**, about 1.6B total runtime
  params (LLM + Supertonic), far under the 32B cap.
- **Llama Champion:** the model runs through the **llama.cpp** runtime, in-process: no
  server, no remote endpoint.
- **Off-Brand:** a fully custom pixel-art frontend, served through `gradio.Server`.
- All models are **open-weights and self-run**. No third-party AI APIs are ever called.

See [COMPLIANCE.md](COMPLIANCE.md) for the full parameter budget and badge details.

## ▶️ Run it locally

```bash
# 1. backend deps + open weights
python -m venv .venv && .venv/Scripts/pip install -r requirements.txt   # Windows
# macOS/Linux: python -m venv .venv && .venv/bin/pip install -r requirements.txt
python scripts/fetch_models.py     # one-time: fetch the open GGUF + Supertonic

# 2. build the pixel-art frontend bundle (served by gradio.Server from web/dist)
cd web && npm install && npm run build && cd ..

# 3. run, then open http://127.0.0.1:7860
python app.py
```

The game runs entirely on the CPU: laptop or Space, same code, no GPU required.
(In the Docker/Space build, steps 1 and 2 happen automatically: a Node stage builds the
bundle and the Python stage compiles llama.cpp and bakes the weights.)

## 🙏 Credits

- **LLM:** [Qwen2.5-1.5B-Instruct](https://huggingface.co/Qwen/Qwen2.5-1.5B-Instruct) (Apache-2.0), via llama.cpp.
- **Voices:** Supertonic on-device TTS.
- **Music:** *"Backbay Lounge"* by Kevin MacLeod (incompetech.com), licensed under
  [Creative Commons Attribution 4.0](https://creativecommons.org/licenses/by/4.0/).
- **Fonts:** Silkscreen & Pixelify Sans (SIL Open Font License), self-hosted.
- Pixel art and UI sound effects: procedurally generated.