	---
	title: Case Zero
	emoji: 🕵️
	colorFrom: indigo
	colorTo: yellow
	sdk: docker
	app_port: 7860
	pinned: true
	license: apache-2.0
	models:
	- Qwen/Qwen2.5-1.5B-Instruct
	tags:
	- build-small-hackathon
	- llama-cpp
	- tiny-titan
	- detective-game
	- text-generation
	- tts
	---

	# 🕵️ Case Zero — the AI is the detective game

	A brand-new murder mystery, written and acted by a 1.5B model, every single time.

	No scripted cases. No content library. A single small local model invents the whole
	thing — the victim, the suspects, their secrets and motives, the timeline, the murder
	weapon, the evidence, and the one who did it — then role-plays every suspect live.
	They remember what you asked. They lie to your face. And when you slap down the right
	piece of evidence, you watch the lie crack in real time.

	> Interrogate. Investigate. Accuse. One of them is guilty. Prove it.

	## ✨ The moment that sells it

	Search the rooms, find a clue that contradicts a suspect's alibi, present it, and
	their story falls apart on screen — stress spikes, the alibi breaks, the truth leaks.
	Then name the killer, cite your proof, and get a scored verdict with a "Director's Cut"
	walkthrough of how the crime really went down.

	## 🧠 How it works

	| Layer | What it does |
	|---|---|
	| Model: Qwen2.5-1.5B-Instruct (GGUF) | The whole game. Runs in-process on the CPU through llama.cpp (`llama-cpp-python`): no server, no GPU, no remote endpoint. |
	| Generation | The model authors every case as JSON; deterministic Python only wires the structure (who's guilty, who was where) so the mystery is always solvable. |
	| Solver | A fairness referee: single culprit, a breakable alibi, every innocent cleared, and a discoverability gate so the key clue is always findable in play. |
	| Director | Whether a lie gets caught is decided by ground truth, not the model, so the win condition is immune to prose (a jailbroken "just tell me who did it" earns nothing). |
	| Voice: Supertonic | Each suspect gets a distinct, gender-matched on-device voice, synthesized sentence-by-sentence as the reply streams. |
	| Art | Procedural pixel-art portraits, rooms, and evidence, rendered client-side on canvas at one integer-scaled density (so the server spends ~0 CPU on visuals). |
	| UI | A custom pixel-art noir SPA (Preact), 12 screens, served 100% through `gradio.Server` (Gradio 6 "Server mode"): the built bundle as static files plus the JSON/SSE `/api` routes, all in one process. No separate frontend host. |
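
The Solver row above can be pictured as a pure-Python check over a generated case. The sketch below is only an illustration: the function name `check_fairness` and every field name (`suspects`, `guilty`, `alibi_broken_by`, `cleared_by`, `clues`, `findable`) are assumptions made for this example, not the project's actual schema.

```python
from typing import Any


def check_fairness(case: dict[str, Any]) -> list[str]:
    """Return a list of fairness violations; an empty list means the case passes.

    Hypothetical schema (an assumption, not the project's real one):
      case["suspects"]: list of {"name", "guilty", "alibi_broken_by", "cleared_by"}
      case["clues"]:    list of {"id", "findable"}
    """
    problems: list[str] = []
    suspects = case.get("suspects", [])
    # Discoverability gate: only clues the player can actually find count.
    findable = {c["id"] for c in case.get("clues", []) if c.get("findable")}

    # Exactly one culprit.
    culprits = [s for s in suspects if s.get("guilty")]
    if len(culprits) != 1:
        problems.append(f"expected exactly 1 culprit, found {len(culprits)}")

    for s in suspects:
        if s.get("guilty"):
            # The culprit's alibi must be breakable by a findable clue.
            if s.get("alibi_broken_by") not in findable:
                problems.append(f"{s['name']}: alibi-breaking clue is not findable")
        else:
            # Every innocent must be cleared by a findable clue.
            if s.get("cleared_by") not in findable:
                problems.append(f"{s['name']}: innocent suspect is never cleared")
    return problems
```

Under these assumptions, a generation loop could regenerate or repair a case until `check_fairness` returns an empty list.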

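The Voice row mentions synthesizing speech sentence-by-sentence while the reply streams. A minimal sketch of that buffering step is shown here, assuming the reply arrives as an iterable of text chunks; `sentences_from_stream` is a hypothetical helper, and the actual call into Supertonic is not shown because its API is not described in this README.

```python
import re
from typing import Iterable, Iterator

# A sentence ends at ., ! or ? (optionally followed by a closing quote),
# then at least one whitespace character.
_SENTENCE_END = re.compile(r'([.!?]"?)\s+')


def sentences_from_stream(chunks: Iterable[str]) -> Iterator[str]:
    """Yield complete sentences as soon as they appear in a stream of text chunks."""
    buffer = ""
    for chunk in chunks:
        buffer += chunk
        while True:
            match = _SENTENCE_END.search(buffer)
            if match is None:
                break
            # Keep the terminator, drop the whitespace that follows it.
            yield buffer[: match.end(1)].strip()
            buffer = buffer[match.end():]
    # Flush whatever is left once the stream ends.
    if buffer.strip():
        yield buffer.strip()
```

Each yielded sentence could be handed to the TTS engine while the model is still generating the next one. The split rule is deliberately naive: abbreviations such as "Dr." would be treated as sentence ends.
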
	The model does all the creative work. Deterministic code is only guardrails and a
	reliability layer — it never writes story, character, or dialogue.
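
As an illustration of such a guardrail, the Director's rule (a lie is caught by ground truth, never by what the player types) could look like the sketch below. The `Suspect` fields, the `present_evidence` function, and the stress increment of 40 are all assumptions invented for this example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Suspect:
    name: str
    guilty: bool
    alibi_broken_by: Optional[str] = None  # id of the clue that contradicts the alibi
    alibi_broken: bool = False
    stress: int = 0  # hypothetical 0-100 meter


def present_evidence(suspect: Suspect, clue_id: str, player_text: str = "") -> bool:
    """Decide from case data alone whether the presented clue breaks the alibi.

    `player_text` is accepted but deliberately ignored, so no wording
    (including a jailbreak attempt) can change the outcome.
    """
    del player_text
    caught = (
        suspect.alibi_broken_by is not None
        and clue_id == suspect.alibi_broken_by
    )
    if caught:
        suspect.alibi_broken = True
        suspect.stress = min(100, suspect.stress + 40)  # illustrative value
    return caught
```

In a design like this, the model would only be asked to voice the result (a cracked or intact alibi) after the deterministic decision has been made.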

	## 🏆 Built for the Build Small Hackathon

	- Tiny Titan (≤4B): the entire game runs on Qwen2.5-1.5B — ~1.6B total runtime
	params (LLM + Supertonic), far under the 32B cap.
	- Llama Champion: the model runs through the llama.cpp runtime, in-process — no
	server, no remote endpoint.
	- Off-Brand: a fully custom pixel-art frontend, served through `gradio.Server`.
	- All models are open-weights and self-run. No third-party AI APIs are ever called.

	See [COMPLIANCE.md](COMPLIANCE.md) for the full parameter budget and badge details.

	## ▶️ Run it locally

	```bash
	# 1. backend deps + open weights
	python -m venv .venv && .venv/Scripts/pip install -r requirements.txt # Windows
	# macOS/Linux: python -m venv .venv && .venv/bin/pip install -r requirements.txt
	python scripts/fetch_models.py # one-time: fetch the open GGUF + Supertonic

	# 2. build the pixel-art frontend bundle (served by gradio.Server from web/dist)
	cd web && npm install && npm run build && cd ..

	# 3. run — open http://127.0.0.1:7860
	python app.py
	```

	The game runs entirely on the CPU: laptop or Space, same code, no GPU required.
	(In the Docker/Space build, the model and frontend steps happen automatically: a Node
	stage builds the bundle, and the Python stage compiles llama.cpp and bakes the weights.)

	## 🙏 Credits

	- LLM: [Qwen2.5-1.5B-Instruct](https://huggingface.co/Qwen/Qwen2.5-1.5B-Instruct) (Apache-2.0), via llama.cpp.
	- Voices: Supertonic on-device TTS.
	- Music: "Backbay Lounge" by Kevin MacLeod (incompetech.com), licensed under
	[Creative Commons Attribution 4.0](https://creativecommons.org/licenses/by/4.0/).
	- Fonts: Silkscreen & Pixelify Sans (SIL Open Font License), self-hosted.
	- Pixel art and UI sound effects: procedurally generated.