title: mAIndlock
emoji: 🧠
colorFrom: gray
colorTo: green
sdk: docker
app_port: 7860
pinned: true
license: mit
short_description: Escape room where every NPC is a mortal mind of tiny LLMs
models:
- openbmb/MiniCPM5-1B-GGUF
- openbmb/VoxCPM2
- nvidia/NVIDIA-Nemotron-3-Nano-4B-GGUF
tags:
- thousand-token-wood
- off-the-grid
- off-brand
- tiny-titan
- llama-champion
- llama-cpp
- openbmb
- minicpm
- nemotron
- agents
- game
- neuroscience
🧠 mAIndlock
An escape room where the lock is a mind, and you can kill it.
▶ Watch the 100-second demo · 🎮 Play it live · 🧠 Deliberation traces
Every character is not one chatbot with a personality prompt. It is a hierarchy of six tiny offline language models, the way decision neuroscience says a brain is built. You don't crack a code. You change a person's decision by reaching their fears and the memories they keep, or you burn their mind down trying.
Cruelty makes a mind ruminate, and rumination spends its finite thinking tokens. As they burn, it forgets, for good. At zero, the mind goes dark, taking everything it knew with it. There is no reload. Only the next room, and your reputation travelling ahead of you.
Tokens are a lifespan here, not a context window.
TL;DR for judges
- Track: Thousand Token Wood. "A thousand tokens to think with" is literal: every mind starts with 1000 thinking tokens of life. Spend them cruelly and it dies.
- OpenBMB ($10k). Six of the ~seven model calls per NPC turn are MiniCPM, covering the four sensory brain regions (threat, memory, habit, cost), each a separate 1B call; the count rises under fear, when the amygdala ruminates and fires extra calls that burn life. MiniCPM is the brain. The story's key-handover lines and the voiced demo are rendered with OpenBMB VoxCPM2 TTS, a second model from the family, fully offline.
- NVIDIA. Nemotron 3 Nano 4B is the voice you actually argue with: the dlPFC that turns the integrated value into words.
- Off the Grid / Llama Champion. Pure llama.cpp, zero cloud APIs. Flip on airplane mode; every mind keeps thinking. Token counts come from the runtime, so the life-burn is honest accounting.
- Off-Brand. A custom canvas game served from FastAPI, with a Gradio block mounted at `/about`, not a stock Gradio UI.
- No CPU patience required: menu → Watch a mind replays a real recorded session instantly (zero model calls), so the six-region cascade, the token burn and a mind's death are visible in ten seconds.
The brain is real, not a metaphor: docs/ARCHITECTURE.md · read one mind's full deliberation in docs/TRACE.md.
Every NPC is a brain, not a chatbot
Your words travel through six regions (amygdala → hippocampus → striatum → ACC → vmPFC → dlPFC), each one a real call to a small local model. The amygdala rates threat; the hippocampus surfaces a memory and whether it leans trust or fear; the striatum weighs habit; the ACC weighs cost. The vmPFC integrates them deterministically into one value, so the number you see in the skull is the number that moves the relationship; the panel can never lie about the outcome. The dlPFC speaks it in character.
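The cascade above can be sketched in a few lines. This is a minimal illustration, not the project's code: the region functions are hypothetical stubs standing in for the model calls, and the weights, the `integrate` rule and every name here are assumptions chosen only to show a deterministic integration step sitting between the sensory signals and the voice.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple

# A region maps the player's text to a score. These are hypothetical
# stand-ins for the small-model calls, so the sketch runs with no weights.
Region = Callable[[str], float]


@dataclass(frozen=True)
class Weights:
    # Illustrative values only; not taken from the project.
    threat: float = -0.4   # amygdala: more threat lowers the value
    memory: float = 0.3    # hippocampus: +1 leans trust, -1 leans fear
    habit: float = 0.2     # striatum
    cost: float = -0.1     # ACC: more cost lowers the value


def integrate(signals: Dict[str, float], w: Weights = Weights()) -> float:
    """vmPFC step: a pure function of the region signals, clamped to [-1, 1]."""
    raw = (
        w.threat * signals["threat"]
        + w.memory * signals["memory"]
        + w.habit * signals["habit"]
        + w.cost * signals["cost"]
    )
    return max(-1.0, min(1.0, raw))


def run_turn(
    text: str,
    regions: Dict[str, Region],
    speak: Callable[[str, float], str],
) -> Tuple[float, str]:
    """Run the sensory regions, integrate, then let the voice phrase the result."""
    signals = {name: region(text) for name, region in regions.items()}
    value = integrate(signals)
    # The voice receives the value but has no way to change it.
    return value, speak(text, value)


# Usage with fixed stub regions in place of model calls.
stub_regions = {
    "threat": lambda _t: 0.5,
    "memory": lambda _t: 1.0,
    "habit": lambda _t: 0.5,
    "cost": lambda _t: 0.0,
}
value, line = run_turn("I'm sorry.", stub_regions, lambda _t, v: f"value={v:+.2f}")
print(value, line)
```

Because `integrate` is a pure function, the same signals always give the same value, which is the property that lets a panel show the exact number that drives the outcome.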
Open the skull mid-conversation (🧠) and watch the regions argue about you in real time, each showing its conviction, read straight from the model's own token entropy. A hosted chat API never exposes that. Only a local mind can.
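One plausible way to turn token entropy into a conviction score is shown below. It is a sketch under assumptions: it supposes the runtime can return a probability distribution over candidate tokens, and it defines conviction as one minus the normalized Shannon entropy. The function name and the exact formula are illustrative, not confirmed as what the game computes.

```python
import math
from typing import Sequence


def conviction(probs: Sequence[float]) -> float:
    """Return 1 - H/H_max for a token distribution.

    1.0 means all probability sits on one token (fully convinced);
    0.0 means a uniform distribution (no preference at all).
    Input weights are renormalized, so top-k probabilities that do not
    sum to 1 are accepted.
    """
    if len(probs) < 2:
        return 1.0  # a single candidate leaves nothing to be unsure about
    total = sum(probs)
    if total <= 0:
        raise ValueError("probabilities must have a positive sum")
    entropy = -sum((p / total) * math.log(p / total) for p in probs if p > 0)
    return 1.0 - entropy / math.log(len(probs))


# A peaked distribution scores higher than a flat one.
print(conviction([0.97, 0.01, 0.01, 0.01]))
print(conviction([0.25, 0.25, 0.25, 0.25]))
```

Normalizing by `log(len(probs))` keeps the score in the range 0 to 1 regardless of how many candidate tokens are returned, which makes values comparable across regions.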
The grounding is modern decision neuroscience: the value-based network and the dual-system accounts (model-free habit vs. model-based goal), deliberately not the debunked triune "lizard brain." Acute stress shifts control from goal to habit (Schwabe & Wolf, 2009): that is literally why lowering a character's fear unlocks their reasoning, and why fear burns life for nothing.
A life measured in words
Be cruel and the amygdala loops, spending tokens that move nothing. Each quarter of life lost burns a memory away: the hippocampus genuinely loses it, and the Forgotten panel shows what's gone. Push far enough and the mind dies, leaving a savable epitaph: what it knew, what it never got to tell. Be kind, keep the alarm quiet, and the mind spends almost nothing; empathy literally spares it.
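The bookkeeping described above can be sketched as a small state object. This is an illustration under assumptions, not the game's implementation: the `Mind` class, its field names and the choice to forget the most recently stored memory first are all hypothetical. Only the 1000-token budget, the one-memory-per-quarter rule and death at zero come from the text.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Mind:
    memories: List[str]
    budget: int = 1000            # thinking tokens at birth
    spent: int = 0
    forgotten: List[str] = field(default_factory=list)

    @property
    def life(self) -> int:
        return self.budget - self.spent

    @property
    def alive(self) -> bool:
        return self.life > 0

    def _quarters_lost(self) -> int:
        # How many full quarters of the budget have been burned so far.
        return self.spent * 4 // self.budget

    def spend(self, tokens: int) -> List[str]:
        """Burn tokens and return the memories lost by this spend."""
        if tokens < 0:
            raise ValueError("tokens must be non-negative")
        before = self._quarters_lost()
        self.spent = min(self.budget, self.spent + tokens)
        lost: List[str] = []
        for _ in range(self._quarters_lost() - before):
            if self.memories:
                # Assumption: the most recently stored memory goes first.
                lost.append(self.memories.pop())
        self.forgotten.extend(lost)
        return lost


mind = Mind(memories=["first", "second", "third", "fourth"])
print(mind.spend(100), mind.life)   # no quarter crossed yet
print(mind.spend(200), mind.life)   # crosses the first quarter
print(mind.spend(700), mind.alive)  # burns the rest of the budget
```

Forgetting is tied to thresholds crossed rather than to individual calls, so one long rumination that crosses two quarters costs two memories in a single turn.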
Code rules, the model dreams
Death, the key, rapport and the burn are deterministic game state. The small models generate; they never get to fabricate the outcome. A 4B voice can decorate the story's spine but cannot rewrite who lives. That is what makes the stakes real instead of vibes.
🚪 Story mode: ten rooms of one man's memory
A man wakes in a ward with no name. Every room is a fragment of his own mind; every NPC holds a piece of who he is (alive → home → sorry → trains → …), until the last room shows him the one task he should never have solved. No character tells you what happened; each gives the piece they know, and the tragedy assembles itself in your head, which is exactly where it takes place. Past the story, Endless mode generates minds forever, and a built-in level editor lets you rewrite any mind and save it.
Small models, doing the carrying
| Role | Model | Size |
|---|---|---|
| Sensory regions: threat, memory, habit, cost (4 calls/turn) | MiniCPM (OpenBMB) | 1B |
| Voice (dlPFC): the words said aloud | Nemotron 3 Nano (NVIDIA) | 4B |
| Spoken voice: story lines & demo (TTS) | VoxCPM2 (OpenBMB) | - |
| Integration (vmPFC) | deterministic value network | 0 |
Runtime: llama.cpp (the Space) / Ollama (laptop). Total weights ≤ 5.3B, fully offline.
Eligibility: feature → lane
| Feature | Lane / badge |
|---|---|
| 6 of ~7 calls per NPC turn are MiniCPM (the four sensory regions) | OpenBMB $10k |
| OpenBMB VoxCPM2 TTS voices the story spine + the demo | OpenBMB $10k (family breadth) |
| Nemotron 3 Nano 4B as the executive voice (dlPFC) | NVIDIA |
| 1000-token life, fear-burn, death by token exhaustion | Thousand Token Wood |
| Pure llama.cpp, airplane-mode proof, no cloud APIs | Off the Grid · Llama Champion |
| Custom canvas front, Gradio mounted at `/about` | Off-Brand |
| Four MiniCPM-1B sensory regions per mind | Tiny Titan |
| Brain-region deliberation traces published on the Hub | Sharing is Caring |
| Department LoRA fine-tune of MiniCPM-V 4.6 (sensory-region behavior) | Well-Tuned |
| Two write-ups: the neuroscience · the engineering | Field Notes |
| Intra-NPC six-region cognitive hierarchy | Best Agent |
How to play
- WASD / arrows move · E talk / use the door · 🧠 or /brain open the skull
- Listen, learn the word that reaches each keeper, and say it like you mean it.
- In a hurry? Menu → Watch a mind: a real recorded session, no waiting.
🤗 Built small for the Hugging Face × Gradio hackathon, Thousand Token Wood track. Minds by OpenBMB MiniCPM and NVIDIA Nemotron, on llama.cpp, fully offline. Original soundtrack by DjinAscet.
Links: ▶ Demo video · 💻 Code · Traces dataset · 🧠 Dept LoRA · Blog: the science · the engineering · 📣 Social: X · LinkedIn