Spaces:
Runtime error
Runtime error
File size: 2,093 Bytes
d2076fc bdf3624 89e66bf d2076fc 3f45f47 7f0712e 441317c 7f0712e d2076fc c75d321 89e66bf 32d6660 cb6ffb8 bdf3624 58f63db 0bbb564 d2076fc 10f3850 d2076fc 32d6660 3f45f47 c75d321 3f45f47 32d6660 10f3850 32d6660 c75d321 3f45f47 c75d321 3f45f47 32d6660 3f45f47 c75d321 8bef568 3f45f47 32d6660 3f45f47 c75d321 8bef568 3f45f47 10f3850 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 | ---
hackathon: Build Small (2026)
title: Dreadzone
emoji: 💬
colorFrom: yellow
colorTo: red
sdk: gradio
sdk_version: 6.5.1
app_file: app.py
pinned: false
suggested_hardware: t4-small
license: artistic-2.0
short_description: Backrooms-inspired local GGUF experience
team:
- grimjim
tags:
- track:wood
- sponsor:openai
- sponsor:nvidia
- achievement:offgrid
- achievement:llama
social_media_post: https://www.linkedin.com/posts/jim-lai-038249_i-participated-in-the-build-small-hackathon-share-7472113354073853952-LA39/
---
An entry for the Build Small Hackathon (2026)
The track taken: Thousand Token Wood
Dreadzone is a Backrooms-inspired interactive fiction prototype that runs a
local GGUF model with `llama-cpp-python` and Gradio ChatInterface.
The app downloads
[`unsloth/NVIDIA-Nemotron-3-Nano-4B-GGUF`](https://huggingface.co/unsloth/NVIDIA-Nemotron-3-Nano-4B-GGUF)
automatically on first launch and streams responses from
`NVIDIA-Nemotron-3-Nano-4B-Q5_K_M.gguf`.
No hosted inference API, OAuth token, secrets, or external inference services are
used. The default dependency pin uses the CUDA 12.4 `llama-cpp-python` wheel for
GPU Spaces.
The Python app owns the lightweight game state: coordinates, turn count, sanity,
zone profile, and encounter rolls. The model receives hidden state each turn and
narrates the result without exposing coordinates or mechanics. There are a few
surprises to keep players on their toes.
## Runtime settings
The defaults are intentionally conservative while enabling GPU offload:
- `N_CTX=2048`
- `N_BATCH=128`
- `MAX_HISTORY_TURNS=6`
- `GAME_SEED=dreadzone`
- `N_THREADS` defaults to one fewer than the detected CPU count
- `N_GPU_LAYERS=-1` offloads all possible layers to GPU
- `ENABLE_THINKING=false` renders the model chat template with thinking disabled
You can override the model or runtime settings with Space variables:
- `MODEL_REPO`
- `MODEL_FILE`
- `MODEL_DIR`
- `GAME_SEED`
- `N_CTX`
- `N_BATCH`
- `N_THREADS`
- `N_GPU_LAYERS`
- `ENABLE_THINKING`
- `MAX_HISTORY_TURNS`
## Author
grimjim@huggingface
Assisted by Codex |