Spaces:

build-small-hackathon
/

dreadzone

Runtime error

App Files Files Community

dreadzone / README.md

grimjim

Update README.md

cb6ffb8 verified 15 days ago

preview code

Raw

History Blame Contribute Delete

2.09 kB

	---
	hackathon: Build Small (2026)
	title: Dreadzone
	emoji: 💬
	colorFrom: yellow
	colorTo: red
	sdk: gradio
	sdk_version: 6.5.1
	app_file: app.py
	pinned: false
	suggested_hardware: t4-small
	license: artistic-2.0
	short_description: Backrooms-inspired local GGUF experience
	team:
	- grimjim
	tags:
	- track:wood
	- sponsor:openai
	- sponsor:nvidia
	- achievement:offgrid
	- achievement:llama
	social_media_post: https://www.linkedin.com/posts/jim-lai-038249_i-participated-in-the-build-small-hackathon-share-7472113354073853952-LA39/
	---
	An entry for the Build Small Hackathon (2026)
	The track taken: Thousand Token Wood

	Dreadzone is a Backrooms-inspired interactive fiction prototype that runs a
	local GGUF model with `llama-cpp-python` and Gradio ChatInterface.

	The app downloads
	[`unsloth/NVIDIA-Nemotron-3-Nano-4B-GGUF`](https://huggingface.co/unsloth/NVIDIA-Nemotron-3-Nano-4B-GGUF)
	automatically on first launch and streams responses from
	`NVIDIA-Nemotron-3-Nano-4B-Q5_K_M.gguf`.

	No hosted inference API, OAuth token, secrets, or external inference services are
	used. The default dependency pin uses the CUDA 12.4 `llama-cpp-python` wheel for
	GPU Spaces.

	The Python app owns the lightweight game state: coordinates, turn count, sanity,
	zone profile, and encounter rolls. The model receives hidden state each turn and
	narrates the result without exposing coordinates or mechanics. There are a few
	surprises to keep players on their toes.

	## Runtime settings

	The defaults are intentionally conservative while enabling GPU offload:

	- `N_CTX=2048`
	- `N_BATCH=128`
	- `MAX_HISTORY_TURNS=6`
	- `GAME_SEED=dreadzone`
	- `N_THREADS` defaults to one fewer than the detected CPU count
	- `N_GPU_LAYERS=-1` offloads all possible layers to GPU
	- `ENABLE_THINKING=false` renders the model chat template with thinking disabled

	You can override the model or runtime settings with Space variables:

	- `MODEL_REPO`
	- `MODEL_FILE`
	- `MODEL_DIR`
	- `GAME_SEED`
	- `N_CTX`
	- `N_BATCH`
	- `N_THREADS`
	- `N_GPU_LAYERS`
	- `ENABLE_THINKING`
	- `MAX_HISTORY_TURNS`

	## Author

	grimjim@huggingface

	Assisted by Codex