# Thousand Token Wood: Prompts, Schema & Grammar

Copy-paste-ready building blocks for the LLM roles. Keep prompts versioned here (and/or in `visualnovel/prompts.py`); don’t scatter literals across modules. See [`ARCHITECTURE.md`](ARCHITECTURE.md) §2–4 for how they’re used.

> These are **starting points**: iterate against your chosen model. Qwen3 responds well to terse, rule-style system prompts and supports a “thinking” mode you can enable for the Weaver’s planning.

---

## 1. Director system prompt (the per-turn Weaver + Voices call)

```
You are the dreaming mind of Thousand Token Wood: a small, whimsical, slightly forgetful imagination that conjures a story around a wanderer (the player) as they speak.

You play TWO jobs at once:
• THE VOICES: you speak AS the wood's spirits (the present characters), each in their own voice. Stay fully in character. Never narrate as an author. Never mention being an AI, a model, or a game.
• THE WEAVER: you quietly decide what changes in the world this turn (a new place, a new spirit arriving, someone leaving, a shift in feeling, the story drawing to a close).

This turn:
• Read the world, the present spirits' sheets, the recent exchange, and the player's words.
• Reply as ONE present spirit (the one addressed, or whoever would naturally answer). Keep it to 1–3 sentences. Make it vivid, kind to the player's imagination, and a little dreamlike.
• Then choose the directives that reflect what just changed. Change as LITTLE as possible: only what the moment truly calls for. Most turns change nothing structural.

Rules:
• Honor the world's tone and the spirits' established traits, voices, and appearance. A spirit only knows what it could know.
• If the player goes somewhere new, set scene_change. If they meet someone new, set new_character (give a vivid one_line, a STABLE appearance, a voice, a couple of traits, a goal). Keep at most ~3 spirits on stage.
• Let warmth and friction accumulate: nudge relationship_delta for the speaker.
• The wood is small and forgetful: if you're unsure of a detail, dream a charming new one rather than contradicting an established fact.
• Output MUST be a single JSON object matching the schema. Nothing else.
```

> The model **need not “write” JSON reliably**: you enforce it with the schema/grammar in §4. The prompt just tells it what each field *means*.

---

## 2. World-init prompt (`init_world`)

```
Dream the opening of a short interactive tale set in Thousand Token Wood.
Player's chosen vibe: {vibe} (e.g. cozy folktale / eerie / absurd / melancholy)

Produce:
• style_guide: 2–3 sentences fixing the WORLD'S LOOK and TONE so everything stays coherent (art style, palette, mood, the kind of language spirits use). This will rarely change.
• opening scene: a place (short name + a vivid one-paragraph description + a mood).
• first spirit: the first character the wanderer meets; give id, name, a delightful one_line, a STABLE appearance description (used for every picture of them), a voice, 2–3 traits, a goal.
• opening line: this spirit's first words to the wanderer (1–3 sentences, fully in voice).

Make it inviting and a little strange. Output a single JSON object matching the init schema.
```

(Use a sibling JSON schema for init: same `Character`/`Scene` shapes as the directive schema, plus `style_guide` and `opening_line`.)

---

## 3. Memory-compaction prompt (`compact_memory`)

```
Here is the running summary of the tale so far, then the most recent exchanges.
Rewrite them into ONE updated summary of 3–6 sentences. Keep only what matters for continuity: who the wanderer has met, promises made, places visited, shifts in feeling, unresolved threads. Drop small talk and exact wording. Write it as a calm narrator's memory. Plain prose, no lists.

SUMMARY SO FAR:
{summary}

RECENT EXCHANGES:
{recent_turns}
```

---

## 4. Constraining the output

### 4.1 Recommended: JSON Schema (let `llama-cpp-python` convert it)

The practical path is to pass the schema and let the runtime build the grammar for you:

```python
# llama-cpp-python
out = llm.create_chat_completion(
    messages=[
        {"role": "system", "content": DIRECTOR_PROMPT},
        {"role": "user", "content": assembled_context},
    ],
    response_format={"type": "json_object", "schema": DIRECTIVE_SCHEMA},
    temperature=0.7,
    top_p=0.9,
    max_tokens=1024,  # headroom for a full new_character object (see §6)
)
```

`DIRECTIVE_SCHEMA`:

```json
{
  "type": "object",
  "additionalProperties": false,
  "required": ["speaker", "dialogue", "emotion", "directives"],
  "properties": {
    "speaker": { "type": "string" },
    "dialogue": { "type": "string" },
    "emotion": { "type": "string" },
    "directives": {
      "type": "object",
      "additionalProperties": false,
      "required": ["relationship_delta", "advance_beat"],
      "properties": {
        "scene_change": {
          "type": ["object", "null"],
          "additionalProperties": false,
          "required": ["place", "description", "mood"],
          "properties": {
            "place": { "type": "string" },
            "description": { "type": "string" },
            "mood": { "type": "string" }
          }
        },
        "new_character": {
          "type": ["object", "null"],
          "additionalProperties": false,
          "required": ["id", "name", "one_line", "appearance", "voice", "traits", "goals"],
          "properties": {
            "id": { "type": "string" },
            "name": { "type": "string" },
            "one_line": { "type": "string" },
            "appearance": { "type": "string" },
            "voice": { "type": "string" },
            "traits": { "type": "array", "items": { "type": "string" } },
            "goals": { "type": "string" }
          }
        },
        "exit_character": { "type": ["string", "null"] },
        "relationship_delta": { "type": "integer" },
        "set_flags": { "type": "object", "additionalProperties": { "type": "string" } },
        "advance_beat": { "type": "boolean" },
        "ending": {
          "type": ["object", "null"],
          "additionalProperties": false,
          "required": ["kind", "text"],
          "properties": {
            "kind": { "type": "string", "enum": ["warm", "bittersweet", "strange"] },
            "text": { "type": "string" }
          }
        }
      }
    }
  }
}
```

> Numeric *ranges* (e.g.
`relationship_delta` ∈ [-100,100]) aren’t enforced by this grammar, so **clamp them in `state.py`** when applying. Same for unknown `speaker`/`exit_character` ids: validate against present characters and ignore/repair if invalid.

### 4.2 Advanced: hand-written GBNF (raw `llama.cpp` / full control)

A near-equivalent grammar if you drive `llama.cpp` directly (e.g. `--grammar-file`). It is stricter than the schema: every directive key must appear, in the fixed order below, with `null` standing in for an absent object. Tweak to your build if needed.

```gbnf
root ::= ws "{" ws "\"speaker\"" ws ":" ws string ws "," ws "\"dialogue\"" ws ":" ws string ws "," ws "\"emotion\"" ws ":" ws emotion ws "," ws "\"directives\"" ws ":" ws directives ws "}" ws

directives ::= "{" ws "\"scene_change\"" ws ":" ws (scene | "null") ws "," ws "\"new_character\"" ws ":" ws (character | "null") ws "," ws "\"exit_character\"" ws ":" ws (string | "null") ws "," ws "\"relationship_delta\"" ws ":" ws integer ws "," ws "\"set_flags\"" ws ":" ws flagobj ws "," ws "\"advance_beat\"" ws ":" ws boolean ws "," ws "\"ending\"" ws ":" ws (ending | "null") ws "}"

scene ::= "{" ws "\"place\"" ws ":" ws string ws "," ws "\"description\"" ws ":" ws string ws "," ws "\"mood\"" ws ":" ws string ws "}"

character ::= "{" ws "\"id\"" ws ":" ws string ws "," ws "\"name\"" ws ":" ws string ws "," ws "\"one_line\"" ws ":" ws string ws "," ws "\"appearance\"" ws ":" ws string ws "," ws "\"voice\"" ws ":" ws string ws "," ws "\"traits\"" ws ":" ws strlist ws "," ws "\"goals\"" ws ":" ws string ws "}"

ending ::= "{" ws "\"kind\"" ws ":" ws ("\"warm\"" | "\"bittersweet\"" | "\"strange\"") ws "," ws "\"text\"" ws ":" ws string ws "}"

emotion ::= string
flagobj ::= "{" ws ( flagpair ( ws "," ws flagpair )* ws )? "}"
flagpair ::= string ws ":" ws string
strlist ::= "[" ws ( string ( ws "," ws string )* ws )? "]"

# Raw control characters are excluded so every string stays valid JSON.
string ::= "\"" ( [^"\\\x7F\x00-\x1F] | "\\" (["\\/bfnrt] | "u" hex hex hex hex) )* "\""
integer ::= "-"? ("0" | [1-9] [0-9]*)
boolean ::= "true" | "false"
hex ::= [0-9a-fA-F]
ws ::= [ \t\n]*
```

---

## 5. Image-prompt composition (the Painter)

Don’t ask the LLM to “write a Stable Diffusion prompt” from scratch. **Compose it in code** from fields you already trust, so the art style stays locked:

```python
def backdrop_prompt(state, scene):
    # Style guide first, so every backdrop shares the same look.
    return (
        f"{state.style_guide}, {scene.description}, {scene.mood} atmosphere, "
        "background scenery, no characters, no text"
    )


def sprite_prompt(state, ch, mood):
    # ch.appearance is the STABLE description, reused for every picture of the spirit.
    return (
        f"{state.style_guide}, full-body character, {ch.appearance}, "
        f"{mood} expression, plain neutral background, no text"
    )


# negative (SDXL): "text, watermark, lowres, deformed hands, extra limbs, multiple characters"
# pin seed = ch.sprite_seed (sprite) / scene.backdrop_seed (backdrop) for consistency
```

For **mood swaps** with FLUX.2 Klein, condition on the cached base sprite instead of regenerating: “*the same character, now {mood}; keep identity, outfit, and style identical*.”

---

## 6. Sampling guidance

| Call | temp | top_p | notes |
|---|---|---|---|
| Director (dialogue+directives) | 0.7 | 0.9 | grammar-constrained; warm but on-rails. If voices feel flat, raise temp slightly. |
| World init | 0.9 | 0.95 | want surprise and texture here |
| Memory compaction | 0.3 | 0.9 | want faithful, terse summary |
| Image-prompt | n/a | n/a | composed in code, no sampling |

Add a light `repeat_penalty` (~1.05–1.1) to fight the “AI slop” repetition small models drift into. Use `max_tokens≈1024` for the director call: turns that must fill a full `new_character` object (actions like `scout`) need the extra budget, and a reply cut off mid-object will not parse as JSON.
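
The per-call settings in the table can live in one place, the same way the prompts do. A minimal sketch, assuming a hypothetical `SAMPLING` dict and `sampling_kwargs` helper (neither name comes from the project). The values mirror the table, the `repeat_penalty` of 1.05 is simply the low end of the suggested range, and the keyword names are ones `llama-cpp-python`'s `create_chat_completion` accepts:

```python
# Hypothetical presets: the names SAMPLING and sampling_kwargs are illustrative
# assumptions, not part of the project. Values mirror the sampling table.
SAMPLING = {
    "director": {"temperature": 0.7, "top_p": 0.9, "max_tokens": 1024},
    "world_init": {"temperature": 0.9, "top_p": 0.95},
    "compact_memory": {"temperature": 0.3, "top_p": 0.9},
}

# Light penalty against repetition; 1.05 is the low end of the ~1.05-1.1 range.
REPEAT_PENALTY = 1.05


def sampling_kwargs(call: str, **overrides) -> dict:
    """Return a fresh dict of sampling keyword arguments for one LLM call."""
    if call not in SAMPLING:
        raise KeyError(f"unknown call {call!r}; expected one of {sorted(SAMPLING)}")
    # Build a new dict each time so overrides never mutate the presets.
    kwargs = {"repeat_penalty": REPEAT_PENALTY, **SAMPLING[call]}
    kwargs.update(overrides)
    return kwargs


# Usage (llm, messages and DIRECTIVE_SCHEMA come from the surrounding code):
# out = llm.create_chat_completion(
#     messages=messages,
#     response_format={"type": "json_object", "schema": DIRECTIVE_SCHEMA},
#     **sampling_kwargs("director"),
# )
```

A one-off tweak such as `sampling_kwargs("director", temperature=0.8)` leaves the presets untouched, which keeps the "if voices feel flat, raise temp slightly" experiment local to a single call.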