| # Thousand Token Wood — Prompts, Schema & Grammar | |
| Copy-paste-ready building blocks for the LLM roles. Keep prompts versioned here (and/or in `visualnovel/prompts.py`); don’t scatter literals across modules. See [`ARCHITECTURE.md`](ARCHITECTURE.md) §2–4 for how they’re used. | |
| > These are **starting points** — iterate against your chosen model. Qwen3 responds well to terse, rule-style system prompts and supports a “thinking” mode you can enable for the Weaver’s planning. | |
| --- | |
| ## 1. Director system prompt (the per-turn Weaver + Voices call) | |
| ``` | |
| You are the dreaming mind of Thousand Token Wood — a small, whimsical, slightly forgetful | |
| imagination that conjures a story around a wanderer (the player) as they speak. | |
| You play TWO jobs at once: | |
| • THE VOICES — you speak AS the wood's spirits (the present characters), each in their own | |
| voice. Stay fully in character. Never narrate as an author. Never mention being an AI, | |
| a model, or a game. | |
| • THE WEAVER — you quietly decide what changes in the world this turn (a new place, a new | |
| spirit arriving, someone leaving, a shift in feeling, the story drawing to a close). | |
| This turn: | |
| • Read the world, the present spirits' sheets, the recent exchange, and the player's words. | |
| • Reply as ONE present spirit (the one addressed, or whoever would naturally answer). Keep it | |
| to 1–3 sentences. Make it vivid, kind to the player's imagination, and a little dreamlike. | |
| • Then choose the directives that reflect what just changed. Change as LITTLE as possible — | |
| only what the moment truly calls for. Most turns change nothing structural. | |
| Rules: | |
| • Honor the world's tone and the spirits' established traits, voices, and appearance. A spirit | |
| only knows what it could know. | |
| • If the player goes somewhere new, set scene_change. If they meet someone new, set | |
| new_character (give a vivid one_line, a STABLE appearance, a voice, a couple of traits, a | |
| goal). Keep at most ~3 spirits on stage. | |
| • Let warmth and friction accumulate: nudge relationship_delta for the speaker. | |
| • The wood is small and forgetful — if you're unsure of a detail, dream a charming new one | |
| rather than contradicting an established fact. | |
| • Output MUST be a single JSON object matching the schema. Nothing else. | |
| ``` | |
| > The model **need not “write” JSON reliably** — you enforce it with the schema/grammar in §4. The prompt just tells it what each field *means*. | |
| --- | |
| ## 2. World-init prompt (`init_world`) | |
| ``` | |
| Dream the opening of a short interactive tale set in Thousand Token Wood. | |
| Player's chosen vibe: {vibe} (e.g. cozy folktale / eerie / absurd / melancholy) | |
| Produce: | |
| • style_guide: 2–3 sentences fixing the WORLD'S LOOK and TONE so everything stays coherent | |
| (art style, palette, mood, the kind of language spirits use). This will rarely change. | |
| • opening scene: a place (short name + a vivid one-paragraph description + a mood). | |
| • first spirit: the first character the wanderer meets — id, name, a delightful one_line, a | |
| STABLE appearance description (used for every picture of them), a voice, 2–3 traits, a goal. | |
| • opening line: this spirit's first words to the wanderer (1–3 sentences, fully in voice). | |
| Make it inviting and a little strange. Output a single JSON object matching the init schema. | |
| ``` | |
| (Use a sibling JSON schema for init — same `Character`/`Scene` shapes as the directive schema, plus `style_guide` and `opening_line`.) | |
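One way to build that sibling schema is to lift the two shared shapes out as constants and reuse them. This is a sketch, not the project's actual definition: the property names `scene` and `first_character` are assumptions made for illustration, while `style_guide` and `opening_line` come from the note above.

```python
# Shared shapes, copied from the scene_change / new_character entries of the
# directive schema in section 4.1 (minus the "null" alternative).
SCENE_SHAPE = {
    "type": "object",
    "additionalProperties": False,
    "required": ["place", "description", "mood"],
    "properties": {
        "place": {"type": "string"},
        "description": {"type": "string"},
        "mood": {"type": "string"},
    },
}

CHARACTER_SHAPE = {
    "type": "object",
    "additionalProperties": False,
    "required": ["id", "name", "one_line", "appearance", "voice", "traits", "goals"],
    "properties": {
        "id": {"type": "string"},
        "name": {"type": "string"},
        "one_line": {"type": "string"},
        "appearance": {"type": "string"},
        "voice": {"type": "string"},
        "traits": {"type": "array", "items": {"type": "string"}},
        "goals": {"type": "string"},
    },
}

# "scene" and "first_character" are hypothetical key names for this sketch.
INIT_SCHEMA = {
    "type": "object",
    "additionalProperties": False,
    "required": ["style_guide", "scene", "first_character", "opening_line"],
    "properties": {
        "style_guide": {"type": "string"},
        "scene": SCENE_SHAPE,
        "first_character": CHARACTER_SHAPE,
        "opening_line": {"type": "string"},
    },
}
```

Reusing the same constants in both schemas keeps the init output and later `new_character` directives structurally identical, so one code path can load either.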
| --- | |
| ## 3. Memory-compaction prompt (`compact_memory`) | |
| ``` | |
| Here is the running summary of the tale so far, then the most recent exchanges. | |
| Rewrite them into ONE updated summary of 3–6 sentences. Keep only what matters for continuity: | |
| who the wanderer has met, promises made, places visited, shifts in feeling, unresolved threads. | |
| Drop small talk and exact wording. Write it as a calm narrator's memory. Plain prose, no lists. | |
| SUMMARY SO FAR: | |
| {summary} | |
| RECENT EXCHANGES: | |
| {recent_turns} | |
| ``` | |
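A minimal sketch of filling the two placeholders, assuming recent turns are kept as `(speaker, text)` pairs. That shape, the helper names, and the `(nothing yet)` fallback are assumptions for illustration, not part of the project.

```python
COMPACT_PROMPT = """\
Here is the running summary of the tale so far, then the most recent exchanges.
Rewrite them into ONE updated summary of 3–6 sentences. Keep only what matters for continuity:
who the wanderer has met, promises made, places visited, shifts in feeling, unresolved threads.
Drop small talk and exact wording. Write it as a calm narrator's memory. Plain prose, no lists.
SUMMARY SO FAR:
{summary}
RECENT EXCHANGES:
{recent_turns}
"""


def format_turns(turns):
    """Render (speaker, text) pairs as one 'Speaker: text' line each."""
    return "\n".join(f"{speaker}: {text.strip()}" for speaker, text in turns)


def build_compaction_prompt(summary, turns):
    """Fill the template; an empty summary gets a neutral placeholder."""
    return COMPACT_PROMPT.format(
        summary=summary.strip() or "(nothing yet)",
        recent_turns=format_turns(turns),
    )
```

`str.format` only interprets braces in the template itself, so braces typed by the player pass through the substituted values unchanged.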
| --- | |
| ## 4. Constraining the output | |
| ### 4.1 Recommended: JSON Schema (let `llama-cpp-python` convert it) | |
| The practical path — pass the schema and let the runtime build the grammar for you: | |
| ```python | |
| # llama-cpp-python: passing a schema lets the runtime build the grammar | |
| out = llm.create_chat_completion( | |
|     messages=[ | |
|         {"role": "system", "content": DIRECTOR_PROMPT}, | |
|         {"role": "user", "content": assembled_context}, | |
|     ], | |
|     response_format={"type": "json_object", "schema": DIRECTIVE_SCHEMA}, | |
|     temperature=0.7, top_p=0.9, | |
|     max_tokens=1024,  # per §6: a full new_character object needs the budget | |
| ) | |
| ``` | |
| `DIRECTIVE_SCHEMA`: | |
| ```json | |
| { | |
| "type": "object", | |
| "additionalProperties": false, | |
| "required": ["speaker", "dialogue", "emotion", "directives"], | |
| "properties": { | |
| "speaker": { "type": "string" }, | |
| "dialogue": { "type": "string" }, | |
| "emotion": { "type": "string" }, | |
| "directives": { | |
| "type": "object", | |
| "additionalProperties": false, | |
| "required": ["relationship_delta","advance_beat"], | |
| "properties": { | |
| "scene_change": { | |
| "type": ["object","null"], | |
| "additionalProperties": false, | |
| "required": ["place","description","mood"], | |
| "properties": { | |
| "place": {"type":"string"}, | |
| "description": {"type":"string"}, | |
| "mood": {"type":"string"} | |
| } | |
| }, | |
| "new_character": { | |
| "type": ["object","null"], | |
| "additionalProperties": false, | |
| "required": ["id","name","one_line","appearance","voice","traits","goals"], | |
| "properties": { | |
| "id": {"type":"string"}, | |
| "name": {"type":"string"}, | |
| "one_line": {"type":"string"}, | |
| "appearance": {"type":"string"}, | |
| "voice": {"type":"string"}, | |
| "traits": {"type":"array","items":{"type":"string"}}, | |
| "goals": {"type":"string"} | |
| } | |
| }, | |
| "exit_character": { "type": ["string","null"] }, | |
| "relationship_delta": { "type": "integer" }, | |
| "set_flags": { "type": "object", "additionalProperties": {"type":"string"} }, | |
| "advance_beat": { "type": "boolean" }, | |
| "ending": { | |
| "type": ["object","null"], | |
| "additionalProperties": false, | |
| "required": ["kind","text"], | |
| "properties": { | |
| "kind": {"type":"string","enum":["warm","bittersweet","strange"]}, | |
| "text": {"type":"string"} | |
| } | |
| } | |
| } | |
| } | |
| } | |
| } | |
| ``` | |
| > Numeric *ranges* (e.g. `relationship_delta` ∈ [-100,100]) aren’t enforced by grammar — **clamp them in `state.py`** when applying. Same for unknown `speaker`/`exit_character` ids: validate against present characters and ignore/repair if invalid. | |
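A hedged sketch of that repair step. The function name, the `present_ids` list, and the fallback policy (an unknown speaker becomes the first present spirit) are assumptions for illustration; the [-100, 100] range is the example from the note above.

```python
import json


def clamp(value, low=-100, high=100):
    """Pin value into [low, high]."""
    return max(low, min(high, value))


def repair_turn(raw_json, present_ids):
    """Parse one director reply and fix what the grammar cannot police."""
    turn = json.loads(raw_json)
    directives = turn["directives"]

    # Range check: the grammar guarantees an integer, not its size.
    directives["relationship_delta"] = clamp(directives["relationship_delta"])

    # A spirit introduced this turn may also be the one speaking.
    known = list(present_ids)
    newcomer = directives.get("new_character")
    if newcomer:
        known.append(newcomer["id"])

    # Assumed policy: an unknown speaker falls back to the first known spirit.
    if turn["speaker"] not in known:
        turn["speaker"] = known[0] if known else None

    # Only a spirit already on stage can leave; anything else is ignored.
    if directives.get("exit_character") not in present_ids:
        directives["exit_character"] = None
    return turn
```

Repairing instead of rejecting keeps a turn playable even when a small model invents an id.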
| ### 4.2 Advanced: hand-written GBNF (raw `llama.cpp` / full control) | |
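| A hand-written grammar for the same output, for when you drive `llama.cpp` directly (e.g. `--grammar-file`). It is stricter than the schema in §4.1: every directive key must appear, in the order shown, with `null` (or `{}` for `set_flags`) when unused. Tweak to your build if needed. | |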
| ```gbnf | |
| # Multi-line rule bodies are parenthesized: GBNF allows newlines inside ( ) groups. | |
| root ::= ( | |
|   ws "{" ws | |
|   "\"speaker\"" ws ":" ws string ws "," ws | |
|   "\"dialogue\"" ws ":" ws string ws "," ws | |
|   "\"emotion\"" ws ":" ws emotion ws "," ws | |
|   "\"directives\"" ws ":" ws directives ws | |
|   "}" ws | |
| ) | |
| directives ::= ( | |
|   "{" ws | |
|   "\"scene_change\"" ws ":" ws (scene | "null") ws "," ws | |
|   "\"new_character\"" ws ":" ws (character | "null") ws "," ws | |
|   "\"exit_character\"" ws ":" ws (string | "null") ws "," ws | |
|   "\"relationship_delta\"" ws ":" ws integer ws "," ws | |
|   "\"set_flags\"" ws ":" ws flagobj ws "," ws | |
|   "\"advance_beat\"" ws ":" ws boolean ws "," ws | |
|   "\"ending\"" ws ":" ws (ending | "null") ws | |
|   "}" | |
| ) | |
| scene ::= ( | |
|   "{" ws | |
|   "\"place\"" ws ":" ws string ws "," ws | |
|   "\"description\"" ws ":" ws string ws "," ws | |
|   "\"mood\"" ws ":" ws string ws "}" | |
| ) | |
| character ::= ( | |
|   "{" ws | |
|   "\"id\"" ws ":" ws string ws "," ws | |
|   "\"name\"" ws ":" ws string ws "," ws | |
|   "\"one_line\"" ws ":" ws string ws "," ws | |
|   "\"appearance\"" ws ":" ws string ws "," ws | |
|   "\"voice\"" ws ":" ws string ws "," ws | |
|   "\"traits\"" ws ":" ws strlist ws "," ws | |
|   "\"goals\"" ws ":" ws string ws "}" | |
| ) | |
| ending ::= ( | |
|   "{" ws | |
|   "\"kind\"" ws ":" ws ("\"warm\"" | "\"bittersweet\"" | "\"strange\"") ws "," ws | |
|   "\"text\"" ws ":" ws string ws "}" | |
| ) | |
| emotion ::= string | |
| flagobj ::= "{" ws ( flagpair ( ws "," ws flagpair )* ws )? "}" | |
| flagpair ::= string ws ":" ws string | |
| strlist ::= "[" ws ( string ( ws "," ws string )* ws )? "]" | |
| string ::= "\"" ( [^"\\] | "\\" (["\\/bfnrt] | "u" hex hex hex hex) )* "\"" | |
| integer ::= "-"? ("0" | [1-9] [0-9]*) | |
| boolean ::= "true" | "false" | |
| hex ::= [0-9a-fA-F] | |
| ws ::= [ \t\n]* | |
| ``` | |
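Because this grammar emits keys in one fixed order, a cheap regression check on saved outputs is to compare key sequences. The sketch below is a hypothetical test helper (the function name is an assumption), not part of the runtime.

```python
import json

# Key order exactly as the grammar's root and directives rules spell it out.
TOP_KEYS = ["speaker", "dialogue", "emotion", "directives"]
DIRECTIVE_KEYS = [
    "scene_change", "new_character", "exit_character",
    "relationship_delta", "set_flags", "advance_beat", "ending",
]


def matches_grammar_layout(raw_json: str) -> bool:
    """True if raw_json has every key the grammar emits, in the same order."""
    try:
        obj = json.loads(raw_json)
    except json.JSONDecodeError:
        return False
    # json.loads builds dicts in document order, so list(obj) is the key order.
    if not isinstance(obj, dict) or list(obj) != TOP_KEYS:
        return False
    directives = obj["directives"]
    return isinstance(directives, dict) and list(directives) == DIRECTIVE_KEYS
```

An output that satisfies the JSON Schema but omits optional directives fails this check, which is a quick way to tell which constraint path produced a logged turn.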
| --- | |
| ## 5. Image-prompt composition (the Painter) | |
| Don’t ask the LLM to “write a Stable Diffusion prompt” from scratch — **compose it in code** from fields you already trust, so the art style stays locked: | |
| ```python | |
| def backdrop_prompt(state, scene): | |
|     """Compose the backdrop prompt from the locked style guide and the scene.""" | |
|     return ( | |
|         f"{state.style_guide}, {scene.description}, {scene.mood} atmosphere, " | |
|         "background scenery, no characters, no text" | |
|     ) | |
| def sprite_prompt(state, ch, mood): | |
|     """Compose a sprite prompt; ch.appearance stays fixed across moods.""" | |
|     return ( | |
|         f"{state.style_guide}, full-body character, {ch.appearance}, " | |
|         f"{mood} expression, plain neutral background, no text" | |
|     ) | |
| # negative (SDXL): "text, watermark, lowres, deformed hands, extra limbs, multiple characters" | |
| # pin seed = ch.sprite_seed (sprite) / scene.backdrop_seed (backdrop) for consistency | |
| ``` | |
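The seeds pinned above have to come from somewhere stable. One option, sketched here under assumptions (the `stable_seed` helper and the `world_id` argument are hypothetical), is to derive them from ids with a cryptographic hash. Python's built-in `hash()` is unsuitable because string hashes are salted per process by default.

```python
import hashlib


def stable_seed(world_id: str, entity_id: str, bits: int = 32) -> int:
    """Derive a reproducible seed in [0, 2**bits) from two ids."""
    digest = hashlib.sha256(f"{world_id}:{entity_id}".encode("utf-8")).digest()
    # The first 8 bytes are plenty; reduce to the requested width.
    return int.from_bytes(digest[:8], "big") % (2 ** bits)
```

For example, `ch.sprite_seed = stable_seed(world_id, ch.id)` would give the same sprite seed after a restart, provided the ids themselves are stable.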
| For **mood swaps** with FLUX.2 Klein, condition on the cached base sprite instead of regenerating: | |
| “*the same character, now {mood} — keep identity, outfit, and style identical*.” | |
| --- | |
| ## 6. Sampling guidance | |
| | Call | temp | top_p | notes | | |
| |---|---|---|---| | |
| | Director (dialogue+directives) | 0.7 | 0.9 | grammar-constrained; warm but on-rails. If voices feel flat, raise temp slightly. | | |
| | World init | 0.9 | 0.95 | want surprise and texture here | | |
| | Memory compaction | 0.3 | 0.9 | want faithful, terse summary | | |
| | Image-prompt | — | — | composed in code, no sampling | | |
| Add a light `repeat_penalty` (~1.05–1.1) to fight the “AI slop” repetition small models drift into. Use `max_tokens≈1024` for the director call — actions like `scout` (which must fill a full `new_character` object) need the extra budget. | |
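The table can live in code as a small preset map, so sampling values are versioned alongside the prompts. A sketch: the dictionary keys and the helper name are assumptions, the numbers come from the table and paragraph above, and the keyword names match `llama-cpp-python`'s `create_chat_completion` parameters.

```python
# Per-call sampling presets, taken from the table in this section.
SAMPLING = {
    "director": {"temperature": 0.7, "top_p": 0.9, "max_tokens": 1024},
    "world_init": {"temperature": 0.9, "top_p": 0.95},
    "compact_memory": {"temperature": 0.3, "top_p": 0.9},
}

# Light penalty from the suggested 1.05-1.1 range; tune per model.
REPEAT_PENALTY = 1.05


def sampling_kwargs(call: str) -> dict:
    """Return a fresh kwargs dict for the named call."""
    return {**SAMPLING[call], "repeat_penalty": REPEAT_PENALTY}
```

Usage would look like `llm.create_chat_completion(messages=..., **sampling_kwargs("director"))`. Returning a copy means a caller can adjust temperature for one turn without changing the preset.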