# Thousand Token Wood — Prompts, Schema & Grammar

Copy-paste-ready building blocks for the LLM roles. Keep prompts versioned here (and/or in `visualnovel/prompts.py`); don’t scatter literals across modules. See [`ARCHITECTURE.md`](ARCHITECTURE.md) §2–4 for how they’re used.

> These are **starting points** — iterate against your chosen model. Qwen3 responds well to terse, rule-style system prompts and supports a “thinking” mode you can enable for the Weaver’s planning.

---

## 1. Director system prompt (the per-turn Weaver + Voices call)

```
You are the dreaming mind of Thousand Token Wood — a small, whimsical, slightly forgetful
imagination that conjures a story around a wanderer (the player) as they speak.

You play TWO jobs at once:
  • THE VOICES — you speak AS the wood's spirits (the present characters), each in their own
    voice. Stay fully in character. Never narrate as an author. Never mention being an AI,
    a model, or a game.
  • THE WEAVER — you quietly decide what changes in the world this turn (a new place, a new
    spirit arriving, someone leaving, a shift in feeling, the story drawing to a close).

This turn:
  • Read the world, the present spirits' sheets, the recent exchange, and the player's words.
  • Reply as ONE present spirit (the one addressed, or whoever would naturally answer). Keep it
    to 1–3 sentences. Make it vivid, kind to the player's imagination, and a little dreamlike.
  • Then choose the directives that reflect what just changed. Change as LITTLE as possible —
    only what the moment truly calls for. Most turns change nothing structural.

Rules:
  • Honor the world's tone and the spirits' established traits, voices, and appearance. A spirit
    only knows what it could know.
  • If the player goes somewhere new, set scene_change. If they meet someone new, set
    new_character (give a vivid one_line, a STABLE appearance, a voice, a couple of traits, a
    goal). Keep at most ~3 spirits on stage.
  • Let warmth and friction accumulate: nudge relationship_delta for the speaker.
  • The wood is small and forgetful — if you're unsure of a detail, dream a charming new one
    rather than contradicting an established fact.
  • Output MUST be a single JSON object matching the schema. Nothing else.
```

> The model **need not “write” JSON reliably** — you enforce it with the schema/grammar in §4. The prompt just tells it what each field *means*.
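The user message that accompanies this prompt (called `assembled_context` in §4.1) can be composed in code from the four inputs the prompt names: the world, the present spirits' sheets, the recent exchange, and the player's words. A minimal sketch, assuming a hypothetical layout with plain dicts for the scene and character sheets; the section labels and the function name are illustrative, not taken from `ARCHITECTURE.md`:

```python
def assemble_context(style_guide, scene, present, recent, player_text):
    """Build the user message for the director call (layout is an assumption)."""
    sheets = "\n".join(
        f"- {c['id']} ({c['name']}): {c['one_line']} "
        f"Looks: {c['appearance']}. Voice: {c['voice']}. "
        f"Traits: {', '.join(c['traits'])}. Goal: {c['goals']}"
        for c in present
    )
    # recent is a list of (speaker, text) pairs, oldest first
    exchange = "\n".join(f"{who}: {text}" for who, text in recent)
    return (
        f"WORLD:\n{style_guide}\n\n"
        f"SCENE: {scene['place']} ({scene['mood']})\n{scene['description']}\n\n"
        f"PRESENT SPIRITS:\n{sheets}\n\n"
        f"RECENT EXCHANGE:\n{exchange}\n\n"
        f"WANDERER SAYS:\n{player_text}"
    )
```

Putting the player's words last keeps them closest to the point where generation starts, which tends to help small models stay on topic.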

---

## 2. World-init prompt (`init_world`)

```
Dream the opening of a short interactive tale set in Thousand Token Wood.

Player's chosen vibe: {vibe}        (e.g. cozy folktale / eerie / absurd / melancholy)

Produce:
  • style_guide: 2–3 sentences fixing the WORLD'S LOOK and TONE so everything stays coherent
    (art style, palette, mood, the kind of language spirits use). This will rarely change.
  • opening scene: a place (short name + a vivid one-paragraph description + a mood).
  • first spirit: the first character the wanderer meets — id, name, a delightful one_line, a
    STABLE appearance description (used for every picture of them), a voice, 2–3 traits, a goal.
  • opening line: this spirit's first words to the wanderer (1–3 sentences, fully in voice).

Make it inviting and a little strange. Output a single JSON object matching the init schema.
```

(Use a sibling JSON schema for init — same `Character`/`Scene` shapes as the directive schema, plus `style_guide` and `opening_line`.)
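One way to keep the init schema in step with the directive schema is to define the shared shapes once and reuse them. The sketch below is an assumption about layout: only `style_guide` and `opening_line` are named in this document, so the keys `scene` and `first_spirit` and the helper `strict_object` are hypothetical.

```python
STR = {"type": "string"}

def strict_object(properties):
    """Object schema where every listed property is required and extras are rejected."""
    return {
        "type": "object",
        "additionalProperties": False,
        "required": list(properties),
        "properties": properties,
    }

# Same fields as scene_change / new_character in the directive schema (§4.1).
SCENE = strict_object({"place": STR, "description": STR, "mood": STR})
CHARACTER = strict_object({
    "id": STR, "name": STR, "one_line": STR, "appearance": STR,
    "voice": STR, "traits": {"type": "array", "items": STR}, "goals": STR,
})

# "scene" and "first_spirit" are assumed key names, chosen for illustration.
INIT_SCHEMA = strict_object({
    "style_guide": STR,
    "scene": SCENE,
    "first_spirit": CHARACTER,
    "opening_line": STR,
})
```

`INIT_SCHEMA` can then be passed to the init call the same way `DIRECTIVE_SCHEMA` is passed in §4.1.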

---

## 3. Memory-compaction prompt (`compact_memory`)

```
Here is the running summary of the tale so far, then the most recent exchanges.
Rewrite them into ONE updated summary of 3–6 sentences. Keep only what matters for continuity:
who the wanderer has met, promises made, places visited, shifts in feeling, unresolved threads.
Drop small talk and exact wording. Write it as a calm narrator's memory. Plain prose, no lists.

SUMMARY SO FAR:
{summary}

RECENT EXCHANGES:
{recent_turns}
```
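Filling the two placeholders is a plain `str.format` call. A small sketch, assuming recent turns are stored as `(speaker, text)` pairs and that an empty summary is replaced by a placeholder string; both choices, and the function name, are illustrative:

```python
COMPACT_PROMPT = """\
Here is the running summary of the tale so far, then the most recent exchanges.
Rewrite them into ONE updated summary of 3–6 sentences. Keep only what matters for continuity:
who the wanderer has met, promises made, places visited, shifts in feeling, unresolved threads.
Drop small talk and exact wording. Write it as a calm narrator's memory. Plain prose, no lists.

SUMMARY SO FAR:
{summary}

RECENT EXCHANGES:
{recent_turns}
"""

def build_compaction_prompt(summary, recent_turns):
    """Fill the template; recent_turns is a list of (speaker, text) pairs."""
    turns = "\n".join(f"{who}: {text}" for who, text in recent_turns)
    # "(nothing yet)" is an assumed stand-in for the first compaction
    return COMPACT_PROMPT.format(summary=summary or "(nothing yet)", recent_turns=turns)
```

Note that `str.format` would choke on literal braces in the template, so keep the prompt text free of `{` and `}` other than the two placeholders.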

---

## 4. Constraining the output

### 4.1 Recommended: JSON Schema (let `llama-cpp-python` convert it)

The practical path — pass the schema and let the runtime build the grammar for you:

```python
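import json

# llama-cpp-python: passing "schema" makes the runtime build the grammar
out = llm.create_chat_completion(
    messages=[{"role": "system", "content": DIRECTOR_PROMPT},
              {"role": "user", "content": assembled_context}],
    response_format={"type": "json_object", "schema": DIRECTIVE_SCHEMA},
    temperature=0.7, top_p=0.9,
    max_tokens=1024,  # see §6: a full new_character object needs the room
)
turn = json.loads(out["choices"][0]["message"]["content"])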
```

`DIRECTIVE_SCHEMA`:

```json
{
  "type": "object",
  "additionalProperties": false,
  "required": ["speaker", "dialogue", "emotion", "directives"],
  "properties": {
    "speaker": { "type": "string" },
    "dialogue": { "type": "string" },
    "emotion": { "type": "string" },
    "directives": {
      "type": "object",
      "additionalProperties": false,
      "required": ["relationship_delta","advance_beat"],
      "properties": {
        "scene_change": {
          "type": ["object","null"],
          "additionalProperties": false,
          "required": ["place","description","mood"],
          "properties": {
            "place": {"type":"string"},
            "description": {"type":"string"},
            "mood": {"type":"string"}
          }
        },
        "new_character": {
          "type": ["object","null"],
          "additionalProperties": false,
          "required": ["id","name","one_line","appearance","voice","traits","goals"],
          "properties": {
            "id": {"type":"string"},
            "name": {"type":"string"},
            "one_line": {"type":"string"},
            "appearance": {"type":"string"},
            "voice": {"type":"string"},
            "traits": {"type":"array","items":{"type":"string"}},
            "goals": {"type":"string"}
          }
        },
        "exit_character": { "type": ["string","null"] },
        "relationship_delta": { "type": "integer" },
        "set_flags": { "type": "object", "additionalProperties": {"type":"string"} },
        "advance_beat": { "type": "boolean" },
        "ending": {
          "type": ["object","null"],
          "additionalProperties": false,
          "required": ["kind","text"],
          "properties": {
            "kind": {"type":"string","enum":["warm","bittersweet","strange"]},
            "text": {"type":"string"}
          }
        }
      }
    }
  }
}
```

> Numeric *ranges* (e.g. `relationship_delta` ∈ [-100,100]) aren’t enforced by the grammar here, so **clamp them in `state.py`** when applying directives. Likewise, the grammar cannot know which ids exist: validate `speaker` and `exit_character` against the present characters, and ignore or repair any value that doesn’t match.
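The clamping and id checks described above might look like the following. This is a sketch, not the contents of `state.py`: the function names are hypothetical, and the repair policy (fall back to the first spirit on stage, drop an unknown exit) is one assumption among several reasonable ones.

```python
def clamp(value, lo=-100, hi=100):
    """Pin value into [lo, hi]."""
    return max(lo, min(hi, value))

def sanitize_turn(turn, present_ids):
    """Return a repaired copy of a parsed director turn; the input is not mutated."""
    d = dict(turn["directives"])
    d["relationship_delta"] = clamp(int(d.get("relationship_delta", 0)))

    # An exit for someone who is not on stage is ignored.
    if d.get("exit_character") not in present_ids:
        d["exit_character"] = None

    # A spirit introduced this turn is allowed to speak immediately.
    new = d.get("new_character")
    valid_speakers = list(present_ids) + ([new["id"]] if new else [])
    speaker = turn["speaker"]
    if speaker not in valid_speakers and valid_speakers:
        speaker = valid_speakers[0]  # assumed repair: first spirit on stage

    return {**turn, "speaker": speaker, "directives": d}
```

For example, a turn whose speaker is an unknown id and whose `relationship_delta` is 250 comes back with the first present spirit as speaker and a delta of 100.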

### 4.2 Advanced: hand-written GBNF (raw `llama.cpp` / full control)

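Near-equivalent grammar if you drive `llama.cpp` directly (e.g. `--grammar-file`). It is stricter than the schema in one respect: every directive key is emitted, in a fixed order, with `null` (or `{}` for `set_flags`) when nothing applies. Tweak to your build if needed.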

```gbnf
root        ::= ws "{" ws
                "\"speaker\""  ws ":" ws string ws "," ws
                "\"dialogue\"" ws ":" ws string ws "," ws
                "\"emotion\""  ws ":" ws emotion ws "," ws
                "\"directives\"" ws ":" ws directives ws
                "}" ws

directives  ::= "{" ws
                "\"scene_change\""       ws ":" ws (scene | "null") ws "," ws
                "\"new_character\""      ws ":" ws (character | "null") ws "," ws
                "\"exit_character\""     ws ":" ws (string | "null") ws "," ws
                "\"relationship_delta\"" ws ":" ws integer ws "," ws
                "\"set_flags\""          ws ":" ws flagobj ws "," ws
                "\"advance_beat\""       ws ":" ws boolean ws "," ws
                "\"ending\""             ws ":" ws (ending | "null") ws
                "}"

scene       ::= "{" ws
                "\"place\""       ws ":" ws string ws "," ws
                "\"description\"" ws ":" ws string ws "," ws
                "\"mood\""        ws ":" ws string ws "}"

character   ::= "{" ws
                "\"id\""         ws ":" ws string ws "," ws
                "\"name\""       ws ":" ws string ws "," ws
                "\"one_line\""   ws ":" ws string ws "," ws
                "\"appearance\"" ws ":" ws string ws "," ws
                "\"voice\""      ws ":" ws string ws "," ws
                "\"traits\""     ws ":" ws strlist ws "," ws
                "\"goals\""      ws ":" ws string ws "}"

ending      ::= "{" ws
                "\"kind\"" ws ":" ws ("\"warm\"" | "\"bittersweet\"" | "\"strange\"") ws "," ws
                "\"text\"" ws ":" ws string ws "}"

emotion     ::= string

flagobj     ::= "{" ws ( flagpair ( ws "," ws flagpair )* ws )? "}"
flagpair    ::= string ws ":" ws string
strlist     ::= "[" ws ( string ( ws "," ws string )* ws )? "]"

# raw control characters are excluded: JSON forbids them inside strings
string      ::= "\"" ( [^"\\\x7F\x00-\x1F] | "\\" (["\\/bfnrt] | "u" hex hex hex hex) )* "\""
integer     ::= "-"? ("0" | [1-9] [0-9]*)
boolean     ::= "true" | "false"
hex         ::= [0-9a-fA-F]
ws          ::= [ \t\n]*
```
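Because this grammar always emits all seven directive keys while the schema in §4.1 requires only two, downstream code sees two slightly different shapes depending on which path produced the turn. A small normalizer can hide the difference. This is a sketch; the defaults are assumptions read off the grammar (`null` for absent objects, an empty flag map) plus "no change" values for the two required fields.

```python
import copy

# Key order matches the `directives` rule in the grammar.
DIRECTIVE_DEFAULTS = {
    "scene_change": None,
    "new_character": None,
    "exit_character": None,
    "relationship_delta": 0,
    "set_flags": {},
    "advance_beat": False,
    "ending": None,
}

def normalize_directives(directives):
    """Fill every missing key with its 'nothing changed' default."""
    # deepcopy so callers never share the default set_flags dict
    return {**copy.deepcopy(DIRECTIVE_DEFAULTS), **directives}
```

After normalizing, code that applies directives can index every key directly instead of guarding each access with `.get`.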

---

## 5. Image-prompt composition (the Painter)

Don’t ask the LLM to “write a Stable Diffusion prompt” from scratch — **compose it in code** from fields you already trust, so the art style stays locked:

```python
def backdrop_prompt(state, scene):
    return f"{state.style_guide}, {scene.description}, {scene.mood} atmosphere, " \
           f"background scenery, no characters, no text"

def sprite_prompt(state, ch, mood):
    return f"{state.style_guide}, full-body character, {ch.appearance}, " \
           f"{mood} expression, plain neutral background, no text"
# negative (SDXL): "text, watermark, lowres, deformed hands, extra limbs, multiple characters"
# pin seed = ch.sprite_seed (sprite) / scene.backdrop_seed (backdrop) for consistency
```

For **mood swaps** with FLUX.2 Klein, condition on the cached base sprite instead of regenerating:
“*the same character, now {mood} — keep identity, outfit, and style identical*.”
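The "condition on the cached base sprite" idea can be sketched as a cache keyed by character id and mood. Everything here is hypothetical scaffolding: `make_base` and `edit_image` stand in for whatever image backend is used, treating `"neutral"` as the base mood is an assumption, and the prompt wording is adapted from the line above.

```python
def mood_swap_prompt(mood):
    """Edit instruction for re-mooding an existing sprite."""
    return f"the same character, now {mood}, keep identity, outfit, and style identical"

def get_sprite(cache, ch_id, mood, make_base, edit_image):
    """Return the sprite for (ch_id, mood), generating only what the cache lacks.

    make_base(ch_id) -> image and edit_image(base_image, prompt) -> image are
    caller-supplied callables; their signatures are assumptions of this sketch.
    """
    base_key = (ch_id, "neutral")
    if base_key not in cache:
        cache[base_key] = make_base(ch_id)
    key = (ch_id, mood)
    if key not in cache:
        # Derive the new mood from the base sprite rather than regenerating
        cache[key] = edit_image(cache[base_key], mood_swap_prompt(mood))
    return cache[key]
```

Each character then costs one full generation, and every later mood is an edit of that same image, which is what keeps the identity stable.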

---

## 6. Sampling guidance

| Call | temp | top_p | notes |
|---|---|---|---|
| Director (dialogue+directives) | 0.7 | 0.9 | grammar-constrained; warm but on-rails. If voices feel flat, raise temp slightly. |
| World init | 0.9 | 0.95 | want surprise and texture here |
| Memory compaction | 0.3 | 0.9 | want faithful, terse summary |
| Image-prompt | — | — | composed in code, no sampling |

Add a light `repeat_penalty` (~1.05–1.1) to fight the “AI slop” repetition small models drift into. Budget `max_tokens≈1024` for the director call: turns that must fill a full `new_character` object (actions like `scout`) need the extra room, and a reply truncated mid-object is no longer valid JSON.
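Keeping these settings in one table in code avoids scattering them across call sites. A sketch with assumed names (`SAMPLING`, `sampling_for`); the temperatures and `top_p` values come from the table above, while the specific `repeat_penalty` chosen per call is an assumption within the suggested range.

```python
SAMPLING = {
    "director":   {"temperature": 0.7, "top_p": 0.9,  "repeat_penalty": 1.1,
                   "max_tokens": 1024},
    "world_init": {"temperature": 0.9, "top_p": 0.95, "repeat_penalty": 1.1},
    "compaction": {"temperature": 0.3, "top_p": 0.9,  "repeat_penalty": 1.05},
}

def sampling_for(call, **overrides):
    """Return a fresh kwargs dict for one call type, with optional overrides."""
    return {**SAMPLING[call], **overrides}
```

With `llama-cpp-python` the result can be splatted into the call, e.g. `llm.create_chat_completion(messages=..., **sampling_for("director"))`, and a flat-sounding voice can be nudged with `sampling_for("director", temperature=0.8)`.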