# Learning path: AI Puppet Theater

This guide takes you, as a new contributor, from Python and LLM basics through this repository so you can **trace one user action end-to-end** and **change the engine or UI safely**.

Companion docs: [AGENTS.md](../AGENTS.md) (agent and product rules), [README.md](../README.md) (features and architecture images under `assets/`).

---

## Phase 0: Environment and vocabulary

### Run the app

```bash
uv sync
uv run python app.py
# or, for reload while editing:
uv run gradio app.py
```

Open `http://127.0.0.1:7860`. Create a show, click **Run One Beat**, throw a prop, and open **Behind the Curtain**. The UI surfaces the same concepts that the code names: session, beats, director log, and trace.

### Python building blocks used here

- **Modules and imports** — `app.py` imports a small public API from `puppet_theater` (see [`puppet_theater/__init__.py`](../puppet_theater/__init__.py)).
- **`dataclasses`** — mutable “world state”: [`Actor`](../puppet_theater/models.py), [`Beat`](../puppet_theater/models.py), [`TheaterSession`](../puppet_theater/models.py).
- **Pydantic `BaseModel`** — validated “messages” often produced from LLM text: [`DirectorDecision`](../puppet_theater/models.py), [`ActorResponse`](../puppet_theater/models.py), [`ToolRequest`](../puppet_theater/models.py).
- **Type hints** — e.g. `str | None`, `list[Beat]`.
- **In-place session updates** — most functions take `TheaterSession`, mutate it, and return the same object (not a copy).

### LLM vocabulary in this project

- **Prompt** — text sent to a model; see [`puppet_theater/prompts.py`](../puppet_theater/prompts.py).
- **Structured output** — the app expects JSON-shaped answers, parses them, and validates with Pydantic. Bad JSON or timeouts → **fallback** to deterministic templates so the Space always runs.
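
The parse, validate, and fall back flow can be sketched with the standard library alone. This is a simplified illustration: the repository validates with Pydantic, whereas the shape check below is a hand-written stand-in, and `parse_or_fallback`, `fallback_line`, and the template wording are hypothetical names chosen for this example. Timeout handling is not shown.

```python
import json


def fallback_line(speaker: str) -> dict:
    """Deterministic template used when model output is unusable (hypothetical wording)."""
    return {"speaker": speaker, "line": f"{speaker} bows to the audience.", "source": "fallback"}


def parse_or_fallback(raw_text: str, speaker: str) -> dict:
    """Parse JSON-shaped model text, check its shape, and fall back on any failure."""
    try:
        data = json.loads(raw_text)
    except json.JSONDecodeError:
        return fallback_line(speaker)
    # Stand-in for schema validation: require an object with a non-empty "line".
    if not isinstance(data, dict):
        return fallback_line(speaker)
    line = data.get("line")
    if not isinstance(line, str) or not line.strip():
        return fallback_line(speaker)
    return {"speaker": speaker, "line": line.strip(), "source": "model"}
```

Every failure path returns the same kind of value as the success path, which is what lets the rest of the app continue without special cases.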

---

## Phase 1: The nouns of the show (`models.py`)

Read [`puppet_theater/models.py`](../puppet_theater/models.py) end-to-end.

### TheaterSession vs Beat

| | **`TheaterSession`** | **`Beat`** |
|---|----------------------|------------|
| **Role** | The entire live show: cast, configuration, pacing budget, transcript history, logs, trace. | One row of dialogue / stage moment in the transcript. |
| **Mutability** | Updated on every audience action and every beat (beat counter, actors’ mood, etc.). | Created once per beat and appended to `session.transcript`; treat as immutable after append. |
| **Contains** | `actors`, `beat_index`, `min_beats` / `target_beats` / `max_beats`, `transcript`, `latest_prop`, `director_log`, `trace_events`, `backend_name`, `director_mode`, … | `speaker`, `intent`, `line`, `emotion`, `gesture`, `stage_effect`, optional `memory_update` and `tool_request`. |

**Checkpoint:** Explain aloud: “The session is the notebook; each beat is one line the puppets spoke.”

### Pydantic vs dataclass in this file

- **Pydantic** — strict validation for fields that might come from model JSON (length limits, non-empty strings).
- **Dataclass** — convenient structured bags for runtime state the code controls directly.
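
A minimal sketch of the contrast, assuming Pydantic v2 is installed. `ActorReply`, `PuppetState`, `validate_reply`, and the 200-character limit are hypothetical and only illustrate the idea; the real models in `models.py` have more fields and their own limits.

```python
from dataclasses import dataclass
from typing import Optional

from pydantic import BaseModel, Field, ValidationError


class ActorReply(BaseModel):
    """Hypothetical validated message built from model JSON."""

    line: str = Field(min_length=1, max_length=200)
    emotion: str = "neutral"


@dataclass
class PuppetState:
    """Hypothetical runtime state; a plain dataclass performs no validation."""

    name: str
    mood: str = "calm"


def validate_reply(raw_json: str) -> Optional[ActorReply]:
    """Return a validated reply, or None when the JSON breaks the schema."""
    try:
        return ActorReply.model_validate_json(raw_json)
    except ValidationError:
        return None
```

For example, `validate_reply('{"line": ""}')` returns `None` because the empty string violates `min_length`, while `PuppetState(name="Pip", mood=42)` is accepted silently even though `mood` is annotated as `str`.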

Public exports are listed in [`puppet_theater/__init__.py`](../puppet_theater/__init__.py).

---

## Phase 2: Creating a show and audience actions

1. **[`puppet_theater/session.py`](../puppet_theater/session.py)** — [`create_show_from_premise`](../puppet_theater/session.py) builds `show_title`, `setting`, the default three [`Actor`](../puppet_theater/models.py) instances, beat budget from [`resolve_show_length`](../puppet_theater/session.py), copies backend/director/temperature settings onto the session, seeds `director_log`, and records trace events (`show_created`, `actors_created`, `director_plan_created`).

2. **[`puppet_theater/actions.py`](../puppet_theater/actions.py)** — Audience mutators:
   - [`throw_prop`](../puppet_theater/actions.py) — appends to `session.props`, sets `latest_prop`, updates `latest_audience_action` and `director_log`, traces `audience_action`.
   - [`summon_actor`](../puppet_theater/actions.py) — appends a new `Actor` (cap `MAX_ACTORS`), or logs a skip if full.
   - [`request_finale`](../puppet_theater/actions.py) — sets `finale_requested` so the Director policy can force a finale.

**Checkpoint:** Trace “throw rubber duck” from UI → `throw_prop` → `session.latest_prop` → next Director beat may set `uses_prop=True` (see deterministic policy when `latest_prop` is set in [`DirectorPolicy.decide`](../puppet_theater/director.py)).
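
The prop flow in this checkpoint can be sketched as below. The field names come from the description above, but `MiniSession`, the log wording, and `decide_uses_prop` are simplified assumptions; the real Director policy may weigh other factors before using a prop.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class MiniSession:
    """Hypothetical subset of TheaterSession fields touched by a prop throw."""

    props: list[str] = field(default_factory=list)
    latest_prop: Optional[str] = None
    latest_audience_action: str = ""
    director_log: list[str] = field(default_factory=list)
    trace_events: list[dict] = field(default_factory=list)


def throw_prop(session: MiniSession, prop: str) -> MiniSession:
    """Record the prop on the session so the next Director decision can see it."""
    session.props.append(prop)
    session.latest_prop = prop
    session.latest_audience_action = f"threw {prop}"
    session.director_log.append(f"Audience threw: {prop}")
    session.trace_events.append({"event_type": "audience_action", "prop": prop})
    return session


def decide_uses_prop(session: MiniSession) -> bool:
    """Assumed deterministic rule: use the prop whenever one is pending."""
    return session.latest_prop is not None
```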

---

## Phase 3: One beat — the main pipeline (`director.py`)

Read [`puppet_theater/director.py`](../puppet_theater/director.py) with focus on:

- [`story_progress`](../puppet_theater/director.py) / [`story_phase`](../puppet_theater/director.py) — pacing helpers from `beat_index` and `target_beats`.
- [`choose_director_decision`](../puppet_theater/director.py) — branches on `session.director_mode`: `hf_api` → `HFAPIDirectorPolicy`, `openbmb` → `OpenBMBDirectorPolicy`, else [`DirectorPolicy`](../puppet_theater/director.py) (deterministic).
- [`run_one_beat`](../puppet_theater/director.py) — orchestrates one turn:
  1. Skip if `beat_index >= max_beats`.
  2. Call `choose_director_decision` → [`DirectorDecision`](../puppet_theater/models.py).
  3. Resolve speaker; optionally attach `latest_prop` if decision uses prop.
  4. Append director log lines and `add_trace_event(..., "director_decision", ...)`.
  5. [`generate_actor_response`](../puppet_theater/backends.py) (from `backends`) → [`ActorResponse`](../puppet_theater/models.py).
  6. Increment `beat_index`, build [`Beat`](../puppet_theater/models.py), append to `transcript`.
  7. [`apply_actor_state_update`](../puppet_theater/director.py) — mood, memory, secret status, held props.
  8. [`run_actor_tool_request`](../puppet_theater/tools.py) — optional tool side effects; may adjust `beat.stage_effect`.
  9. Clear `latest_prop` if consumed; more logs and trace (`beat_added`, `actor_response`, …).
  10. If finale: clamp `beat_index` to `max_beats`, set `finale_requested`, trace `scene_completed`.

- [`run_full_act`](../puppet_theater/director.py) — `while session.beat_index < session.max_beats: run_one_beat(session)`.
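
The pacing helpers and the mode branch can be sketched as follows. The signatures, phase names, and thresholds are assumptions made for illustration (the real helpers likely take the session itself); only the branch order for `director_mode` is taken from the description above.

```python
def story_progress(beat_index: int, target_beats: int) -> float:
    """Fraction of the target length already played, clamped to [0.0, 1.0]."""
    if target_beats <= 0:
        return 1.0
    return min(1.0, max(0.0, beat_index / target_beats))


def story_phase(beat_index: int, target_beats: int) -> str:
    """Map progress to a coarse phase; names and thresholds are hypothetical."""
    progress = story_progress(beat_index, target_beats)
    if progress < 0.34:
        return "setup"
    if progress < 0.75:
        return "escalation"
    return "finale"


def pick_policy_name(director_mode: str) -> str:
    """Branch on the mode; anything unrecognized falls through to deterministic."""
    if director_mode == "hf_api":
        return "HFAPIDirectorPolicy"
    if director_mode == "openbmb":
        return "OpenBMBDirectorPolicy"
    return "DirectorPolicy"
```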

```mermaid
flowchart LR
  subgraph beat [run_one_beat]
    D[choose_director_decision]
    A[generate_actor_response]
    T[append Beat to transcript]
    S[apply_actor_state_update]
    U[run_actor_tool_request]
  end
  Session[TheaterSession]
  Session --> D
  D --> A
  A --> T
  T --> S
  S --> U
  U --> Session
```
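
The same pipeline, reduced to a runnable sketch. Everything here is a stand-in: `MiniSession` and `MiniBeat` are hypothetical, a round-robin rule replaces the Director, a string template replaces the actor backend, and the state-update and tool steps are omitted.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class MiniBeat:
    speaker: str
    line: str


@dataclass
class MiniSession:
    actors: list[str]
    max_beats: int = 3
    beat_index: int = 0
    latest_prop: Optional[str] = None
    transcript: list[MiniBeat] = field(default_factory=list)
    trace_events: list[dict] = field(default_factory=list)


def run_one_beat(session: MiniSession) -> MiniSession:
    """One turn: decide, generate, record, then clean up consumed state."""
    if session.beat_index >= session.max_beats:
        return session  # step 1: the beat budget is spent, so skip
    # Steps 2-4: a round-robin "Director" picks the speaker and notes any prop.
    speaker = session.actors[session.beat_index % len(session.actors)]
    prop = session.latest_prop
    session.trace_events.append({"event_type": "director_decision", "speaker": speaker})
    # Step 5: a template stands in for the actor backend.
    line = f"{speaker} juggles the {prop}." if prop else f"{speaker} clears their throat."
    # Step 6: advance the counter and append the beat.
    session.beat_index += 1
    session.transcript.append(MiniBeat(speaker=speaker, line=line))
    # Step 9: the prop has been consumed, so clear it and trace the beat.
    session.latest_prop = None
    session.trace_events.append({"event_type": "beat_added", "beat": session.beat_index})
    return session


def run_full_act(session: MiniSession) -> MiniSession:
    """Repeat single beats until the beat budget is spent."""
    while session.beat_index < session.max_beats:
        run_one_beat(session)
    return session
```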

**Checkpoint:** Without opening `app.py`, narrate one beat from Director decision through transcript line.

---

## Phase 4: Backends — where the LLM lives (`backends.py`)

Read [`puppet_theater/backends.py`](../puppet_theater/backends.py) with this map:

- **`ModelBackend`** — abstract [`generate_actor_response`](../puppet_theater/backends.py) per backend.
- **Implementations** — [`DeterministicBackend`](../puppet_theater/backends.py) (always works), [`OpenBMBTransformersBackend`](../puppet_theater/backends.py) (local Transformers; ZeroGPU hooks in [`puppet_theater/zerogpu.py`](../puppet_theater/zerogpu.py)), [`HFAPIBackend`](../puppet_theater/backends.py) (hosted inference).
- **[`generate_actor_response`](../puppet_theater/backends.py)** (module-level) — picks backend from `session.backend_name`, runs generation, [`parse_actor_output`](../puppet_theater/backends.py), optional repair, then deterministic fallback on failure (mirrors Director policies’ try/validate/fallback pattern).

**Checkpoint:** Why does the demo run without an HF token? Because `backend_name` and `director_mode` default to `deterministic`, which never calls the network.
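
A minimal sketch of backend selection with a deterministic fallback. The class names `ModelBackend` and `DeterministicBackend` appear in the repository, but the method signature, the `BACKENDS` registry, and `FlakyRemoteBackend` are assumptions made for this example; parsing and repair are left out.

```python
from abc import ABC, abstractmethod


class ModelBackend(ABC):
    """Hypothetical minimal interface; the real class may differ."""

    @abstractmethod
    def generate_actor_response(self, speaker: str) -> str:
        ...


class DeterministicBackend(ModelBackend):
    """Never touches the network, so it always works."""

    def generate_actor_response(self, speaker: str) -> str:
        return f"{speaker} delivers a scripted line."


class FlakyRemoteBackend(ModelBackend):
    """Stand-in for a hosted backend that fails, for example on a timeout."""

    def generate_actor_response(self, speaker: str) -> str:
        raise TimeoutError("simulated network timeout")


BACKENDS: dict[str, ModelBackend] = {
    "deterministic": DeterministicBackend(),
    "hf_api": FlakyRemoteBackend(),
}


def generate_actor_response(backend_name: str, speaker: str) -> str:
    """Pick a backend by name; on any failure fall back to the deterministic one."""
    backend = BACKENDS.get(backend_name, BACKENDS["deterministic"])
    try:
        return backend.generate_actor_response(speaker)
    except Exception:
        return BACKENDS["deterministic"].generate_actor_response(speaker)
```

With `backend_name` left at `"deterministic"`, the remote path is never reached, which is why no token is needed for the default demo.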

Optional deep read: [`puppet_theater/prompts.py`](../puppet_theater/prompts.py).

---

## Phase 5: Tools and trace

### Tools ([`puppet_theater/tools.py`](../puppet_theater/tools.py))

- [`ALLOWED_TOOL_NAMES`](../puppet_theater/tools.py) — allowlist: `inspect_prop`, `consult_stage_oracle`, `change_lighting`.
- [`validate_tool_request`](../puppet_theater/tools.py) — Pydantic + argument shape checks.
- [`run_actor_tool_request`](../puppet_theater/tools.py) — traces `tool_requested` / `tool_ignored` / `tool_executed` / `tool_result`, updates `session.latest_tool_result` and `recent_tool_results`.
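
The allowlist check and the trace sequence can be sketched as below. The three tool names and the event names come from the list above; the dict-based request shape, the `run_tool` helper, and the canned result string are assumptions, since the real code validates a Pydantic `ToolRequest`.

```python
from typing import Optional

ALLOWED_TOOL_NAMES = {"inspect_prop", "consult_stage_oracle", "change_lighting"}


def validate_tool_request(request: object) -> bool:
    """Accept only a dict with an allowlisted name and dict-shaped arguments."""
    if not isinstance(request, dict):
        return False
    name = request.get("name")
    if not isinstance(name, str) or name not in ALLOWED_TOOL_NAMES:
        return False
    return isinstance(request.get("arguments", {}), dict)


def run_tool(request: object, trace: list[dict]) -> Optional[str]:
    """Trace every request, then either ignore it or return a canned result."""
    trace.append({"event_type": "tool_requested"})
    if not validate_tool_request(request):
        trace.append({"event_type": "tool_ignored"})
        return None
    trace.append({"event_type": "tool_executed", "name": request["name"]})
    return f"{request['name']} completed"
```

An unknown tool name is ignored and traced rather than raised as an error, so a misbehaving model cannot stop the show.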

### Trace ([`puppet_theater/trace.py`](../puppet_theater/trace.py))

- [`add_trace_event`](../puppet_theater/trace.py) — appends sanitized dicts to `session.trace_events` (no secrets; path stripping in [`sanitize_value`](../puppet_theater/trace.py)). Each event uses the key **`event_type`** (not `type`) for the event name string.
- [`export_trace`](../puppet_theater/trace.py) / [`render_trace_json`](../puppet_theater/trace.py) / [`write_trace_json_file`](../puppet_theater/trace.py) — used by the UI for download and display.

**Checkpoint:** If you emit a new `event_type` through `add_trace_event`, you can find it in the trace JSON and in the Gradio components wired to `render_trace`.
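
A minimal sketch of sanitized trace events. It assumes that path stripping means reducing absolute POSIX paths to their file name and that events are plain dicts; the real `add_trace_event` takes the session rather than a bare list, and its sanitizing rules may be broader.

```python
import posixpath


def sanitize_value(value: object) -> object:
    """Recursively reduce absolute-path strings to their final component."""
    if isinstance(value, str) and value.startswith("/"):
        return posixpath.basename(value)
    if isinstance(value, dict):
        return {key: sanitize_value(item) for key, item in value.items()}
    if isinstance(value, list):
        return [sanitize_value(item) for item in value]
    return value


def add_trace_event(trace_events: list[dict], event_type: str, **details: object) -> dict:
    """Append one sanitized event; the event name is stored under "event_type"."""
    event = {"event_type": event_type, **sanitize_value(details)}
    trace_events.append(event)
    return event
```

Calling `add_trace_event(events, "beat_added", model_path="/home/me/models/weights.bin")` stores `"weights.bin"` for `model_path`, so local directory names stay out of the exported trace.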

---

## Phase 6: Gradio UI wiring (`app.py`)

Do **not** read [`app.py`](../app.py) top to bottom (it is large). Use this map.

### `gr.Blocks` handlers → Python wrappers → `puppet_theater`

| UI control | Handler in `app.py` | Calls into `puppet_theater` |
|------------|---------------------|-----------------------------|
| Create show | [`create_show`](../app.py) (~2607) | [`create_show_from_premise`](../puppet_theater/session.py) |
| Run one beat | [`advance_one_beat`](../app.py) (~2691) | [`run_one_beat`](../puppet_theater/director.py) (after [`apply_backend_selection`](../app.py)) |
| Run full act | [`advance_full_act`](../app.py) (~2720) | Loop / yield calling [`run_one_beat`](../puppet_theater/director.py) |
| Throw prop | [`throw_audience_prop`](../app.py) (~2790) | [`throw_prop`](../puppet_theater/actions.py) |
| Summon actor | [`summon_audience_actor`](../app.py) (~2820) | [`summon_actor`](../puppet_theater/actions.py) |
| Request finale | [`request_audience_finale`](../app.py) (~2850) | [`request_finale`](../puppet_theater/actions.py) |
| Warm up OpenBMB | [`warm_up_backend`](../app.py) (~2879) | [`warm_up_openbmb`](../puppet_theater/backends.py) |
| Reset | [`reset_show`](../app.py) (~2663) | Clears state (no package call) |

Event wiring lives near **`create_button.click`**, **`run_one_button.click`**, etc. (~3119–3290 in `app.py` at time of writing; line numbers may shift).

Shared presentation pipeline: [`render_outputs`](../app.py) (~2590) refreshes stage HTML, transcript, TTS payload, director log, trace, backend panel.
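
The handler shape in the table can be sketched without Gradio. The function names match the table, but the bodies, `MiniSession`, and the three-value return tuple are assumptions; the real `render_outputs` returns more values (stage HTML, TTS payload, trace, backend panel).

```python
from dataclasses import dataclass, field


@dataclass
class MiniSession:
    beat_index: int = 0
    transcript: list[str] = field(default_factory=list)
    director_log: list[str] = field(default_factory=list)


def run_one_beat(session: MiniSession) -> MiniSession:
    """Engine stand-in; the real function lives in puppet_theater/director.py."""
    session.beat_index += 1
    session.transcript.append(f"Beat {session.beat_index}: a puppet speaks.")
    session.director_log.append(f"Director approved beat {session.beat_index}.")
    return session


def render_outputs(session: MiniSession) -> tuple[MiniSession, str, str]:
    """Turn session state into one value per UI output, in a fixed order."""
    return session, "\n".join(session.transcript), "\n".join(session.director_log)


def advance_one_beat(session: MiniSession) -> tuple[MiniSession, str, str]:
    """Handler shape: call one engine function, then re-render everything."""
    return render_outputs(run_one_beat(session))
```

In a Gradio app, a handler like this would be passed as `fn` to a button's `.click()` call, with the session state as the input and one output component per returned value.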

**Checkpoint:** Name the handler that runs when **Run One Beat** is clicked and the single core engine function it invokes.

---

## Phase 7: Verify and contribute

### Commands

```bash
uv run python -m py_compile app.py puppet_theater/*.py
uv run pytest
```

### Tests to read first

- [`tests/test_director.py`](../tests/test_director.py) — Director and beat behavior.
- [`tests/conftest.py`](../tests/conftest.py) — shared fixtures.

### First contribution ideas

- Copyedit a `director_log` string.
- Extend deterministic dialogue or stage effects in [`backends.py`](../puppet_theater/backends.py) / [`director.py`](../puppet_theater/director.py).
- Add a trace field or a test for a prop-heavy beat.
- Document an env var in [README.md](../README.md).

### Optional advanced track

[`finetune/`](../finetune/) — LoRA training and eval scripts; separate from the live Gradio beat loop.

---

## Suggested schedule

| Week | Focus |
|------|--------|
| 1 | Phases 0–2 + UI play |
| 2 | Phase 3 (`run_one_beat`) until you can draw the diagram from memory |
| 3 | Phases 4–5 |
| 4 | Phases 6–7 + a small PR |

---

## External references

- Gradio **Blocks** and `.click()` inputs/outputs.
- Pydantic v2 **validators** (`field_validator`).
- Hugging Face **Spaces** environment variables (see README Configuration).