# Learning path: AI Puppet Theater
This guide walks a new contributor from Python and LLM basics through this repository so you can **trace one user action end-to-end** and **change the engine or UI safely**.
Companion docs: [AGENTS.md](../AGENTS.md) (agent and product rules), [README.md](../README.md) (features and architecture images under `assets/`).
---
## Phase 0: Environment and vocabulary
### Run the app
```bash
uv sync
uv run python app.py
# or, for reload while editing:
uv run gradio app.py
```
Open `http://127.0.0.1:7860`. Create a show, use **Run One Beat**, throw a prop, open **Behind the Curtain**. You will see the same concepts the code names: session, beats, director log, trace.
### Python building blocks used here
- **Modules and imports**: `app.py` imports a small public API from `puppet_theater` (see [`puppet_theater/__init__.py`](../puppet_theater/__init__.py)).
- **`dataclasses`**: mutable “world state” objects such as [`Actor`](../puppet_theater/models.py), [`Beat`](../puppet_theater/models.py), and [`TheaterSession`](../puppet_theater/models.py).
- **Pydantic `BaseModel`**: validated “messages” that are often produced from LLM text, such as [`DirectorDecision`](../puppet_theater/models.py), [`ActorResponse`](../puppet_theater/models.py), and [`ToolRequest`](../puppet_theater/models.py).
- **Type hints**: e.g. `str | None`, `list[Beat]`.
- **In-place session updates**: most functions take a `TheaterSession`, mutate it, and return the same object (not a copy).
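The in-place update pattern from the last bullet can be sketched as follows. `MiniSession` and `note_audience_action` are hypothetical stand-ins invented for this illustration; the real `TheaterSession` has many more fields.

```python
from dataclasses import dataclass, field


@dataclass
class MiniSession:
    """Hypothetical stand-in for TheaterSession with only two fields."""

    beat_index: int = 0
    director_log: list[str] = field(default_factory=list)


def note_audience_action(session: MiniSession, message: str) -> MiniSession:
    """Mutate the session in place and return the very same object."""
    session.director_log.append(message)
    return session


session = MiniSession()
returned = note_audience_action(session, "Audience threw a rubber duck.")

# The caller's reference and the return value are one object, not a copy.
print(returned is session)
print(session.director_log)
```

Because nothing is copied, any code holding a reference to the session sees every change immediately.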
### LLM vocabulary in this project
- **Prompt**: text sent to a model; see [`puppet_theater/prompts.py`](../puppet_theater/prompts.py).
- **Structured output**: the app expects JSON-shaped answers, parses them, and validates them with Pydantic. Bad JSON or a timeout triggers a **fallback** to deterministic templates, so the Space always runs.
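A minimal sketch of the parse, validate, and fall back idea, using only the standard library. The project itself validates with Pydantic; the `ActorLine` class, the `FALLBACK` value, and `parse_actor_line` here are hypothetical and only illustrate the control flow.

```python
import json
from dataclasses import dataclass


@dataclass
class ActorLine:
    """Hypothetical, simplified stand-in for a validated actor message."""

    line: str
    emotion: str


# Deterministic template used whenever the model text is unusable.
FALLBACK = ActorLine(line="The puppet bows silently.", emotion="calm")


def parse_actor_line(raw_text: str) -> ActorLine:
    """Parse JSON text from a model; return FALLBACK if it is unusable."""
    try:
        data = json.loads(raw_text)
        line = data["line"]
        emotion = data["emotion"]
        if not isinstance(line, str) or not line.strip():
            raise ValueError("line must be a non-empty string")
        if not isinstance(emotion, str) or not emotion.strip():
            raise ValueError("emotion must be a non-empty string")
        return ActorLine(line=line.strip(), emotion=emotion.strip())
    except (json.JSONDecodeError, KeyError, TypeError, ValueError):
        # Malformed JSON, missing keys, or wrong types all end up here.
        return FALLBACK


good = parse_actor_line('{"line": "Behold, a duck!", "emotion": "delighted"}')
bad = parse_actor_line("not json at all")
print(good)
print(bad is FALLBACK)
```

The caller never sees an exception, which is what lets a demo keep running when a model returns unusable text.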
---
## Phase 1: The nouns of the show (`models.py`)
Read [`puppet_theater/models.py`](../puppet_theater/models.py) end-to-end.
### TheaterSession vs Beat
| | **`TheaterSession`** | **`Beat`** |
|---|----------------------|------------|
| **Role** | The entire live show: cast, configuration, pacing budget, transcript history, logs, trace. | One row of dialogue / stage moment in the transcript. |
| **Mutability** | Updated on every audience action and every beat (beat counter, actors’ mood, etc.). | Created once per beat and appended to `session.transcript`; treat as immutable after append. |
| **Contains** | `actors`, `beat_index`, `min_beats` / `target_beats` / `max_beats`, `transcript`, `latest_prop`, `director_log`, `trace_events`, `backend_name`, `director_mode`, … | `speaker`, `intent`, `line`, `emotion`, `gesture`, `stage_effect`, optional `memory_update` and `tool_request`. |
**Checkpoint:** Explain aloud: “The session is the notebook; each beat is one line the puppets spoke.”
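The notebook-versus-line split can be sketched with two tiny dataclasses. `MiniBeat` and `MiniSession` are hypothetical reductions of the real models, and the speaker name is made up. Marking the beat `frozen=True` is this sketch's way of enforcing "treat as immutable"; it is an assumption, not a statement about how `Beat` is declared.

```python
from dataclasses import FrozenInstanceError, dataclass, field


@dataclass(frozen=True)
class MiniBeat:
    """One transcript row; frozen here only to illustrate immutability."""

    speaker: str
    line: str
    emotion: str


@dataclass
class MiniSession:
    """The mutable notebook: a counter plus the growing transcript."""

    beat_index: int = 0
    transcript: list[MiniBeat] = field(default_factory=list)


session = MiniSession()
beat = MiniBeat(speaker="Juniper", line="Is that a rubber duck?", emotion="curious")

session.beat_index += 1          # the session changes on every beat
session.transcript.append(beat)  # the beat is created once, then appended

try:
    beat.line = "rewritten"      # editing an appended beat is rejected here
except FrozenInstanceError:
    print("beats are read-only in this sketch")
```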
### Pydantic vs dataclass in this file
- **Pydantic**: strict validation for fields that might come from model JSON (length limits, non-empty strings).
- **Dataclass**: convenient structured bags for runtime state the code controls directly.
Public exports are listed in [`puppet_theater/__init__.py`](../puppet_theater/__init__.py).
---
## Phase 2: Creating a show and audience actions
1. **[`puppet_theater/session.py`](../puppet_theater/session.py)**: [`create_show_from_premise`](../puppet_theater/session.py) builds `show_title`, `setting`, the three default [`Actor`](../puppet_theater/models.py) instances, and the beat budget from [`resolve_show_length`](../puppet_theater/session.py); copies backend/director/temperature settings onto the session; seeds `director_log`; and records trace events (`show_created`, `actors_created`, `director_plan_created`).
2. **[`puppet_theater/actions.py`](../puppet_theater/actions.py)** holds the audience mutators:
   - [`throw_prop`](../puppet_theater/actions.py): appends to `session.props`, sets `latest_prop`, updates `latest_audience_action` and `director_log`, traces `audience_action`.
   - [`summon_actor`](../puppet_theater/actions.py): appends a new `Actor` (capped at `MAX_ACTORS`), or logs a skip if the cast is full.
   - [`request_finale`](../puppet_theater/actions.py): sets `finale_requested` so the Director policy can force a finale.
**Checkpoint:** Trace “throw rubber duck” from UI → `throw_prop` → `session.latest_prop` → next Director beat may set `uses_prop=True` (see deterministic policy when `latest_prop` is set in [`DirectorPolicy.decide`](../puppet_theater/director.py)).
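The checkpoint trace can be rehearsed with a toy version of the flow. Everything below is a hypothetical reduction: `sketch_throw_prop` and `sketch_uses_prop` are not the repository's functions, and the real `throw_prop` signature and log wording may differ.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class MiniSession:
    """Hypothetical stand-in holding only the prop-related fields."""

    props: list[str] = field(default_factory=list)
    latest_prop: Optional[str] = None
    director_log: list[str] = field(default_factory=list)


def sketch_throw_prop(session: MiniSession, prop: str) -> MiniSession:
    """Record the prop and mark it as the latest one (simplified)."""
    session.props.append(prop)
    session.latest_prop = prop
    session.director_log.append(f"Audience threw: {prop}")
    return session


def sketch_uses_prop(session: MiniSession) -> bool:
    """Reduce the deterministic decision to one flag: is a prop pending?"""
    return session.latest_prop is not None


session = MiniSession()
print(sketch_uses_prop(session))   # no prop thrown yet
sketch_throw_prop(session, "rubber duck")
print(sketch_uses_prop(session))   # a prop is now waiting for the next beat
```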
---
## Phase 3: One beat, the main pipeline (`director.py`)
Read [`puppet_theater/director.py`](../puppet_theater/director.py) with focus on:
- [`story_progress`](../puppet_theater/director.py) / [`story_phase`](../puppet_theater/director.py): pacing helpers derived from `beat_index` and `target_beats`.
- [`choose_director_decision`](../puppet_theater/director.py) branches on `session.director_mode`: `hf_api` → `HFAPIDirectorPolicy`, `openbmb` → `OpenBMBDirectorPolicy`, else [`DirectorPolicy`](../puppet_theater/director.py) (deterministic).
- [`run_one_beat`](../puppet_theater/director.py) orchestrates one turn:
  1. Skip if `beat_index >= max_beats`.
  2. Call `choose_director_decision` → [`DirectorDecision`](../puppet_theater/models.py).
  3. Resolve the speaker; optionally attach `latest_prop` if the decision uses a prop.
  4. Append director log lines and `add_trace_event(..., "director_decision", ...)`.
  5. [`generate_actor_response`](../puppet_theater/backends.py) (from `backends`) → [`ActorResponse`](../puppet_theater/models.py).
  6. Increment `beat_index`, build a [`Beat`](../puppet_theater/models.py), append it to `transcript`.
  7. [`apply_actor_state_update`](../puppet_theater/director.py): mood, memory, secret status, held props.
  8. [`run_actor_tool_request`](../puppet_theater/tools.py): optional tool side effects; may adjust `beat.stage_effect`.
  9. Clear `latest_prop` if consumed; more logs and trace (`beat_added`, `actor_response`, …).
  10. If finale: clamp `beat_index` to `max_beats`, set `finale_requested`, trace `scene_completed`.
- [`run_full_act`](../puppet_theater/director.py): `while session.beat_index < session.max_beats: run_one_beat(session)`.
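To make the pacing helpers in the first bullet concrete, here is one way such functions could look. The names carry a `sketch_` prefix because the phase labels, the thresholds, and the clamping behaviour are all assumptions made for illustration; read `director.py` for the real rules.

```python
def sketch_story_progress(beat_index: int, target_beats: int) -> float:
    """Fraction of the target reached, clamped to the range [0.0, 1.0]."""
    if target_beats <= 0:
        return 1.0  # assumption: an empty budget counts as already finished
    return max(0.0, min(1.0, beat_index / target_beats))


def sketch_story_phase(progress: float) -> str:
    """Map progress to a phase label; labels and cut-offs are illustrative."""
    if progress < 0.34:
        return "setup"
    if progress < 0.75:
        return "escalation"
    return "resolution"


for beat_index in (0, 3, 6, 12, 20):
    progress = sketch_story_progress(beat_index, target_beats=12)
    print(beat_index, progress, sketch_story_phase(progress))
```

Clamping keeps the phase stable even if `beat_index` runs past `target_beats` toward `max_beats`.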
```mermaid
flowchart LR
subgraph beat [run_one_beat]
D[choose_director_decision]
A[generate_actor_response]
T[append Beat to transcript]
S[apply_actor_state_update]
U[run_actor_tool_request]
end
Session[TheaterSession]
Session --> D
D --> A
A --> T
T --> S
S --> U
U --> Session
```
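The same flow can be written as a toy loop. This is a deliberately reduced sketch: it covers only the budget check, the decision, the trace event, the actor response, the transcript append, and prop consumption. State updates, tools, and finale handling are left out, and every name with a `mini_` prefix is hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class MiniSession:
    max_beats: int = 3
    beat_index: int = 0
    latest_prop: Optional[str] = None
    transcript: list[dict] = field(default_factory=list)
    trace_events: list[dict] = field(default_factory=list)


def mini_choose_decision(session: MiniSession) -> dict:
    """Stand-in Director: pick a speaker and note whether a prop is pending."""
    return {"speaker": "Narrator", "uses_prop": session.latest_prop is not None}


def mini_generate_response(session: MiniSession, decision: dict) -> dict:
    """Stand-in actor backend: a deterministic template line."""
    prop = session.latest_prop if decision["uses_prop"] else None
    return {"line": f"Look, a {prop}!" if prop else "The show goes on."}


def mini_run_one_beat(session: MiniSession) -> MiniSession:
    if session.beat_index >= session.max_beats:  # budget exhausted: skip
        return session
    decision = mini_choose_decision(session)
    session.trace_events.append({"event_type": "director_decision", **decision})
    response = mini_generate_response(session, decision)
    session.beat_index += 1
    session.transcript.append({"speaker": decision["speaker"], **response})
    if decision["uses_prop"]:                    # the prop has been consumed
        session.latest_prop = None
    return session


def mini_run_full_act(session: MiniSession) -> MiniSession:
    while session.beat_index < session.max_beats:
        mini_run_one_beat(session)
    return session


show = mini_run_full_act(MiniSession(max_beats=3, latest_prop="rubber duck"))
for row in show.transcript:
    print(row["speaker"], "-", row["line"])
```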
**Checkpoint:** Without opening `app.py`, narrate one beat from Director decision through transcript line.
---
## Phase 4: Backends, where the LLM lives (`backends.py`)
Read [`puppet_theater/backends.py`](../puppet_theater/backends.py) with this map:
- **`ModelBackend`**: declares an abstract [`generate_actor_response`](../puppet_theater/backends.py) that each backend implements.
- **Implementations**: [`DeterministicBackend`](../puppet_theater/backends.py) (always works), [`OpenBMBTransformersBackend`](../puppet_theater/backends.py) (local Transformers; ZeroGPU hooks in [`puppet_theater/zerogpu.py`](../puppet_theater/zerogpu.py)), [`HFAPIBackend`](../puppet_theater/backends.py) (hosted inference).
- **[`generate_actor_response`](../puppet_theater/backends.py)** (module-level): picks the backend from `session.backend_name`, runs generation, calls [`parse_actor_output`](../puppet_theater/backends.py), optionally repairs the output, then uses the deterministic fallback on failure (mirroring the Director policies’ try/validate/fallback pattern).
**Checkpoint:** Why does the demo run without an HF token? Because `backend_name` and `director_mode` default to `deterministic`, which never calls the network.
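The select-then-fall-back shape of the module-level function can be sketched with an abstract base class. The class names, the `BACKENDS` registry, and the `"hf_api"` key as a `backend_name` value are assumptions for this example; the remote backend here simply raises to simulate a failure.

```python
from abc import ABC, abstractmethod


class MiniBackend(ABC):
    """Hypothetical miniature of an abstract backend interface."""

    @abstractmethod
    def generate(self, speaker: str) -> str:
        ...


class DeterministicMini(MiniBackend):
    def generate(self, speaker: str) -> str:
        return f"{speaker} waves at the audience."


class FlakyRemoteMini(MiniBackend):
    """Pretend hosted backend that always fails, to exercise the fallback."""

    def generate(self, speaker: str) -> str:
        raise TimeoutError("simulated network timeout")


BACKENDS: dict[str, MiniBackend] = {
    "deterministic": DeterministicMini(),
    "hf_api": FlakyRemoteMini(),
}


def generate_line(backend_name: str, speaker: str) -> str:
    """Use the named backend; on any failure, use the deterministic one."""
    backend = BACKENDS.get(backend_name, BACKENDS["deterministic"])
    try:
        return backend.generate(speaker)
    except Exception:
        return BACKENDS["deterministic"].generate(speaker)


print(generate_line("deterministic", "Mop"))
print(generate_line("hf_api", "Mop"))
```

Both calls produce a line, which mirrors why the show keeps going when a hosted model is unavailable.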
Optional deep read: [`puppet_theater/prompts.py`](../puppet_theater/prompts.py).
---
## Phase 5: Tools and trace
### Tools ([`puppet_theater/tools.py`](../puppet_theater/tools.py))
- [`ALLOWED_TOOL_NAMES`](../puppet_theater/tools.py): allowlist of `inspect_prop`, `consult_stage_oracle`, `change_lighting`.
- [`validate_tool_request`](../puppet_theater/tools.py): Pydantic + argument shape checks.
- [`run_actor_tool_request`](../puppet_theater/tools.py): traces `tool_requested` / `tool_ignored` / `tool_executed` / `tool_result`, updates `session.latest_tool_result` and `recent_tool_results`.
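An allowlist check of this kind can be sketched in a few lines. The three tool names come from the list above; the request shape (a dict with `name` and `arguments` keys), the function name `check_tool_request`, and the reason strings are assumptions, and the real validation goes through Pydantic.

```python
ALLOWED_TOOL_NAMES = frozenset(
    {"inspect_prop", "consult_stage_oracle", "change_lighting"}
)


def check_tool_request(request: dict) -> tuple[bool, str]:
    """Return (accepted, reason) for a tool request in an assumed dict shape."""
    name = request.get("name")
    arguments = request.get("arguments")
    if not isinstance(name, str) or name not in ALLOWED_TOOL_NAMES:
        return False, "tool not in allowlist"
    if not isinstance(arguments, dict):
        return False, "arguments must be a mapping"
    return True, "ok"


print(check_tool_request({"name": "change_lighting", "arguments": {"color": "blue"}}))
print(check_tool_request({"name": "delete_files", "arguments": {}}))
```

Rejecting by default means a model that invents a tool name cannot trigger any side effect.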
### Trace ([`puppet_theater/trace.py`](../puppet_theater/trace.py))
- [`add_trace_event`](../puppet_theater/trace.py): appends sanitized dicts to `session.trace_events` (no secrets; path stripping in [`sanitize_value`](../puppet_theater/trace.py)). Each event uses the key **`event_type`** (not `type`) for the event name string.
- [`export_trace`](../puppet_theater/trace.py) / [`render_trace_json`](../puppet_theater/trace.py) / [`write_trace_json_file`](../puppet_theater/trace.py): used by the UI for download and display.
**Checkpoint:** If you emit a new `event_type` through `add_trace_event`, you should be able to find it in the trace JSON and in the Gradio components wired to `render_trace`.
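A small sketch of a sanitizing trace helper, to show where a new `event_type` would end up. The `sketch_` functions, the `SENSITIVE_KEYS` set, and the rule that strings starting with `/` are treated as paths are all assumptions; the real `sanitize_value` may apply different rules.

```python
import posixpath

# Hypothetical list of keys that must never reach the trace.
SENSITIVE_KEYS = {"token", "api_key", "password"}


def sketch_sanitize(value):
    """Drop secret-looking keys and reduce absolute paths to file names."""
    if isinstance(value, dict):
        return {
            key: sketch_sanitize(item)
            for key, item in value.items()
            if key not in SENSITIVE_KEYS
        }
    if isinstance(value, list):
        return [sketch_sanitize(item) for item in value]
    if isinstance(value, str) and value.startswith("/"):
        return posixpath.basename(value)
    return value


def sketch_add_trace_event(trace_events: list, event_type: str, **payload) -> dict:
    """Append one sanitized event; the name lives under 'event_type'."""
    event = {"event_type": event_type, **sketch_sanitize(payload)}
    trace_events.append(event)
    return event


events: list = []
sketch_add_trace_event(
    events,
    "audience_action",
    prop="rubber duck",
    token="hf_not_a_real_token",
    path="/home/user/app/trace.json",
)
print(events)
```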
---
## Phase 6: Gradio UI wiring (`app.py`)
Do **not** read [`app.py`](../app.py) top to bottom (it is large). Use this map.
### `gr.Blocks` handlers → Python wrappers → `puppet_theater`
| UI control | Handler in `app.py` | Calls into `puppet_theater` |
|------------|---------------------|-----------------------------|
| Create show | [`create_show`](../app.py) (~2607) | [`create_show_from_premise`](../puppet_theater/session.py) |
| Run one beat | [`advance_one_beat`](../app.py) (~2691) | [`run_one_beat`](../puppet_theater/director.py) (after [`apply_backend_selection`](../app.py)) |
| Run full act | [`advance_full_act`](../app.py) (~2720) | Loop / yield calling [`run_one_beat`](../puppet_theater/director.py) |
| Throw prop | [`throw_audience_prop`](../app.py) (~2790) | [`throw_prop`](../puppet_theater/actions.py) |
| Summon actor | [`summon_audience_actor`](../app.py) (~2820) | [`summon_actor`](../puppet_theater/actions.py) |
| Request finale | [`request_audience_finale`](../app.py) (~2850) | [`request_finale`](../puppet_theater/actions.py) |
| Warm up OpenBMB | [`warm_up_backend`](../app.py) (~2879) | [`warm_up_openbmb`](../puppet_theater/backends.py) |
| Reset | [`reset_show`](../app.py) (~2663) | Clears state (no package call) |
Event wiring lives near **`create_button.click`**, **`run_one_button.click`**, etc. (~3119–3290 in `app.py` at time of writing; line numbers may shift).
Shared presentation pipeline: [`render_outputs`](../app.py) (~2590) refreshes stage HTML, transcript, TTS payload, director log, trace, backend panel.
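The handler pattern in the table (call one engine function, then re-render) can be sketched without Gradio. All three functions below are hypothetical stand-ins with made-up bodies and a plain dict for state; they show the shape only, not the code in `app.py`.

```python
def fake_run_one_beat(session: dict) -> dict:
    """Stand-in for the engine call; mutates and returns the same dict."""
    session["beat_index"] += 1
    session["transcript"].append(f"Beat {session['beat_index']}")
    return session


def fake_render_outputs(session: dict) -> tuple[str, str]:
    """Stand-in presentation step: turn state into display strings."""
    transcript_text = "\n".join(session["transcript"])
    status = f"beat {session['beat_index']}"
    return transcript_text, status


def advance_one_beat_sketch(session: dict) -> tuple[dict, str, str]:
    """Handler shape: one engine call, then one shared render step."""
    session = fake_run_one_beat(session)
    return (session, *fake_render_outputs(session))


state, transcript_text, status = advance_one_beat_sketch(
    {"beat_index": 0, "transcript": []}
)
print(transcript_text)
print(status)
```

A function of this shape is what a Gradio button's `.click()` expects: its return values are matched, in order, to the components listed in `outputs`.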
**Checkpoint:** Name the handler that runs when **Run One Beat** is clicked and the single core engine function it invokes.
---
## Phase 7: Verify and contribute
### Commands
```bash
uv run python -m py_compile app.py puppet_theater/*.py
uv run pytest
```
### Tests to read first
- [`tests/test_director.py`](../tests/test_director.py): Director and beat behavior.
- [`tests/conftest.py`](../tests/conftest.py): shared fixtures.
### First contribution ideas
- Copyedit a `director_log` string.
- Extend deterministic dialogue or stage effects in [`backends.py`](../puppet_theater/backends.py) / [`director.py`](../puppet_theater/director.py).
- Add a trace field or a test for a prop-heavy beat.
- Document an env var in [README.md](../README.md).
### Optional advanced track
[`finetune/`](../finetune/): LoRA training and eval scripts; separate from the live Gradio beat loop.
---
## Suggested schedule
| Week | Focus |
|------|--------|
| 1 | Phases 0–2 + UI play |
| 2 | Phase 3 (`run_one_beat`) until you can draw the diagram from memory |
| 3 | Phases 4–5 |
| 4 | Phases 6–7 + a small PR |
---
## External references
- Gradio **Blocks** and `.click()` inputs/outputs.
- Pydantic v2 **validators** (`field_validator`).
- Hugging Face **Spaces** environment variables (see README Configuration).