Learning path: AI Puppet Theater
This guide walks a new contributor from Python and LLM basics through this repository so you can trace one user action end-to-end and change the engine or UI safely.
Companion docs: AGENTS.md (agent and product rules), README.md (features and architecture images under assets/).
Phase 0: Environment and vocabulary
Run the app
uv sync
uv run python app.py
# or, for reload while editing:
uv run gradio app.py
Open http://127.0.0.1:7860. Create a show, use Run One Beat, throw a prop, open Behind the Curtain. You will see the same concepts the code names: session, beats, director log, trace.
Python building blocks used here
- Modules and imports: app.py imports a small public API from puppet_theater (see puppet_theater/__init__.py).
- dataclasses: mutable "world state": Actor, Beat, TheaterSession.
- Pydantic BaseModel: validated "messages" often produced from LLM text: DirectorDecision, ActorResponse, ToolRequest.
- Type hints: e.g. str | None, list[Beat].
- In-place session updates: most functions take TheaterSession, mutate it, and return the same object (not a copy).
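The in-place convention from the last bullet can be sketched with a stripped-down dataclass. MiniSession and advance are hypothetical stand-ins, not the real TheaterSession API; the only point is that the function hands back the very object it was given.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class MiniSession:
    """Hypothetical stand-in for TheaterSession with two fields only."""
    beat_index: int = 0
    director_log: list[str] = field(default_factory=list)


def advance(session: MiniSession, note: str | None = None) -> MiniSession:
    """Mutate the session in place and return the same object."""
    session.beat_index += 1
    if note is not None:
        session.director_log.append(note)
    return session


session = MiniSession()
returned = advance(session, note="curtain up")
print(returned is session)  # identity check: same object, not a copy
print(session.beat_index)
```

Because nothing is copied, any code that still holds the old reference sees the update too, which is worth remembering when you store a session in UI state.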
LLM vocabulary in this project
- Prompt: text sent to a model; see puppet_theater/prompts.py.
- Structured output: the app expects JSON-shaped answers, parses them, and validates with Pydantic. Bad JSON or timeouts → fallback to deterministic templates so the Space always runs.
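The parse, validate, fall back flow might look roughly like this. It is a stdlib-only sketch: a hand-written shape check stands in for the Pydantic model, and parse_or_fallback, FALLBACK_RESPONSE, and the two field names are assumptions for illustration, not the project's code.

```python
from __future__ import annotations

import json

# Hypothetical deterministic template used when the model output is unusable.
FALLBACK_RESPONSE = {"line": "The puppet bows silently.", "emotion": "calm"}


def parse_or_fallback(raw_text: str) -> dict[str, str]:
    """Parse model text as JSON, check its shape, or return the fallback."""
    try:
        data = json.loads(raw_text)
    except json.JSONDecodeError:
        return dict(FALLBACK_RESPONSE)
    if not isinstance(data, dict):
        return dict(FALLBACK_RESPONSE)
    line = data.get("line")
    emotion = data.get("emotion")
    # Stand-in for Pydantic validation: both fields must be non-empty strings.
    if not (isinstance(line, str) and line.strip()):
        return dict(FALLBACK_RESPONSE)
    if not (isinstance(emotion, str) and emotion.strip()):
        return dict(FALLBACK_RESPONSE)
    return {"line": line.strip(), "emotion": emotion.strip()}


good = parse_or_fallback('{"line": "Behold, a duck!", "emotion": "delighted"}')
bad = parse_or_fallback("Sure! Here is your JSON: {oops")
print(good)
print(bad)
```

The caller never sees an exception; it always receives something shaped like a valid response, which is what keeps the show running when a model misbehaves.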
Phase 1: The nouns of the show (models.py)
Read puppet_theater/models.py end-to-end.
TheaterSession vs Beat
| | TheaterSession | Beat |
|---|---|---|
| Role | The entire live show: cast, configuration, pacing budget, transcript history, logs, trace. | One row of dialogue / stage moment in the transcript. |
| Mutability | Updated every audience action and every beat (beat counter, actors' mood, etc.). | Created once per beat and appended to session.transcript; treat as immutable after append. |
| Contains | actors, beat_index, min_beats / target_beats / max_beats, transcript, latest_prop, director_log, trace_events, backend_name, director_mode, … | speaker, intent, line, emotion, gesture, stage_effect, optional memory_update and tool_request. |
Checkpoint: Explain aloud: "The session is the notebook; each beat is one line the puppets spoke."
Pydantic vs dataclass in this file
- Pydantic: strict validation for fields that might come from model JSON (length limits, non-empty strings).
- Dataclass: convenient structured bags for runtime state the code controls directly.
Public exports are listed in puppet_theater/__init__.py.
Phase 2: Creating a show and audience actions
- puppet_theater/session.py: create_show_from_premise builds show_title, setting, the default three Actor instances, and the beat budget from resolve_show_length; copies backend/director/temperature settings onto the session; seeds director_log; and records trace events (show_created, actors_created, director_plan_created).
- puppet_theater/actions.py holds the audience mutators:
  - throw_prop: appends to session.props, sets latest_prop, updates latest_audience_action and director_log, traces audience_action.
  - summon_actor: appends a new Actor (cap MAX_ACTORS), or logs a skip if full.
  - request_finale: sets finale_requested so the Director policy can force a finale.
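A rough sketch of the throw_prop mutations listed above. The field names follow the bullets; the SketchSession dataclass, the log wording, and the trace dict layout are assumptions and will differ from puppet_theater/actions.py.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class SketchSession:
    """Hypothetical subset of TheaterSession fields touched by throw_prop."""
    props: list[str] = field(default_factory=list)
    latest_prop: str | None = None
    latest_audience_action: str | None = None
    director_log: list[str] = field(default_factory=list)
    trace_events: list[dict] = field(default_factory=list)


def throw_prop(session: SketchSession, prop: str) -> SketchSession:
    """Record a thrown prop on the session and return the same session."""
    session.props.append(prop)
    session.latest_prop = prop
    session.latest_audience_action = f"threw {prop}"  # wording is assumed
    session.director_log.append(f"Audience threw a prop: {prop}")
    session.trace_events.append({"event_type": "audience_action", "prop": prop})
    return session


session = throw_prop(SketchSession(), "rubber duck")
print(session.latest_prop)
```

Note that the action itself produces no dialogue; it only leaves latest_prop behind for the next Director decision to pick up.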
Checkpoint: Trace "throw rubber duck" from UI → throw_prop → session.latest_prop → next Director beat may set uses_prop=True (see deterministic policy when latest_prop is set in DirectorPolicy.decide).
Phase 3: One beat, the main pipeline (director.py)
Read puppet_theater/director.py with focus on:
- story_progress / story_phase: pacing helpers from beat_index and target_beats.
- choose_director_decision: branches on session.director_mode (hf_api → HFAPIDirectorPolicy, openbmb → OpenBMBDirectorPolicy, else the deterministic DirectorPolicy).
- run_one_beat: orchestrates one turn:
  1. Skip if beat_index >= max_beats.
  2. Call choose_director_decision → DirectorDecision.
  3. Resolve speaker; optionally attach latest_prop if decision uses prop.
  4. Append director log lines and add_trace_event(..., "director_decision", ...).
  5. generate_actor_response (from backends) → ActorResponse.
  6. Increment beat_index, build Beat, append to transcript.
  7. apply_actor_state_update: mood, memory, secret status, held props.
  8. run_actor_tool_request: optional tool side effects; may adjust beat.stage_effect.
  9. Clear latest_prop if consumed; more logs and trace (beat_added, actor_response, …).
  10. If finale: clamp beat_index to max_beats, set finale_requested, trace scene_completed.
- run_full_act: while session.beat_index < session.max_beats: run_one_beat(session).
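The beat steps above can be compressed into a small runnable sketch. Everything here is a simplified assumption: the Session and Beat dataclasses carry only a few fields, the Director and backend are stubs, and the state-update and tool steps are left out. Read it for the ordering, not the details.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Beat:
    speaker: str
    line: str


@dataclass
class Session:
    """Hypothetical subset of TheaterSession used by this sketch."""
    max_beats: int = 3
    beat_index: int = 0
    latest_prop: str | None = None
    finale_requested: bool = False
    transcript: list[Beat] = field(default_factory=list)
    trace_events: list[dict] = field(default_factory=list)


def choose_director_decision(session: Session) -> dict:
    """Stub policy: use a waiting prop, call the finale on the last beat."""
    is_last = session.beat_index + 1 >= session.max_beats
    return {
        "speaker": "Narrator",
        "uses_prop": session.latest_prop is not None,
        "finale": is_last or session.finale_requested,
    }


def generate_actor_response(session: Session, decision: dict) -> str:
    """Stub backend: a deterministic line, mentioning the prop when used."""
    if decision["uses_prop"]:
        return f"{decision['speaker']} eyes the {session.latest_prop}."
    return f"{decision['speaker']} clears their throat."


def run_one_beat(session: Session) -> Session:
    if session.beat_index >= session.max_beats:
        return session  # beat budget spent: skip
    decision = choose_director_decision(session)
    session.trace_events.append({"event_type": "director_decision"})
    line = generate_actor_response(session, decision)
    session.beat_index += 1
    session.transcript.append(Beat(speaker=decision["speaker"], line=line))
    session.trace_events.append({"event_type": "beat_added"})
    if decision["uses_prop"]:
        session.latest_prop = None  # prop consumed by this beat
    if decision["finale"]:
        session.beat_index = session.max_beats  # clamp so the loop stops
        session.finale_requested = True
        session.trace_events.append({"event_type": "scene_completed"})
    return session


def run_full_act(session: Session) -> Session:
    while session.beat_index < session.max_beats:
        run_one_beat(session)
    return session


show = run_full_act(Session(latest_prop="rubber duck"))
for beat in show.transcript:
    print(f"{beat.speaker}: {beat.line}")
```

Clamping beat_index on a finale is what lets run_full_act use a plain while loop with no extra exit flag.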
flowchart LR
subgraph beat [run_one_beat]
D[choose_director_decision]
A[generate_actor_response]
T[append Beat to transcript]
S[apply_actor_state_update]
U[run_actor_tool_request]
end
Session[TheaterSession]
Session --> D
D --> A
A --> T
T --> S
S --> U
U --> Session
Checkpoint: Without opening app.py, narrate one beat from Director decision through transcript line.
Phase 4: Backends, where the LLM lives (backends.py)
Read puppet_theater/backends.py with this map:
- ModelBackend: abstract generate_actor_response per backend.
- Implementations: DeterministicBackend (always works), OpenBMBTransformersBackend (local Transformers; ZeroGPU hooks in puppet_theater/zerogpu.py), HFAPIBackend (hosted inference).
- generate_actor_response (module-level): picks backend from session.backend_name, runs generation, parse_actor_output, optional repair, then deterministic fallback on failure (mirrors Director policies' try/validate/fallback pattern).
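The select-then-fall-back shape of the module-level function can be sketched as follows. The signatures are simplified assumptions (the real functions work on a session, and parsing and repair are omitted), and FlakyRemoteBackend is a hypothetical stand-in for a hosted backend that fails.

```python
from __future__ import annotations

from abc import ABC, abstractmethod


class ModelBackend(ABC):
    @abstractmethod
    def generate_actor_response(self, speaker: str) -> str:
        """Return one line of dialogue for the speaker."""


class DeterministicBackend(ModelBackend):
    def generate_actor_response(self, speaker: str) -> str:
        return f"{speaker} waves at the audience."


class FlakyRemoteBackend(ModelBackend):
    """Hypothetical backend whose endpoint never answers."""

    def generate_actor_response(self, speaker: str) -> str:
        raise TimeoutError("inference endpoint did not answer")


# Registry keyed by backend name; the key names are assumptions.
BACKENDS: dict[str, ModelBackend] = {
    "deterministic": DeterministicBackend(),
    "flaky_remote": FlakyRemoteBackend(),
}


def generate_actor_response(backend_name: str, speaker: str) -> str:
    """Pick a backend by name; fall back to deterministic on any failure."""
    fallback = BACKENDS["deterministic"]
    backend = BACKENDS.get(backend_name, fallback)
    try:
        line = backend.generate_actor_response(speaker)
        if not line.strip():
            raise ValueError("backend returned an empty line")
        return line
    except Exception:
        return fallback.generate_actor_response(speaker)


print(generate_actor_response("flaky_remote", "Pip"))
```

Catching broadly is deliberate in this sketch: for a live demo, a bland line is preferable to a stack trace, though you would normally log the failure as well.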
Checkpoint: Why does the demo run without an HF token? Because backend_name and director_mode default to deterministic, which never calls the network.
Optional deep read: puppet_theater/prompts.py.
Phase 5: Tools and trace
Tools (puppet_theater/tools.py)
- ALLOWED_TOOL_NAMES: the allowlist (inspect_prop, consult_stage_oracle, change_lighting).
- validate_tool_request: Pydantic + argument shape checks.
- run_actor_tool_request: traces tool_requested / tool_ignored / tool_executed / tool_result, updates session.latest_tool_result and recent_tool_results.
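A minimal sketch of the allowlist idea, using plain Python checks in place of the Pydantic model. The three tool names come from the list above; the request keys ("name", "arguments"), the trace_for helper, and the order of trace events are assumptions.

```python
from __future__ import annotations

ALLOWED_TOOL_NAMES = {"inspect_prop", "consult_stage_oracle", "change_lighting"}


def validate_tool_request(request: object) -> dict | None:
    """Return a cleaned request, or None if it should be ignored."""
    if not isinstance(request, dict):
        return None
    name = request.get("name")
    arguments = request.get("arguments", {})
    if not isinstance(name, str) or name not in ALLOWED_TOOL_NAMES:
        return None
    if not isinstance(arguments, dict):
        return None
    return {"name": name, "arguments": arguments}


def trace_for(request: object) -> list[str]:
    """List the trace event names this sketch would emit for a request."""
    if validate_tool_request(request) is None:
        return ["tool_requested", "tool_ignored"]
    return ["tool_requested", "tool_executed", "tool_result"]


print(trace_for({"name": "change_lighting", "arguments": {"color": "blue"}}))
print(trace_for({"name": "delete_files"}))
```

An allowlist fails closed: a tool name the model invents is ignored rather than executed, so adding a tool is always an explicit code change.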
Trace (puppet_theater/trace.py)
- add_trace_event: appends sanitized dicts to session.trace_events (no secrets; path stripping in sanitize_value). Each event uses the key event_type (not type) for the event name string.
- export_trace / render_trace_json / write_trace_json_file: used by the UI for download and display.
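To make the sanitizing idea concrete, here is one way it could work. This is a guess at the behavior, not the code in puppet_theater/trace.py: the secret-key heuristics, the choice to keep only a path's file name, and the list-based signature of add_trace_event are all assumptions.

```python
from __future__ import annotations

import json
from pathlib import PurePosixPath

SECRET_KEY_HINTS = ("token", "secret", "api_key")  # assumed heuristics


def sanitize_value(value: object) -> object:
    """Reduce absolute POSIX paths to a file name; recurse into containers."""
    if isinstance(value, str) and PurePosixPath(value).is_absolute():
        return PurePosixPath(value).name
    if isinstance(value, dict):
        return {
            key: sanitize_value(item)
            for key, item in value.items()
            if not any(hint in str(key).lower() for hint in SECRET_KEY_HINTS)
        }
    if isinstance(value, (list, tuple)):
        return [sanitize_value(item) for item in value]
    return value


def add_trace_event(trace_events: list[dict], event_type: str, **details: object) -> dict:
    """Append one sanitized event; the name lives under "event_type"."""
    event = {"event_type": event_type, **sanitize_value(details)}
    trace_events.append(event)
    return event


events: list[dict] = []
add_trace_event(
    events,
    "beat_added",
    speaker="Pip",
    hf_token="hf_xxx",
    audio_path="/tmp/cache/line_01.wav",
)
print(json.dumps(events, indent=2))
```

Sanitizing at write time means every consumer of the trace, whether the download button or the on-page viewer, gets the same cleaned data.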
Checkpoint: If you add a new event_type in add_trace_event, you can find it in the trace JSON and in Gradio components wired to render_trace.
Phase 6: Gradio UI wiring (app.py)
Do not read app.py top to bottom (it is large). Use this map.
gr.Blocks handlers → Python wrappers → puppet_theater
| UI control | Handler in app.py | Calls into puppet_theater |
|---|---|---|
| Create show | create_show (~2607) | create_show_from_premise |
| Run one beat | advance_one_beat (~2691) | run_one_beat (after apply_backend_selection) |
| Run full act | advance_full_act (~2720) | Loop / yield calling run_one_beat |
| Throw prop | throw_audience_prop (~2790) | throw_prop |
| Summon actor | summon_audience_actor (~2820) | summon_actor |
| Request finale | request_audience_finale (~2850) | request_finale |
| Warm up OpenBMB | warm_up_backend (~2879) | warm_up_openbmb |
| Reset | reset_show (~2663) | Clears state (no package call) |
Event wiring lives near create_button.click, run_one_button.click, etc. (~3119–3290 in app.py at time of writing; line numbers may shift).
Shared presentation pipeline: render_outputs (~2590) refreshes stage HTML, transcript, TTS payload, director log, trace, backend panel.
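The handler shape (call the engine, then re-render every output) can be sketched without Gradio. The function names match the table, but their bodies, the three-item output tuple, and the tiny Session class are assumptions; in app.py such functions would be passed to .click() with a matching outputs list, which is omitted here so the sketch runs on its own.

```python
from __future__ import annotations

from collections.abc import Iterator
from dataclasses import dataclass, field


@dataclass
class Session:
    """Hypothetical minimal session; the real one is TheaterSession."""
    max_beats: int = 2
    beat_index: int = 0
    transcript: list[str] = field(default_factory=list)


def run_one_beat(session: Session) -> Session:
    """Stand-in for the engine call; adds one line per beat."""
    if session.beat_index < session.max_beats:
        session.beat_index += 1
        session.transcript.append(f"Beat {session.beat_index}")
    return session


def render_outputs(session: Session) -> tuple[Session, str, str]:
    """Build one value per UI output: state, transcript text, status line."""
    status = f"{session.beat_index}/{session.max_beats} beats"
    return session, "\n".join(session.transcript), status


def advance_one_beat(session: Session) -> tuple[Session, str, str]:
    """Handler shape: call the engine once, then re-render everything."""
    return render_outputs(run_one_beat(session))


def advance_full_act(session: Session) -> Iterator[tuple[Session, str, str]]:
    """Generator handler: yield after each beat so the UI can update live."""
    while session.beat_index < session.max_beats:
        yield render_outputs(run_one_beat(session))


frames = list(advance_full_act(Session()))
for _, transcript_text, status in frames:
    print(status, "|", transcript_text.replace("\n", " / "))
```

Routing every handler through one render function keeps the outputs in a single agreed order, so adding a panel means touching one function instead of eight.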
Checkpoint: Name the handler that runs when Run One Beat is clicked and the single core engine function it invokes.
Phase 7: Verify and contribute
Commands
uv run python -m py_compile app.py puppet_theater/*.py
uv run pytest
Tests to read first
- tests/test_director.py: Director and beat behavior.
- tests/conftest.py: shared fixtures.
First contribution ideas
- Copyedit a director_log string.
- Extend deterministic dialogue or stage effects in backends.py / director.py.
- Add a trace field or a test for a prop-heavy beat.
- Document an env var in README.md.
Optional advanced track
finetune/: LoRA training and eval scripts; separate from the live Gradio beat loop.
Suggested schedule
| Week | Focus |
|---|---|
| 1 | Phases 0–2 + UI play |
| 2 | Phase 3 (run_one_beat) until you can draw the diagram from memory |
| 3 | Phases 4–5 |
| 4 | Phases 6–7 + a small PR |
External references
- Gradio Blocks and .click() inputs/outputs.
- Pydantic v2 validators (field_validator).
- Hugging Face Spaces environment variables (see README Configuration).