Rifqi Hafizuddin Claude Opus 4.8 committed on
Commit ba2fa88 · 1 Parent(s): 72306d0

[KM-567] docs: record Phase 3 Planner agent in PROGRESS.md

Add "What just shipped (2026-06-05 — Phase 3: Planner agent)" section: files
added under src/agents/planner/, the stub contracts pending reconciliation with
the lead (BusinessContext) and tool team (KM-608), and the next steps
(Orchestrator expansion + TaskRunner + Assembler).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Files changed (1)
  1. PROGRESS.md +40 -1
PROGRESS.md CHANGED
@@ -2,11 +2,50 @@
 
  Persistent tracker mirroring the 42-item ownership table in `REPO_CONTEXT.md` "Team — division of work". Update as PRs land. Future Claude Code sessions read this to know what's already done.
 
- **Last updated**: 2026-05-12 ([NOTICKET] Cleanup PR landed: ChatHandler wired to chat.py, Phase 1 dual-write dropped from /ingest, on_catalog_rebuild_requested implemented, dead modules deleted, answer_agent→chatbot renamed, retrieval cache restored via RetrievalRouter, top_values added to ColumnStats, lifespan migration, knowledge_router removed)
+ **Last updated**: 2026-06-05 (Phase 3 deliverable #2: Planner agent built under `src/agents/planner/` see "What just shipped" below)
  **Current open PR**: `pr/1` — active. Cleanup PR committed and pushed.
 
  ---
 
+ ## What just shipped (2026-06-05 — Phase 3: Planner agent)
+
+ First slow-path agent from `AGENT_ARCHITECTURE_CONTEXT_new.md` §7.3. A single LLM
+ call turns BusinessContext + Catalog + ToolRegistry + question + Constraints into a
+ validated, **static** `TaskList` (DAG of fully-specified tool-call chains). No
+ replanning (INV-6); tool-agnostic against a registry contract (INV-7). Fast path
+ (`agents/orchestration.py`, `agents/chatbot.py`, `query/`) untouched.
+
+ **Files added** (`src/agents/planner/`):
+ - `contracts.py` — **STUB** Pydantic contracts pending reconciliation: `BusinessContext`
+ (+KeyTerm/DataTableNote/DataColumnNote, lead's §7.1), `ToolSpec`/`ToolRegistry` (tool
+ team KM-608, §9.2), `ToolOutput` envelope (§8.1).
+ - `schemas.py` — `CrispStage`, `ToolCall`, `Task`, `TaskList` (§7.3). No replan schemas.
+ - `inputs.py` — `CatalogSummary` (condensed, PII `sample_values` nulled, `from_catalog`
+ builder + `render`) and `Constraints` (max_tasks=5, modeling_allowed=False).
+ - `registry.py` — **STUB** v1 P0 registry: query_structured, retrieve_documents,
+ list_sources, describe_source, compute_median/stddev/percentile/mode, date_trunc.
+ - `errors.py` — `PlannerError`, `PlannerValidationError`.
+ - `prompt.py` + `config/prompts/planner.md` — system prompt (INV-1/6/7 + principles) +
+ per-call human content (context + catalog + tools + constraints + few-shots + question).
+ - `examples.py` — two few-shots (A exploratory revenue-by-category; B descriptive
+ monthly-trend-by-region with date_trunc), built from the real `TaskList` schema.
+ - `validator.py` — `PlannerValidator` running the 8 checks (§7.3); reuses the existing
+ `IRValidator` for inline `query_structured` IRs.
+ - `service.py` — `PlannerService` + `plan_analysis(...)`: chain (mirrors
+ `query/planner/service.py`) + validate-and-retry loop (max 3, mirrors `QueryService`).
+
+ **Tests added** (`tests/agents/planner/`, 30 passing + 1 gated): `test_schemas.py`,
+ `test_inputs.py`, `test_validator.py` (one failure per check + happy paths),
+ `test_service.py` (`_FakeChain` + retry), `test_golden_questions.py` (live eval gated on
+ `RUN_PLANNER_EVAL=1`). `ruff check` clean on planner paths.
+
+ **Open follow-ups (not blockers):** reconcile `BusinessContext` with the lead and
+ `ToolRegistry`/`ToolSpec` + real tools with teammate (KM-608); "GPT mini" currently uses
+ the configured 4o deployment (swap `azure_deployment` when a mini deployment exists). Next
+ per the architecture doc: Orchestrator slow-path expansion + TaskRunner + Assembler.
+
+ ---
+
  ## Legend
 
  - `[x]` done and merged
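
Reviewer sketch (not part of the commit): the diff describes a static `TaskList` DAG that is validated and regenerated up to 3 times, but the planner sources are not shown here. The following is a minimal illustration of that idea under stated assumptions. It uses plain dataclasses instead of the Pydantic models the commit mentions, and the field names (`id`, `tool`, `depends_on`), the `validate` and `plan_with_retry` helpers, and the specific checks are all hypothetical; only `Task`, `TaskList`, `PlannerValidationError`, `max_tasks=5`, and the retry limit of 3 come from the diff. The real `PlannerValidator` runs 8 checks that are not enumerated in this commit.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable, Optional


class PlannerValidationError(Exception):
    """Raised when a candidate plan fails a structural check."""


@dataclass(frozen=True)
class Task:
    id: str                           # hypothetical field name
    tool: str                         # hypothetical field name
    depends_on: tuple[str, ...] = ()  # hypothetical field name


@dataclass(frozen=True)
class TaskList:
    tasks: tuple[Task, ...]


def validate(plan: TaskList, known_tools: set[str], max_tasks: int = 5) -> None:
    """Illustrative checks only; the real validator's 8 checks are not shown in the diff."""
    ids = [t.id for t in plan.tasks]
    if not ids:
        raise PlannerValidationError("plan is empty")
    if len(ids) > max_tasks:
        raise PlannerValidationError(f"plan has {len(ids)} tasks, limit is {max_tasks}")
    if len(set(ids)) != len(ids):
        raise PlannerValidationError("duplicate task id")
    for task in plan.tasks:
        if task.tool not in known_tools:
            raise PlannerValidationError(f"unknown tool: {task.tool}")
        for dep in task.depends_on:
            if dep not in ids:
                raise PlannerValidationError(f"unknown dependency: {dep}")
    # Kahn-style pass: repeatedly drop tasks whose dependencies are all resolved.
    # If nothing can be dropped while tasks remain, the graph has a cycle.
    remaining = {t.id: set(t.depends_on) for t in plan.tasks}
    while remaining:
        ready = [tid for tid, deps in remaining.items() if not deps]
        if not ready:
            raise PlannerValidationError("dependency cycle")
        for tid in ready:
            del remaining[tid]
        for deps in remaining.values():
            deps.difference_update(ready)


def plan_with_retry(
    generate: Callable[[Optional[str]], TaskList],
    known_tools: set[str],
    max_attempts: int = 3,
) -> TaskList:
    """Call `generate` until a plan validates, passing the last error back as feedback."""
    feedback: Optional[str] = None
    for _ in range(max_attempts):
        candidate = generate(feedback)
        try:
            validate(candidate, known_tools)
            return candidate
        except PlannerValidationError as exc:
            feedback = str(exc)
    raise PlannerValidationError(f"no valid plan after {max_attempts} attempts: {feedback}")
```

In this sketch `generate` stands in for the single LLM call. Feeding the validation error back into the next attempt is an assumption about how the retry loop might work, not something the diff states. The tool names in `known_tools` would come from the registry, for example `{"query_structured", "date_trunc"}`.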