Rifqi Hafizuddin Claude Opus 4.8 committed on
Commit · ba2fa88
1 Parent(s): 72306d0
[KM-567] docs: record Phase 3 Planner agent in PROGRESS.md
Add "What just shipped (2026-06-05 — Phase 3: Planner agent)" section: files
added under src/agents/planner/, the stub contracts pending reconciliation with
the lead (BusinessContext) and tool team (KM-608), and the next steps
(Orchestrator expansion + TaskRunner + Assembler).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
- PROGRESS.md +40 -1
PROGRESS.md
CHANGED
|
@@ -2,11 +2,50 @@
|
|
| 2 |
|
| 3 |
Persistent tracker mirroring the 42-item ownership table in `REPO_CONTEXT.md` "Team — division of work". Update as PRs land. Future Claude Code sessions read this to know what's already done.
|
| 4 |
|
| 5 |
-
**Last updated**: 2026-
|
| 6 |
**Current open PR**: `pr/1` — active. Cleanup PR committed and pushed.
|
| 7 |
|
| 8 |
---
|
| 9 |
|
| 10 |
## Legend
|
| 11 |
|
| 12 |
- `[x]` done and merged
|
| 2 |
|
| 3 |
Persistent tracker mirroring the 42-item ownership table in `REPO_CONTEXT.md` "Team — division of work". Update as PRs land. Future Claude Code sessions read this to know what's already done.
|
| 4 |
|
| 5 |
+
**Last updated**: 2026-06-05 (Phase 3 deliverable #2: Planner agent built under `src/agents/planner/` — see "What just shipped" below)
|
| 6 |
**Current open PR**: `pr/1` — active. Cleanup PR committed and pushed.
|
| 7 |
|
| 8 |
---
|
| 9 |
|
| 10 |
+
## What just shipped (2026-06-05 — Phase 3: Planner agent)
|
| 11 |
+
|
| 12 |
+
First slow-path agent from `AGENT_ARCHITECTURE_CONTEXT_new.md` §7.3. A single LLM
|
| 13 |
+
call turns BusinessContext + Catalog + ToolRegistry + question + Constraints into a
|
| 14 |
+
validated, **static** `TaskList` (DAG of fully-specified tool-call chains). No
|
| 15 |
+
replanning (INV-6); tool-agnostic against a registry contract (INV-7). Fast path
|
| 16 |
+
(`agents/orchestration.py`, `agents/chatbot.py`, `query/`) untouched.
|
| 17 |
+
|
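A minimal sketch of what "validated, static `TaskList` (DAG)" could mean in practice, assuming a much smaller schema than the real one in `schemas.py`. The field names (`task_id`, `tool_name`, `depends_on`) and the `check_task_list` helper are hypothetical, plain dataclasses stand in for the Pydantic models, and only four illustrative checks are shown (task cap, unique ids, known tools and dependencies, acyclicity), not the 8 checks of §7.3.

```python
from dataclasses import dataclass
from graphlib import CycleError, TopologicalSorter


@dataclass(frozen=True)
class Task:
    # Hypothetical fields; the real schema lives in schemas.py.
    task_id: str
    tool_name: str
    depends_on: tuple[str, ...] = ()


@dataclass(frozen=True)
class TaskList:
    tasks: tuple[Task, ...]


def check_task_list(plan: TaskList, known_tools: set[str], max_tasks: int = 5) -> list[str]:
    """Return human-readable problems; an empty list means the plan passed."""
    problems: list[str] = []
    if len(plan.tasks) > max_tasks:
        problems.append(f"too many tasks: {len(plan.tasks)} > {max_tasks}")

    ids = [t.task_id for t in plan.tasks]
    if len(set(ids)) != len(ids):
        problems.append("duplicate task ids")

    for t in plan.tasks:
        # Tool-agnostic check: only the registry's tool names are consulted.
        if t.tool_name not in known_tools:
            problems.append(f"{t.task_id}: unknown tool {t.tool_name!r}")
        for dep in t.depends_on:
            if dep not in ids:
                problems.append(f"{t.task_id}: unknown dependency {dep!r}")

    # graphlib raises CycleError when the dependency graph is not a DAG.
    graph = {t.task_id: set(t.depends_on) for t in plan.tasks}
    try:
        tuple(TopologicalSorter(graph).static_order())
    except CycleError:
        problems.append("dependency cycle")
    return problems
```

Because the plan is static, a check like this can run once on the planner output before anything executes; nothing here models replanning.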
| 18 |
+
**Files added** (`src/agents/planner/`):
|
| 19 |
+
- `contracts.py` — **STUB** Pydantic contracts pending reconciliation: `BusinessContext`
|
| 20 |
+
(+KeyTerm/DataTableNote/DataColumnNote, lead's §7.1), `ToolSpec`/`ToolRegistry` (tool
|
| 21 |
+
team KM-608, §9.2), `ToolOutput` envelope (§8.1).
|
| 22 |
+
- `schemas.py` — `CrispStage`, `ToolCall`, `Task`, `TaskList` (§7.3). No replan schemas.
|
| 23 |
+
- `inputs.py` — `CatalogSummary` (condensed, PII `sample_values` nulled, `from_catalog`
|
| 24 |
+
builder + `render`) and `Constraints` (max_tasks=5, modeling_allowed=False).
|
| 25 |
+
- `registry.py` — **STUB** v1 P0 registry: query_structured, retrieve_documents,
|
| 26 |
+
list_sources, describe_source, compute_median/stddev/percentile/mode, date_trunc.
|
| 27 |
+
- `errors.py` — `PlannerError`, `PlannerValidationError`.
|
| 28 |
+
- `prompt.py` + `config/prompts/planner.md` — system prompt (INV-1/6/7 + principles) +
|
| 29 |
+
per-call human content (context + catalog + tools + constraints + few-shots + question).
|
| 30 |
+
- `examples.py` — two few-shots (A exploratory revenue-by-category; B descriptive
|
| 31 |
+
monthly-trend-by-region with date_trunc), built from the real `TaskList` schema.
|
| 32 |
+
- `validator.py` — `PlannerValidator` running the 8 checks (§7.3); reuses the existing
|
| 33 |
+
`IRValidator` for inline `query_structured` IRs.
|
| 34 |
+
- `service.py` — `PlannerService` + `plan_analysis(...)`: chain (mirrors
|
| 35 |
+
`query/planner/service.py`) + validate-and-retry loop (max 3, mirrors `QueryService`).
|
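The validate-and-retry loop described for `service.py` can be sketched roughly as follows. This is not the actual `PlannerService` code: `plan_with_retry`, the `generate` and `validate` callables, and the plain `dict` plan are hypothetical stand-ins. Two behaviours are assumptions, namely that "max 3" means three attempts in total and that validation problems are fed back into the next LLM call.

```python
from typing import Callable


class PlannerValidationError(Exception):
    """Stand-in for the class in errors.py; the real one may carry more detail."""

    def __init__(self, problems: list[str]) -> None:
        super().__init__("; ".join(problems))
        self.problems = problems


def plan_with_retry(
    generate: Callable[[list[str]], dict],
    validate: Callable[[dict], list[str]],
    max_attempts: int = 3,
) -> dict:
    """Generate a plan, validate it, and retry with feedback until it passes."""
    feedback: list[str] = []
    for _ in range(max_attempts):
        # Assumption: the previous attempt's problems are passed back to the model.
        plan = generate(feedback)
        feedback = validate(plan)
        if not feedback:
            return plan
    # Every attempt failed validation; surface the last set of problems.
    raise PlannerValidationError(feedback)
```

Keeping `generate` and `validate` as injected callables is what makes a fake chain (such as the `_FakeChain` used in `test_service.py`) easy to substitute in tests.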
| 36 |
+
|
| 37 |
+
**Tests added** (`tests/agents/planner/`, 30 passing + 1 gated): `test_schemas.py`,
|
| 38 |
+
`test_inputs.py`, `test_validator.py` (one failure per check + happy paths),
|
| 39 |
+
`test_service.py` (`_FakeChain` + retry), `test_golden_questions.py` (live eval gated on
|
| 40 |
+
`RUN_PLANNER_EVAL=1`). `ruff check` clean on planner paths.
|
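One way to gate a live eval on `RUN_PLANNER_EVAL=1` is an environment-variable skip, sketched below. The repository's test framework is not stated here, so this uses the standard library's `unittest` as a stand-in; the class name and test body are hypothetical placeholders, not the contents of `test_golden_questions.py`.

```python
import os
import unittest

# The eval only runs when the flag is set explicitly, e.g. RUN_PLANNER_EVAL=1.
RUN_LIVE_EVAL = os.environ.get("RUN_PLANNER_EVAL") == "1"


@unittest.skipUnless(RUN_LIVE_EVAL, "set RUN_PLANNER_EVAL=1 to run the live planner eval")
class GoldenQuestionsEval(unittest.TestCase):
    def test_plans_golden_question(self) -> None:
        # Placeholder body: a real test would call the planner against a live model.
        self.assertTrue(RUN_LIVE_EVAL)
```

With the flag unset, the test is reported as skipped rather than failed, so a default run stays green and needs no model credentials.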
| 41 |
+
|
| 42 |
+
**Open follow-ups (not blockers):** reconcile `BusinessContext` with the lead and
|
| 43 |
+
`ToolRegistry`/`ToolSpec` + real tools with teammate (KM-608); "GPT mini" currently uses
|
| 44 |
+
the configured 4o deployment (swap `azure_deployment` when a mini deployment exists). Next
|
| 45 |
+
per the architecture doc: Orchestrator slow-path expansion + TaskRunner + Assembler.
|
| 46 |
+
|
| 47 |
+
---
|
| 48 |
+
|
| 49 |
## Legend
|
| 50 |
|
| 51 |
- `[x]` done and merged
|