Agentic-Service-Data-Eyond-Catalog

sofhiaazzhr committed on May 8

Commit

0c263cf

1 Parent(s): f34af71

fix planner prompt template parsing

Files changed (1)

PROGRESS.md +2 -2

@@ -2,7 +2,7 @@
 Persistent tracker mirroring the 42-item ownership table in `REPO_CONTEXT.md` "Team — division of work". Update as PRs land. Future Claude Code sessions read this to know what's already done.
-**Last updated**: 2026-05-08 (merged: DB owner's PR2b/4/5/6/7-bundle into TAB's PR1-tab + PR3-TAB on `pr/1`)
+**Last updated**: 2026-05-08 (item 41 done — tabular planner eval cases added; production bug fixed in `query/planner/service.py`)
 **Current open PR**: none — all Phase 2 contracts shipped on `pr/1`. Cleanup PR pending (API rewiring + Phase 1 removal).
 ---
@@ -134,7 +134,7 @@ Persistent tracker mirroring the 42-item ownership table in `REPO_CONTEXT.md` "T
 | — | Catalog store integration test (`tests/catalog/test_store.py`) | DB | `[x]` | PR1 — module-level skip without `RUN_INTEGRATION_TESTS=1` |
 | — | DB introspector test | DB | `[ ]` | Deferred to PR2 — needs Postgres testcontainer or fixture infra |
 | — | Tabular introspector test | TAB | `[x]` | PR1-tab — 31 unit tests (CSV/XLSX/Parquet, stats, PII, error paths). No DB/blob I/O — mocks injected via constructor. |
-| 41 | Planner eval (`tests/query/planner/`) | B | `[~]` | PR6-scaffold — `test_golden_questions.py` with 3 DB-targeting cases. Skipped by default; runs against real Azure OpenAI when `RUN_PLANNER_EVAL=1`. TAB extends with tabular-targeting cases once their compiler exists. |
+| 41 | Planner eval (`tests/query/planner/`) | B | `[x]` | PR6-scaffold — `test_golden_questions.py` with 3 DB-targeting cases. TAB added `test_golden_tabular.py` with 4 tabular cases (group_by+sum, top-N+limit, date range filter, XLSX sheet selection). All 4 passed against real Azure OpenAI. Fix shipped alongside: `query/planner/service.py` replaced `("system", text)` tuple with `SystemMessage` — without this, `{...}` in `query_planner.md` was parsed as f-string variables and crashed on every real invocation. |
 | 42 | E2E smoke tests (`tests/e2e/`) | B | `[ ]` | Defer until Phase 2 endpoints are wired (cleanup PR). Component-level orchestration is already covered by `test_chat_handler.py` + `test_service.py`. |
 | — | Golden IR fixtures (`tests/fixtures/golden_irs.json`) | B | `[~]` | PR1 seeded with 5 DB-targeting examples; TAB extends in PR1-tab |
 | — | Shared `sample_catalog` fixture (`tests/conftest.py`) | B | `[x]` | PR1 — DB-shaped; TAB may add tabular sibling |
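
Note on the item 41 fix: the failure mode described in the new row (literal `{...}` in `query_planner.md` being read as template variables) can be reproduced with the standard library alone. The sketch below is not the code in `query/planner/service.py`; it uses `str.format` as a stand-in for f-string-style template parsing. `PROMPT_TEXT`, `render_as_template`, `render_verbatim`, and `escape_braces` are hypothetical names introduced only for illustration.

```python
# Stand-in for a prompt file that contains a literal JSON example.
# (Hypothetical text; the real query_planner.md is not shown in this commit.)
PROMPT_TEXT = 'Return the plan as JSON, e.g. {"op": "group_by", "limit": 5}.'


def render_as_template(text: str, **variables: str) -> str:
    """Treat the text as a format template: every {...} is a placeholder."""
    return text.format(**variables)


def render_verbatim(text: str) -> str:
    """Treat the text as an opaque message: braces are left untouched."""
    return text


def escape_braces(text: str) -> str:
    """Alternative workaround: double the braces so a template keeps them literal."""
    return text.replace("{", "{{").replace("}", "}}")


if __name__ == "__main__":
    try:
        render_as_template(PROMPT_TEXT)
    except KeyError as exc:
        # The JSON key is looked up as a template variable and is missing.
        print("template parsing failed:", repr(exc))

    # Passing the text through verbatim keeps the JSON example intact.
    print(render_verbatim(PROMPT_TEXT))

    # Escaping also works, but the prompt file must then stay escaped forever.
    print(render_as_template(escape_braces(PROMPT_TEXT)))
```

Sending the system prompt as a pre-built message object, as the commit does with `SystemMessage`, corresponds to `render_verbatim` here: the prompt file can keep plain JSON examples without brace escaping.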