Commit · 0c263cf
1
Parent(s): f34af71
fix planner prompt template parsing
PROGRESS.md +2 -2
PROGRESS.md
CHANGED
|
@@ -2,7 +2,7 @@
|
|
| 2 |
|
| 3 |
Persistent tracker mirroring the 42-item ownership table in `REPO_CONTEXT.md` "Team - division of work". Update as PRs land. Future Claude Code sessions read this to know what's already done.
|
| 4 |
|
| 5 |
-
**Last updated**: 2026-05-08 (
|
| 6 |
**Current open PR**: none - all Phase 2 contracts shipped on `pr/1`. Cleanup PR pending (API rewiring + Phase 1 removal).
|
| 7 |
|
| 8 |
---
|
|
@@ -134,7 +134,7 @@ Persistent tracker mirroring the 42-item ownership table in `REPO_CONTEXT.md` "T
|
|
| 134 |
| - | Catalog store integration test (`tests/catalog/test_store.py`) | DB | `[x]` | PR1 - module-level skip without `RUN_INTEGRATION_TESTS=1` |
|
| 135 |
| - | DB introspector test | DB | `[ ]` | Deferred to PR2 - needs Postgres testcontainer or fixture infra |
|
| 136 |
| - | Tabular introspector test | TAB | `[x]` | PR1-tab - 31 unit tests (CSV/XLSX/Parquet, stats, PII, error paths). No DB/blob I/O - mocks injected via constructor. |
|
| 137 |
-
| 41 | Planner eval (`tests/query/planner/`) | B | `[
|
| 138 |
| 42 | E2E smoke tests (`tests/e2e/`) | B | `[ ]` | Defer until Phase 2 endpoints are wired (cleanup PR). Component-level orchestration is already covered by `test_chat_handler.py` + `test_service.py`. |
|
| 139 |
| - | Golden IR fixtures (`tests/fixtures/golden_irs.json`) | B | `[~]` | PR1 seeded with 5 DB-targeting examples; TAB extends in PR1-tab |
|
| 140 |
| - | Shared `sample_catalog` fixture (`tests/conftest.py`) | B | `[x]` | PR1 - DB-shaped; TAB may add tabular sibling |
|
|
|
|
| 2 |
|
| 3 |
Persistent tracker mirroring the 42-item ownership table in `REPO_CONTEXT.md` "Team - division of work". Update as PRs land. Future Claude Code sessions read this to know what's already done.
|
| 4 |
|
| 5 |
+
**Last updated**: 2026-05-08 (item 41 done - tabular planner eval cases added; production bug fixed in `query/planner/service.py`)
|
| 6 |
**Current open PR**: none - all Phase 2 contracts shipped on `pr/1`. Cleanup PR pending (API rewiring + Phase 1 removal).
|
| 7 |
|
| 8 |
---
|
|
|
|
| 134 |
| - | Catalog store integration test (`tests/catalog/test_store.py`) | DB | `[x]` | PR1 - module-level skip without `RUN_INTEGRATION_TESTS=1` |
|
| 135 |
| - | DB introspector test | DB | `[ ]` | Deferred to PR2 - needs Postgres testcontainer or fixture infra |
|
| 136 |
| - | Tabular introspector test | TAB | `[x]` | PR1-tab - 31 unit tests (CSV/XLSX/Parquet, stats, PII, error paths). No DB/blob I/O - mocks injected via constructor. |
|
| 137 |
+
| 41 | Planner eval (`tests/query/planner/`) | B | `[x]` | PR6-scaffold - `test_golden_questions.py` with 3 DB-targeting cases. TAB added `test_golden_tabular.py` with 4 tabular cases (group_by+sum, top-N+limit, date range filter, XLSX sheet selection). All 4 passed against real Azure OpenAI. Fix shipped alongside: `query/planner/service.py` replaced `("system", text)` tuple with `SystemMessage` - without this, `{...}` in `query_planner.md` was parsed as f-string variables and crashed on every real invocation. |
|
| 138 |
| 42 | E2E smoke tests (`tests/e2e/`) | B | `[ ]` | Defer until Phase 2 endpoints are wired (cleanup PR). Component-level orchestration is already covered by `test_chat_handler.py` + `test_service.py`. |
|
| 139 |
| - | Golden IR fixtures (`tests/fixtures/golden_irs.json`) | B | `[~]` | PR1 seeded with 5 DB-targeting examples; TAB extends in PR1-tab |
|
| 140 |
| - | Shared `sample_catalog` fixture (`tests/conftest.py`) | B | `[x]` | PR1 - DB-shaped; TAB may add tabular sibling |
|
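
The item 41 note says that literal `{...}` in `query_planner.md` was parsed as f-string variables when the prompt text was passed as a `("system", text)` tuple. The sketch below reproduces that failure mode with the standard library only. It is an illustration, not the code in `query/planner/service.py`: `PROMPT_TEXT` is a made-up stand-in for the real prompt file, and all three helper functions are hypothetical. It assumes the tuple form treats the text as a `str.format`-style template, while a ready-made `SystemMessage` carries the text verbatim.

```python
from string import Formatter

# Hypothetical stand-in for query_planner.md: a prompt with a literal JSON example.
PROMPT_TEXT = 'Reply with JSON such as {"source": "db"} and nothing else.'


def template_fields(text: str) -> list[str]:
    """List the names a str.format-style parser would treat as variables."""
    return [name for _, name, _, _ in Formatter().parse(text) if name is not None]


def render_as_template(text: str, **variables: str) -> str:
    """Analogue of the tuple form: the text is parsed as a template first."""
    return text.format(**variables)


def escape_braces(text: str) -> str:
    """Alternative workaround: double the braces so the parser keeps them literal."""
    return text.replace("{", "{{").replace("}", "}}")


if __name__ == "__main__":
    # The JSON key is mistaken for a template variable.
    print(template_fields(PROMPT_TEXT))
    try:
        render_as_template(PROMPT_TEXT)
    except KeyError as exc:
        print("template rendering failed, missing variable:", exc)
    # Escaped text survives template rendering unchanged.
    print(render_as_template(escape_braces(PROMPT_TEXT)) == PROMPT_TEXT)
```

Passing the text as an already-built message object sidesteps template parsing entirely, which is presumably why the fix did not need to touch the prompt file. Escaping braces would also work, but every future edit to the prompt would have to remember it.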