sofhiaazzhr commited on
Commit
0c263cf
Β·
1 Parent(s): f34af71

fix planner prompt template parsing

Browse files
Files changed (1) hide show
  1. PROGRESS.md +2 -2
PROGRESS.md CHANGED
@@ -2,7 +2,7 @@
2
 
3
  Persistent tracker mirroring the 42-item ownership table in `REPO_CONTEXT.md` "Team β€” division of work". Update as PRs land. Future Claude Code sessions read this to know what's already done.
4
 
5
- **Last updated**: 2026-05-08 (merged: DB owner's PR2b/4/5/6/7-bundle into TAB's PR1-tab + PR3-TAB on `pr/1`)
6
  **Current open PR**: none β€” all Phase 2 contracts shipped on `pr/1`. Cleanup PR pending (API rewiring + Phase 1 removal).
7
 
8
  ---
@@ -134,7 +134,7 @@ Persistent tracker mirroring the 42-item ownership table in `REPO_CONTEXT.md` "T
134
  | β€” | Catalog store integration test (`tests/catalog/test_store.py`) | DB | `[x]` | PR1 β€” module-level skip without `RUN_INTEGRATION_TESTS=1` |
135
  | β€” | DB introspector test | DB | `[ ]` | Deferred to PR2 β€” needs Postgres testcontainer or fixture infra |
136
  | β€” | Tabular introspector test | TAB | `[x]` | PR1-tab β€” 31 unit tests (CSV/XLSX/Parquet, stats, PII, error paths). No DB/blob I/O β€” mocks injected via constructor. |
137
- | 41 | Planner eval (`tests/query/planner/`) | B | `[~]` | PR6-scaffold β€” `test_golden_questions.py` with 3 DB-targeting cases. Skipped by default; runs against real Azure OpenAI when `RUN_PLANNER_EVAL=1`. TAB extends with tabular-targeting cases once their compiler exists. |
138
  | 42 | E2E smoke tests (`tests/e2e/`) | B | `[ ]` | Defer until Phase 2 endpoints are wired (cleanup PR). Component-level orchestration is already covered by `test_chat_handler.py` + `test_service.py`. |
139
  | β€” | Golden IR fixtures (`tests/fixtures/golden_irs.json`) | B | `[~]` | PR1 seeded with 5 DB-targeting examples; TAB extends in PR1-tab |
140
  | β€” | Shared `sample_catalog` fixture (`tests/conftest.py`) | B | `[x]` | PR1 β€” DB-shaped; TAB may add tabular sibling |
 
2
 
3
  Persistent tracker mirroring the 42-item ownership table in `REPO_CONTEXT.md` "Team β€” division of work". Update as PRs land. Future Claude Code sessions read this to know what's already done.
4
 
5
+ **Last updated**: 2026-05-08 (item 41 done β€” tabular planner eval cases added; production bug fixed in `query/planner/service.py`)
6
  **Current open PR**: none β€” all Phase 2 contracts shipped on `pr/1`. Cleanup PR pending (API rewiring + Phase 1 removal).
7
 
8
  ---
 
134
  | β€” | Catalog store integration test (`tests/catalog/test_store.py`) | DB | `[x]` | PR1 β€” module-level skip without `RUN_INTEGRATION_TESTS=1` |
135
  | β€” | DB introspector test | DB | `[ ]` | Deferred to PR2 β€” needs Postgres testcontainer or fixture infra |
136
  | β€” | Tabular introspector test | TAB | `[x]` | PR1-tab β€” 31 unit tests (CSV/XLSX/Parquet, stats, PII, error paths). No DB/blob I/O β€” mocks injected via constructor. |
137
+ | 41 | Planner eval (`tests/query/planner/`) | B | `[x]` | PR6-scaffold β€” `test_golden_questions.py` with 3 DB-targeting cases. TAB added `test_golden_tabular.py` with 4 tabular cases (group_by+sum, top-N+limit, date range filter, XLSX sheet selection). All 4 passed against real Azure OpenAI. Fix shipped alongside: `query/planner/service.py` replaced `("system", text)` tuple with `SystemMessage` β€” without this, `{...}` in `query_planner.md` was parsed as f-string variables and crashed on every real invocation. |
138
  | 42 | E2E smoke tests (`tests/e2e/`) | B | `[ ]` | Defer until Phase 2 endpoints are wired (cleanup PR). Component-level orchestration is already covered by `test_chat_handler.py` + `test_service.py`. |
139
  | β€” | Golden IR fixtures (`tests/fixtures/golden_irs.json`) | B | `[~]` | PR1 seeded with 5 DB-targeting examples; TAB extends in PR1-tab |
140
  | β€” | Shared `sample_catalog` fixture (`tests/conftest.py`) | B | `[x]` | PR1 β€” DB-shaped; TAB may add tabular sibling |