| # Open Tasks |
|
|
| Single source of truth for active work. |
|
|
| ## Priority Legend |
| - P0 = blocking production/demo |
| - P1 = high impact |
| - P2 = nice to have |
|
|
| ## Active |
|
|
| | ID | Priority | Task | Owner | Status | Next Step | Verification | |
| |---|---|---|---|---|---|---| |
| | T-137 | P2 | Keep repo hygiene bounded by pruning non-canonical eval artifacts while preserving canonical eval baselines and runner assets | Engineering | DONE | Reuse `backend/scripts/cleanup_repo_artifacts.py --dry-run --no-backup` periodically instead of letting timestamped eval reruns accumulate in git | `python3 backend/scripts/cleanup_repo_artifacts.py --no-backup` -> `removed_dirs=75`, `removed_files=62`; follow-up `python3 backend/scripts/cleanup_repo_artifacts.py --dry-run --no-backup` -> `0` pending removals; canonical probes for `latest_eval25_guarded_gpt_check`, `latest_eval50_guarded_gpt_check`, `latest_eval6_concept_check`, `release_gate`, `shards10`, and `shards5_eval75` all returned OK | |
| | T-146 | P1 | Add a real Hugging Face canary target so the gated deploy workflow exercises both canary and production lanes instead of production-only | Engineering | DONE | Keep the canary Space config aligned with production except for host-specific `APP_BASE_URL` / `VITE_APP_BASE_URL`, and keep `HF_SPACE_ID_CANARY` populated in GitHub Actions | GitHub Actions secret `HF_SPACE_ID_CANARY` now points at `crazycrazypete/Masters-four-Tab-OpenAI-Canary`; Actions run `22813479490` finished with both `deploy-canary` and `deploy-production` green after the canary host-specific base URL was corrected | |
| | T-147 | P1 | Run the first authenticated hosted smoke pass against the refreshed production build and use that result to close the stale hosted sign-off loop | Engineering | DONE | Reuse the same minimal hosted smoke set after future deploys: auth full-flow, one assistant-family provider query, and one POTS workspace shell check on both canary and production when relevant | Credentialed hosted smoke passed on both production and canary: `cd frontend && npx playwright test e2e/auth.full-flow.spec.ts --reporter=line` -> `1 passed`; `cd frontend && E2E_BASE_URL=https://crazycrazypete-masters-four-tab-openai-canary.hf.space npx playwright test e2e/auth.full-flow.spec.ts --reporter=line` -> `1 passed`; `cd frontend && npx playwright test e2e/pots.provider-coverage.spec.ts --reporter=line` -> `1 passed`; `cd frontend && E2E_BASE_URL=https://crazycrazypete-masters-four-tab-openai-canary.hf.space npx playwright test e2e/pots.provider-coverage.spec.ts --reporter=line` -> `1 passed`; one-off headless POTS workspace smoke passed on both hosts and confirmed the `POTS Project Workspace` shell is live instead of the old stacked page | |
| | D-232 | Removed the duplicate assistant-tab security/CAPTCHA checks from the shared Help + Assist launcher, Unified Knowledgebase, and POTS assistant flows while keeping the Rapid Router order-submit CAPTCHA intact | 2026-03-07 | `backend/app/main.py`, `backend/app/test_knowledgebase_api.py`, `backend/app/test_chat_guidance_api.py`, `frontend/src/components/FloatingRouterHelper.tsx`, `frontend/src/pages/UnifiedKnowledgebase.tsx`, `frontend/src/pages/PotsAssistant.tsx`, `frontend/src/pages/RapidRouter.tsx`; `cd backend && .venv/bin/python -m pytest -q app/test_knowledgebase_api.py app/test_chat_guidance_api.py app/test_rapid_router_api_shell.py` -> `37 passed`; `cd frontend && npm run build` -> success; helper Vitest file remains blocked by local worker hang after startup | |
| | T-144 | P1 | Remove duplicate per-tab assistant security checks from the shared help/assistant tabs while preserving the Rapid Router order-submit CAPTCHA | Engineering | DONE | If assistant abuse appears later, add rate-limit/abuse controls at the backend instead of restoring per-tab CAPTCHA friction | `cd backend && .venv/bin/python -m pytest -q app/test_knowledgebase_api.py app/test_chat_guidance_api.py app/test_rapid_router_api_shell.py` -> `37 passed`; `cd frontend && npm run build` -> success; `cd frontend && npx vitest run src/components/FloatingRouterHelper.test.tsx --reporter=dot` still stalls after startup in the current local Vitest worker environment | |
| | T-143 | P1 | Make Rapid Router advanced configuration notes optional whenever at least one advanced checkbox option is selected, while keeping notes required for freeform advanced requests with no selected task | Engineering | DONE | Keep backend/frontend validation aligned if new advanced task checkboxes are added later | `cd backend && .venv/bin/python -m pytest -q app/rapid_router/test_rapid_router_core.py app/test_rapid_router_api_shell.py` -> `53 passed`; `cd frontend && npx vitest run src/pages/RapidRouter.test.tsx --reporter=dot` -> `6 passed`; `cd frontend && npm run build` -> success | |
| | T-142 | P1 | Add the four new required Rapid Router approval attestations and enforce them server-side in the order submit path | Engineering | DONE | Keep any future approval-copy changes mirrored in both frontend validation and backend `approvals` schema so order-submit rules cannot drift | `cd backend && .venv/bin/python -m pytest -q app/rapid_router/test_rapid_router_core.py app/test_rapid_router_api_shell.py` -> `53 passed`; `cd frontend && npx vitest run src/pages/RapidRouter.test.tsx --reporter=dot` -> `5 passed`; `cd frontend && npm run build` -> success | |
| | D-231 | Changed Rapid Router BoBo bill-to phone to a full 10-digit US phone field with `(111) 222-2222` formatting, matching frontend validation, backend normalization, and rendered PDF/email output | 2026-03-07 | `frontend/src/pages/RapidRouter.tsx`, `frontend/src/pages/RapidRouter.test.tsx`, `backend/app/rapid_router/core.py`, `backend/app/rapid_router/test_rapid_router_core.py`; `cd backend && .venv/bin/python -m pytest -q app/rapid_router/test_rapid_router_core.py app/test_rapid_router_api_shell.py` -> `52 passed`; `cd frontend && npx vitest run src/pages/RapidRouter.test.tsx --reporter=dot` -> `5 passed`; `cd frontend && npm run build` -> success | |
| | T-141 | P1 | Align Rapid Router BoBo bill-to phone UX and validation with the requested full-phone example `(111) 222-2222` instead of the legacy 7-digit local-number rule | Engineering | DONE | Keep BoBo bill-to phone on the same full 10-digit validation/rendering path as other US phone fields unless the business later confirms a local-only requirement | `cd backend && .venv/bin/python -m pytest -q app/rapid_router/test_rapid_router_core.py app/test_rapid_router_api_shell.py` -> `52 passed`; `cd frontend && npx vitest run src/pages/RapidRouter.test.tsx --reporter=dot` -> `5 passed`; `cd frontend && npm run build` -> success | |
| | D-230 | Rapid Router split shipping now clamps per-location assignment to ordered quantity, disables adding more locations once all units are assigned, and persists the optional `Configure IP passthrough` advanced task through backend normalization and output rendering | 2026-03-07 | `frontend/src/pages/RapidRouter.tsx`, `frontend/src/pages/RapidRouter.test.tsx`, `backend/app/rapid_router/core.py`, `backend/app/rapid_router/test_rapid_router_core.py`; `cd backend && .venv/bin/python -m pytest -q app/rapid_router/test_rapid_router_core.py app/test_rapid_router_api_shell.py` -> `52 passed`; `cd frontend && npx vitest run src/pages/RapidRouter.test.tsx --reporter=dot` -> `5 passed`; `cd frontend && npm run build` -> success | |
| | T-140 | P1 | Keep Rapid Router split-shipping allocations bounded to total cart quantity and add the optional advanced `Configure IP passthrough` task through submit/output paths | Engineering | DONE | If shipping allocation rules expand later, keep clamping at edit-time instead of allowing invalid temporary over-assignment states | `cd backend && .venv/bin/python -m pytest -q app/rapid_router/test_rapid_router_core.py app/test_rapid_router_api_shell.py` -> `52 passed`; `cd frontend && npx vitest run src/pages/RapidRouter.test.tsx --reporter=dot` -> `5 passed`; `cd frontend && npm run build` -> success | |
| | T-138 | P1 | Close the last local continuity gap on the former guarded-GPT Masters mention-lookup tail (`31`, `32`, `35`, `37`) and then validate that improvement in the next broader rerun | Engineering | DONE | Keep `_should_skip_masters_concept_preflight(...)` aligned with the explicit doc-lookup vocabulary so plural forms like `documents`, `docs`, `files`, and `sources` never reactivate concept fallback for file-title lookups | Focused slice stayed fast: `python3 backend/scripts/unified_kb_eval150.py --cases /tmp/mtk_focused_eval_slices/masters_slice_cases.json ...` -> `4/4 passed`, avg `7.13ms`, p95 `26.42ms`; exact `.env.codex` repros dropped from multi-second latency to fast-path timings (`31`: ~`2461ms` -> `28.83ms`, `32`: ~`2635ms` -> `26.95ms`); broader `31-40` rerun -> `10/10 passed`, avg `5.12ms`, Masters avg `4.19ms`, p95 `26.91ms` | |
| | T-139 | P1 | Finish trimming the remaining reusable router compare/render tail so the broader guarded suites can be rerun against a materially cleaner router latency baseline | Engineering | DONE | Keep the shared-query compare/antenna pattern in place unless a future rerun shows router regressions again; router is no longer the first broad-suite blocker | Focused slice after the shared-query pass: `python3 backend/scripts/unified_kb_eval150.py --cases /tmp/mtk_focused_eval_slices/router_slice_cases.json ...` -> `7/7 passed`, avg `145.25ms`, router-doc avg `328.76ms`, p95 `627.21ms`; broader `75` rerun in `docs/evals/20260308_guarded75_after_masters_fix/` kept router compare prompts materially lower (`42` ~`327ms`, `114` ~`661ms`) while overall suite finished `75/75`, `avg_latency_ms=28.81`, `p95_ms=55.15` | |
| | T-145 | P1 | Re-run the focused Rapid Router frontend validation chain once the local Codex unified-exec saturation is cleared so the new browse-first + BoBo/authorization flow has a clean frontend automated verification summary | Engineering | DONE | Keep future focused frontend reruns isolated so local exec saturation does not masquerade as product regressions | `cd backend && .venv/bin/python -m pytest -x -vv app/rapid_router/test_rapid_router_core.py` -> `28 passed`; `cd backend && .venv/bin/python -m pytest -q app/test_rapid_router_api_shell.py` -> `24 passed`; `cd frontend && npx vitest run src/pages/RapidRouter.test.tsx --reporter=dot` -> `3 passed`; `cd frontend && npm run build` -> success | |
| | T-137 | P1 | Eliminate the remaining guarded-GPT delegate latency tails now that the former Masters mention bucket and the main router compare bucket are no longer the broad-suite blocker | Engineering | IN PROGRESS | Chase the new dominant tails next: POTS playbook/provider prompts (`79/82/86`) first, then the smaller longer-form Masters content-pack cluster (`97/99/101`, `106/111/134`) plus the lingering BuSS-SKU doc prompt (`30`) | `docs/evals/20260308_guarded75_after_masters_fix/unified_kb_eval150_shards10_summary.json` -> `75/75`, `avg_latency_ms=28.81`, `p95_ms=55.15`, `p99_ms=327.53`, `stage_budget_exits=0`; `docs/evals/20260308_guarded150_after_masters_fix/unified_kb_eval150_shards10_summary.json` -> `150/150`, `avg_latency_ms=151.14`, `p95_ms=661.36`, `p99_ms=2969.52`, `stage_budget_exits=0`; remaining dominant outliers are `79/82/86` at ~`2.88s-4.36s`, with secondary longer-form Masters prompts `97/99/101`, `106/111/134` and case `30` (~`725ms`) | |
| | T-133 | P1 | Trim the residual deterministic/content-assembly latency that remains after the Masters mention and router compare fixes | Engineering | OPEN | Start with the old POTS playbook/provider bucket (`79/82/86`), then profile the longer-form Masters content-pack prompts before paying for another promotion-quality baseline rerun | `docs/evals/20260308_guarded150_after_masters_fix/unified_kb_eval150_shards10_summary.json` -> `150/150`, `p95_ms=661.36`, `p99_ms=2969.52`; dominant remaining tails: `79` `2879.88ms`, `82` `2969.52ms`, `86` `4359.05ms`; secondary longer-form Masters/content prompts: `97` `334.73ms`, `99` `639.50ms`, `101` `661.69ms`, `106` `669.12ms`, `111` `315.54ms`, `134` `773.01ms` | |
| | T-132 | P1 | Rerun broader guarded-GPT `75` and `150` evals against the new deterministic fast answers, exact/current blocked-case net, and the now-green `50`-case concept pack | Engineering | DONE | Use the rerun artifacts as comparison evidence, but keep `25`/`50` as the stable lightweight gates until the new `75`/`150` regressions in `T-133` are fixed | `docs/evals/20260307_010031_eval75_guarded_gpt_rerun/unified_kb_eval150_shards10_summary.json` -> `75 / 75 passed`; `docs/evals/20260307_010031_eval150_guarded_gpt_rerun/unified_kb_eval150_shards10_summary.json` -> `149 / 150 passed`; `bash backend/scripts/test_backend.sh --full` was previously green at `501 passed` | |
| | T-131 | P1 | Confirm hosted deployment env does not override the new `gpt-5-mini` repo defaults with stale `OPENAI_MODEL` or assistant-specific model pins | Engineering | OPEN | Inspect Hugging Face Space secrets/variables and any production deployment env to ensure `OPENAI_MODEL`, `UNIFIED_KB_OPENAI_MODEL`, and `ROUTER_RAG_OPENAI_MODEL` are unset or explicitly `gpt-5-mini`, then rerun one live assistant smoke check | Local repo/runtime verification is complete, but hosted env values are external to git and can still override repo defaults | |
| | T-130 | P1 | Standardize all active LLM-assisted backend/runtime defaults, env examples, and local repo env pins on `gpt-5-mini` | Engineering | DONE | Treat `gpt-5-mini` as the current default baseline and only revisit if a future model migration is deliberate and fully tested across backend + eval runners | `rg -n 'gpt-5\.2' README.md backend/app backend/scripts backend/.env.test.example .env.codex backend/.env.codex ...` -> no active runtime/config hits; `cd backend && .venv/bin/python -m pytest -q app/test_pots_conversation_regression.py -k 'concept_fallback_for_generic_pots_question or llm_synthesis_omits_temperature_for_gpt5_models'` -> `2 passed`; `bash backend/scripts/test_backend.sh --full` -> `478 passed`; `cd frontend && npx tsc -p tsconfig.json --noEmit` -> success; `cd frontend && npm run build` -> success; `cd frontend && npm run test` -> `31 files / 111 passed`; `docs/evals/20260306_230403_eval25_gpt5mini_default/unified_kb_eval150_shards10_summary.json` -> `25 / 25 passed` | |
| | T-129 | P2 | Stabilize the residual guarded-GPT dual-pathway explainer and rerun the reusable 25-case pack | Engineering | DONE | Keep the dual-pathway phrasing under regression watch as future concept-fallback work lands; no additional action is required for the current acceptance gate | `docs/evals/20260306_230403_eval25_gpt5mini_default/unified_kb_eval150_shards10_summary.json` -> `25 / 25 passed`, `failed_ids=[]` | |
| | T-128 | P1 | Create a reusable guarded-GPT acceptance pack with 25 questions split into 5-question shards and a dedicated shard runner | Engineering | DONE | Use this pack as the lightweight regression gate for future guarded-GPT changes; only expand it after stabilizing or replacing residual case `13` | `python3 - <<'PY' ... len(rows)==25 ... PY` -> success; `bash -n backend/scripts/run_unified_kb_eval25_guarded_gpt_chunks.sh` -> success; `docs/evals/20260307_001201_eval25_phase12/unified_kb_eval150_shards10_summary.json` -> `25 / 25 passed` (`100.0%`) | |
| | T-127 | P1 | Roll out the shared assistant-family concept fallback chain with allow/deny gates, `gpt-5-mini`, provenance labels, and GPT+web only when the model-only concept answer still needs refinement | Engineering | DONE | Expand deterministic internal concept fast answers for the highest-frequency telecom/router/POTS explainers so the new fallback is used less often for questions that can be answered cheaply from curated internal patterns | `cd backend && .venv/bin/python -m pytest -q app/test_assistant_fallback.py app/test_unified_kb_core.py app/test_router_rag_module.py app/test_masters_conversation_regression.py app/test_pots_conversation_regression.py app/test_chat_guidance_api.py app/test_knowledgebase_api.py` -> `202 passed`; `bash backend/scripts/test_backend.sh --full` -> `477 passed`; Router RAG smoke -> `10 passed`; `cd frontend && npx tsc -p tsconfig.json --noEmit` -> success; `cd frontend && npm run build` -> success; `cd frontend && npm run test` -> `31 files / 111 passed`; `docs/evals/latest_eval6_concept_check/unified_kb_eval150_shards10_summary.json` -> `6 / 6 passed` | |
| | T-126 | P1 | Redeploy the hosted Hugging Face app and rerun the live POTS provider-coverage Playwright spec so the local `MetTel` provider-card backfill is validated against the actual live site | Engineering | OPEN | Ship the current backend provider-card patch, wait for Hugging Face to rebuild, then rerun `cd frontend && npx playwright test e2e/pots.provider-coverage.spec.ts --config=playwright.config.ts` against the hosted base URL | Local fix coverage is green: `cd backend && .venv/bin/python -m pytest -q app/test_unified_kb_core.py -k 'provider_inventory_supplements_missing_pots_provider_cards_from_router_corpus or provider_inventory_backfills_missing_router_hint_paths_from_index_hits'` -> `2 passed`; `cd backend && .venv/bin/python -m pytest -q app/test_pots_provider_recall.py` -> `2 passed`; live suite remains `9 passed / 1 failed / 4 skipped` until redeploy | |
| | T-125 | P1 | Enforce the current UI-lock scan rules by removing dead collapsed banners, hiding default status/debug entry points, and eliminating duplicate primary CTAs where they still leak into the active viewport | Engineering | DONE | Keep future cleanup focused on dense admin-only surfaces and message-detail consistency, not reopening already-compliant shell patterns | `cd frontend && npx tsc -p tsconfig.json --noEmit` -> success; `cd frontend && npx vitest run src/components/AssistantWorkspace.test.tsx src/components/PromptCoach.test.tsx src/components/BrandHeader.test.tsx src/pages/RapidRouter.test.tsx --reporter=dot` -> `11 passed`; `cd frontend && npm run build` -> success; `cd frontend && npm run test` -> `30 files / 105 passed`; `git diff --check` -> success | |
| | T-124 | P1 | Lock the knowledge/chat family to one shared assistant shell, auto-collapse setup after the first user turn, and restyle the legacy assistant pages onto that pattern | Engineering | DONE | If assistant cleanup continues, unify the deeper response-detail treatments (`Why`, `Next action`, `Sources`, file panels) so all assistant answers share one internal message pattern as well | `cd frontend && npx tsc -p tsconfig.json --noEmit` -> success; `cd frontend && npx vitest run src/components/AssistantWorkspace.test.tsx src/components/PageArchetypes.test.tsx --reporter=dot` -> `4 passed`; `cd frontend && npm run build` -> success; `cd frontend && npm run test` -> `30 files / 105 passed` | |
| | T-123 | P1 | Rebuild `RapidRouter` as a staged scan-and-build commerce flow (`Filter`, `Browse`, `Quantity`, `Customer info`, `Review`) with a sticky cart and collapsed secondary tools | Engineering | DONE | Collapse any remaining late-stage advanced/admin helper clusters behind one secondary control so the new commerce sequence stays clean under heavy use | `cd frontend && npx tsc -p tsconfig.json --noEmit` -> success; `cd frontend && npx vitest run src/pages/RapidRouter.test.tsx --reporter=dot` -> `2 passed`; `cd frontend && npm run build` -> success; `cd frontend && npm run test` -> `29 files / 103 passed` | |
| | T-122 | P1 | Collapse Telco assumptions, what-if mode, diagnostics, quote helpers, scenario JSON/CSV, and assistant coaching into one shared `Advanced` drawer so the default calculator surface stays on the business flow | Engineering | DONE | Apply the same single-secondary-control rule to `RapidRouter`, which still exposes too many business and support surfaces in parallel | `cd frontend && npx tsc -p tsconfig.json --noEmit` -> success; `cd frontend && npx vitest run src/pages/TelcoCalculator.test.tsx --reporter=dot` -> `2 passed`; `cd frontend && npm run build` -> success; `cd frontend && npm run test` -> `28 files / 101 passed` | |
| | T-121 | P1 | Rebuild `TelcoCalculator` as a single-path step sequence with `Locations`, `Pricing`, `Results`, and `Export` instead of a simultaneous tri-column calculator surface | Engineering | DONE | Apply the same step-led simplification to `RapidRouter`, which still mixes catalog, helper, and order-prep surfaces in one view | `cd frontend && npx tsc -p tsconfig.json --noEmit` -> success; `cd frontend && npx vitest run src/pages/TelcoCalculator.test.tsx --reporter=dot` -> `1 passed`; `cd frontend && npm run build` -> success; `cd frontend && npm run test` -> `28 files / 100 passed` | |
| | T-120 | P1 | Replace paragraph-style POTS instructions with a stable three-line step guide so each step only says what it does, what is needed now, and what happens next | Engineering | DONE | Reuse the same guide pattern in the POTS project drawer and any later summary/export surfaces that still read as prose-heavy | `cd frontend && npx tsc -p tsconfig.json --noEmit` -> success; `cd frontend && npx vitest run src/pages/PotsEstimateIntake.test.tsx src/pages/PotsSavingsEstimator.test.tsx src/pages/PotsIntake.test.tsx src/pages/PotsWorkspace.test.tsx --reporter=dot` -> `23 passed`; `cd frontend && npm run build` -> success; `cd frontend && npm run test` -> `27 files / 99 passed` | |
| | T-119 | P1 | Flatten the embedded `PotsEstimateIntake` wrapper so the estimator and intake inherit a lighter host shell instead of stacking full cards inside cards | Engineering | DONE | If more simplification is still needed, flatten the later review/export sections inside `PotsIntake` rather than adding more wrapper-level chrome | `cd frontend && npx tsc -p tsconfig.json --noEmit` -> success; `cd frontend && npx vitest run src/pages/PotsEstimateIntake.test.tsx src/pages/PotsSavingsEstimator.test.tsx src/pages/PotsIntake.test.tsx src/pages/PotsWorkspace.test.tsx --reporter=dot` -> `23 passed`; `cd frontend && npm run build` -> success; `cd frontend && npm run test` -> `27 files / 99 passed` | |
| | T-118 | P1 | Convert `PotsWorkspace` routing questions into a one-question-at-a-time conversation with answer cards and compact `Why this matters` disclosure | Engineering | DONE | Reuse the same guided-question pattern in other dense decision forms if later UI lock passes show similar cognitive overload | `cd frontend && npx tsc -p tsconfig.json --noEmit` -> success; `cd frontend && npx vitest run src/pages/PotsWorkspace.test.tsx --reporter=dot` -> `10 passed`; `cd frontend && npm run build` -> success; `cd frontend && npm run test` -> `27 files / 99 passed` | |
| | T-117 | P1 | Move active-project creation and saved-project switching behind the `Project tools` drawer so `PotsWorkspace` stops showing setup UI in the main wizard by default | Engineering | DONE | Apply the same hide-setup-behind-drawer rule in other dense workflows such as `RapidRouter` and `TelcoCalculator` where setup/admin surfaces still compete with the active task | `cd frontend && npx tsc -p tsconfig.json --noEmit` -> success; `cd frontend && npx vitest run src/pages/PotsWorkspace.test.tsx --reporter=dot` -> `9 passed`; `cd frontend && npm run build` -> success; `cd frontend && npm run test` -> `27 files / 98 passed` | |
| | T-116 | P1 | Convert `PotsWorkspace` into a true wizard shell with one active step and one optional utility drawer | Engineering | DONE | Apply the same step-led shell discipline to `RapidRouter` and `TelcoCalculator` if the UI lock continues beyond POTS | `cd frontend && npx tsc -p tsconfig.json --noEmit` -> success; `cd frontend && npx vitest run src/pages/PotsWorkspace.test.tsx --reporter=dot` -> `8 passed`; `cd frontend && npm run build` -> success; `cd frontend && npm run test` -> `27 files / 97 passed`; `git diff --check` -> success | |
| | T-115 | P1 | Enforce one obvious primary action per screen so setup, reset, export, and support controls stop competing with the current forward move | Engineering | DONE | Continue the CTA-hierarchy pass in `RapidRouter`, `TelcoCalculator`, and the assistant-family export/help clusters where multiple strong actions still share one viewport | `cd frontend && npx tsc -p tsconfig.json --noEmit` -> success; `cd frontend && npx vitest run src/pages/PotsEstimateIntake.test.tsx src/pages/PotsSavingsEstimator.test.tsx src/pages/PotsWorkspace.test.tsx --reporter=dot` -> `16 passed`; `cd frontend && npm run build` -> success; `cd frontend && npm run test` -> `27 files / 97 passed`; `git diff --check` -> success | |
| | T-114 | P1 | Tighten and standardize the radius system so major shells use 20px, secondary surfaces use 16px, controls use 12px, and full-pill styling is reserved for chips | Engineering | DONE | Continue the same radius cleanup through `RapidRouter`, `TelcoCalculator`, `CommandPalette`, and any remaining legacy modal/helper surfaces that still overuse `rounded-2xl` | `cd frontend && npx tsc -p tsconfig.json --noEmit` -> success; `cd frontend && npx vitest run src/components/PrimaryNavigation.test.tsx src/components/FloatingRouterHelper.test.tsx src/components/PromptCoach.test.tsx src/components/chat/ChatTranscript.test.tsx src/pages/PotsSavingsEstimator.test.tsx src/pages/PotsEstimateIntake.test.tsx src/pages/PotsIntake.test.tsx src/pages/PotsWorkspace.test.tsx --reporter=dot` -> `34 passed`; `cd frontend && npm run build` -> success; `cd frontend && npm run test` -> `27 files / 97 passed`; `git diff --check` -> success | |
| | T-113 | P1 | Replace border-heavy card stacking with the locked three-surface whitespace hierarchy in the shared shell and active POTS flow | Engineering | DONE | Continue the same whitespace-hierarchy cleanup in `TelcoCalculator`, `RapidRouter`, and the still-denser late-step surfaces in `PotsIntake` if the UI lock continues | `cd frontend && npx tsc -p tsconfig.json --noEmit` -> success; `cd frontend && npx vitest run src/components/PageArchetypes.test.tsx src/pages/PotsWorkspace.test.tsx src/pages/PotsSavingsEstimator.test.tsx src/pages/PotsEstimateIntake.test.tsx src/pages/PotsIntake.test.tsx --reporter=dot` -> `23 passed`; `cd frontend && npm run build` -> success; `cd frontend && npm run test` -> `27 files / 97 passed`; `git diff --check` -> success | |
| | T-112 | P1 | Reduce badge and label noise in the shared shell, POTS flow, and assistant-family pages so metadata stops competing with primary actions | Engineering | DONE | If the badge-noise pass continues, target the remaining denser local-state surfaces next: `TelcoCalculator`, `RapidRouter`, and the still-busy parts of `PotsIntake` | `cd frontend && npx tsc -p tsconfig.json --noEmit` -> success; `cd frontend && npx vitest run src/components/FloatingRouterHelper.test.tsx src/components/PageArchetypes.test.tsx src/pages/PotsWorkspace.test.tsx src/pages/PotsSavingsEstimator.test.tsx --reporter=dot` -> `19 passed`; `cd frontend && npm run build` -> success; `cd frontend && npm run test` -> `27 files / 97 passed` | |
| | T-111 | P1 | Lock the shared typography system so the shell uses Public Sans, a slightly larger reading scale, and uppercase only for true metadata | Engineering | DONE | Continue the UI lock by applying the same shared typography utilities opportunistically to any remaining dense admin/reporting surfaces as later layout passes touch them | `cd frontend && npx tsc -p tsconfig.json --noEmit` -> success; `cd frontend && npx vitest run src/components/BrandHeader.test.tsx src/components/PrimaryNavigation.test.tsx src/components/PageArchetypes.test.tsx src/components/FloatingRouterHelper.test.tsx src/pages/PotsWorkspace.test.tsx src/pages/PotsSavingsEstimator.test.tsx src/pages/PotsEstimateIntake.test.tsx src/pages/PotsIntake.test.tsx --reporter=dot` -> `36 passed`; `cd frontend && npm run build` -> success; `cd frontend && npm run test` -> `27 files / 97 passed`; typography-scope grep confirmed no remaining `uppercase` classes in the shared shell + active assistant/POTS lock scope | |
| | T-110 | P1 | Lock the shared UI color system so navy is primary, slate is structural, green is success/live, amber is caution, and red is reserved for destructive/error emphasis | Engineering | DONE | Continue the UI lock by converting any remaining page-local legacy color classes onto the shared token system as the next visual recommendations are implemented | `cd frontend && npx tsc -p tsconfig.json --noEmit` -> success; `cd frontend && npx vitest run src/components/BrandHeader.test.tsx src/components/PrimaryNavigation.test.tsx src/components/PageArchetypes.test.tsx src/components/FloatingRouterHelper.test.tsx --reporter=dot` -> `15 passed`; `cd frontend && npm run build` -> success; `cd frontend && npm run test` -> `27 files / 97 passed`; shared shell grep confirmed no remaining `#EE0000`, `#1a2b56`, `#243869`, or old blue helper classes in `frontend/src/components` / `frontend/src/index.css` | |
| | T-109 | P1 | Define and apply four shared page archetype shells (`Workspace`, `Calculator`, `Catalog`, `Assistant`) so the main tabs stop mixing layout patterns | Engineering | DONE | Extend the same shared shells to the remaining assistant-class tabs (`RouterKnowledgebase`, `RoutersAssistant`, `MastersAI`, `PotsAssistant`) and then decide whether any shell-specific cleanup is still needed per page | `cd frontend && npx tsc -p tsconfig.json --noEmit` -> success; `cd frontend && npx vitest run src/components/PageArchetypes.test.tsx src/components/BrandHeader.test.tsx src/pages/PotsWorkspace.test.tsx --reporter=dot` -> `15 passed`; `cd frontend && npm run build` -> success; `cd frontend && npm run test` -> `27 files / 97 passed`; local desktop/mobile browser spot-check confirmed POTS, Telco, Rapid Router, and Knowledgebase render the expected archetype shells | |
| | T-108 | P1 | Consolidate floating support and helper controls into one shared help launcher with internal tabs | Engineering | DONE | Continue the UI lock by reviewing any remaining duplicated top-level utility affordances and deciding whether command-palette/status visibility should stay as-is or be simplified further | `cd frontend && npx tsc -p tsconfig.json --noEmit` -> success; `cd frontend && npx vitest run src/components/FloatingRouterHelper.test.tsx src/components/PrimaryNavigation.test.tsx src/components/BrandHeader.test.tsx --reporter=dot` -> `12 passed`; `cd frontend && npm run build` -> success; `cd frontend && npm run test` -> `26 files / 94 passed`; local desktop/mobile browser spot-check confirmed one floating `Help` launcher with `Assist` and `Support` tabs | |
| | T-107 | P1 | Remove emoji-style workspace cues and standardize the shell on restrained enterprise navigation icons | Engineering | DONE | Continue the UI lock by consolidating the duplicate floating global launchers so the cleaner shell is not undercut by competing bottom-of-screen controls | `cd frontend && npx tsc -p tsconfig.json --noEmit` -> success; `cd frontend && npx vitest run src/components/PrimaryNavigation.test.tsx src/components/BrandHeader.test.tsx --reporter=dot` -> `8 passed`; `cd frontend && npm run build` -> success; `cd frontend && npm run test` -> `25 files / 90 passed`; local desktop/mobile browser spot-check confirmed the workspace shell no longer exposes `🧠 📚 🧮 📉 📡 ⚡` text | |
| | T-106 | P1 | Replace the old toolbox interaction with a true primary navigation system: visible desktop workspace rail, mobile workspace sheet, and integrated workspace search | Engineering | DONE | Continue the UI lock by simplifying the remaining persistent global controls, starting with the duplicate bottom launchers (`Get support`, `Open router helper`) | `cd frontend && npx tsc -p tsconfig.json --noEmit` -> success; `cd frontend && npx vitest run src/components/PrimaryNavigation.test.tsx src/components/BrandHeader.test.tsx --reporter=dot` -> `7 passed`; `cd frontend && npm run build` -> success; `cd frontend && npm run test` -> `25 files / 89 passed`; local desktop/mobile browser spot-check confirmed the old `Open toolbox` / `Toolbox is collapsed` copy is gone | |
| | T-105 | P1 | Collapse the global shell into one compact utility header and remove always-visible toolbox chrome | Engineering | DONE | Treat the compact header as the locked baseline for the new primary navigation shell; no separate follow-up remains beyond the broader UI lock work | `cd frontend && npx tsc -p tsconfig.json --noEmit` -> success; `cd frontend && npx vitest run src/components/BrandHeader.test.tsx --reporter=dot` -> `4 passed`; `cd frontend && npm run build` -> success | |
| | T-104 | P1 | Redeploy the Hugging Face hosted frontend so hosted POTS QA runs against the latest simplified workspace/intake implementation instead of the stale stacked build | Engineering | IN_PROGRESS | Trigger/reconfirm the Space rebuild, then rerun the hosted POTS desktop/mobile sign-off pass against the refreshed deployment | Hosted/Auth0 check on 2026-03-06: `cd frontend && npx playwright test e2e/auth.full-flow.spec.ts --config=playwright.config.ts --reporter=line` -> `1 passed`; hosted POTS desktop/mobile inspection -> `0/2 sign-off passes` because both viewports still rendered the older stacked POTS workspace with `POTS Project Workspace`, `POTS Estimates + Intake`, and `POTS Savings Estimator` on one page | |
| | T-103 | P1 | Sweep remaining frontend destructive actions so saved drafts, chat resets, and scoped removals all require confirmation before data loss | Engineering | DONE | If final sign-off needs it, spot-check representative destructive flows in the hosted/authenticated runtime; local/frontend verification is complete | `cd frontend && npx tsc -p tsconfig.json --noEmit` -> success; `cd frontend && npx vitest run src/utils/chatCommands.test.ts src/utils/confirmAction.test.ts src/pages/PotsSavingsEstimator.test.tsx src/pages/PotsEstimateIntake.test.tsx src/pages/PotsIntake.test.tsx src/pages/PotsWorkspace.test.tsx --reporter=dot` -> `27 passed`; `cd frontend && npm run build` -> success; `cd frontend && npm run test` -> `24 files / 86 passed` | |
| | T-102 | P1 | Simplify `PotsWorkspace` into a progressive, one-step-at-a-time surface with collapsed secondary tools | Engineering | DONE | If final deploy sign-off needs it, repeat the same flow against the hosted/authenticated runtime; no more default-open changes are pending from the local browser pass | `cd frontend && npx vitest run src/pages/PotsWorkspace.test.tsx --reporter=dot` -> `7 passed`; `cd frontend && npm run build` -> success; `cd frontend && npm run test` -> `23 files / 79 passed`; local browser QA at `1440x1024` + `390x844` confirmed the support panels now behave as a true accordion | |
| | T-101 | P1 | Simplify the active POTS estimate/intake UX with progressive disclosure and fewer always-open support panels | Engineering | DONE | Treat hosted/authenticated browser QA as optional final sign-off only; local browser QA did not justify opening any additional intake disclosures by default | `cd frontend && npx vitest run src/pages/PotsIntake.test.tsx src/pages/PotsEstimateIntake.test.tsx --reporter=dot` -> `6 passed`; `cd frontend && npm run build` -> success; `cd frontend && npm run test` -> `23 files / 79 passed`; local browser QA at `1440x1024` + `390x844` confirmed `See all sites`, optional notes, and helper disclosures can stay closed by default | |
| | T-100 | P1 | Clarify POTS estimator start-path choices and seed intake according to the chosen entry mode | Engineering | DONE | Run hosted/manual QA on all three chooser paths (`quick estimate`, `totals now, site details next`, `site-by-site now`) and confirm the seeded intake drafts feel obvious under real auth/runtime conditions | `cd frontend && npx vitest run src/pages/PotsSavingsEstimator.test.tsx src/pages/PotsEstimateIntake.test.tsx --reporter=dot` -> `5 passed`; `cd frontend && npm run build` -> success; `cd frontend && npm run test` -> `22 files / 72 passed` | |
| | T-099 | P1 | Add clear project deletion flow to POTS workspace with confirmation pop-up | Engineering | DONE | Use the selector delete action during hosted/manual QA and note any copy/layout polish or additional destructive-action confirmation gaps elsewhere in the app | `cd frontend && npx vitest run src/pages/PotsWorkspace.test.tsx --reporter=dot` -> `5 passed`; `cd frontend && npm run build` -> success; `cd frontend && npm run test` -> `20 files / 67 passed`; `python3 -m pytest -q backend/app/test_pots_workspace_api.py` -> `46 passed`; `python3 -m pytest -q backend/app/test_pots_workspace_api.py backend/app/test_pots_response_contract.py backend/app/test_pots_conversation_regression.py` -> `51 passed` | |
| | T-098 | P1 | Expose phase-9-24 POTS workspace workflow controls in the SPA for manual/hosted verification | Engineering | DONE | Use the new workflow panel for credentialed hosted/browser QA, export review, and responsive checks; continue tracking any remaining phase-25+ surface gaps separately | `cd frontend && npx vitest run src/pages/PotsWorkspace.test.tsx --reporter=dot` -> `3 passed`; `cd frontend && npm run build` -> success; `cd frontend && npm run test` -> `20 files / 65 passed`; `python3 -m pytest -q backend/app/test_pots_workspace_api.py` -> `45 passed` | |
| | T-097 | P1 | Deep-dive Phase 9-40 workflow logic for gotchas and add detailed edge-case regression coverage | Engineering | DONE | Continue hosted manual UX verification for phase-9+ workflow controls and export review with pilot users | `python3 -m pytest -q backend/app/test_pots_workspace_api.py -k "remove_last_location_resets_project_counts or excel_export_has_required_tabs"` -> `2 passed`; `python3 -m pytest -q backend/app/test_pots_workspace_api.py backend/app/test_pots_response_contract.py backend/app/test_pots_conversation_regression.py` -> `50 passed`; `cd backend && python3 -m pytest -q` -> `435 passed`; `npm --prefix frontend run build` -> success; `npm --prefix frontend run test` -> `19 files / 62 passed` | |
| | T-096 | P1 | Execute Phases 9-40 of POTS workspace roadmap in strict order (guided discovery through launch optimization) | Engineering | DONE | Core backend roadmap and initial SPA workflow surface are complete; next step is hosted/manual UX pass plus credentialed browser journeys using the new `PotsWorkspace` controls | Per-phase gate: `for p in $(seq 9 40); do python3 -m pytest -q backend/app/test_pots_workspace_api.py -k "phase${p}"; done` -> each selector `1 passed`; consolidated: `python3 -m pytest -q backend/app/test_pots_workspace_api.py` -> `43 passed`; `python3 -m pytest -q backend/app/test_pots_workspace_api.py backend/app/test_pots_response_contract.py backend/app/test_pots_conversation_regression.py` -> `48 passed`; `npm --prefix frontend run build` -> success; `npm --prefix frontend run test` -> `19 files / 62 passed` | |
| | T-095 | P1 | Execute Phase 8 of 40-phase POTS roadmap: audit log v1 | Engineering | DONE | Start Phase 9 guided-discovery question tree implementation | `python3 -m pytest -q backend/app/test_pots_workspace_api.py` -> `11 passed`; `python3 -m pytest -q backend/app/test_pots_workspace_api.py backend/app/test_pots_response_contract.py backend/app/test_pots_conversation_regression.py` -> `16 passed` | |
| | T-094 | P1 | Execute Phase 7 of 40-phase POTS roadmap: delegation skeleton (internal section assignment) | Engineering | DONE | Start Phase 8 audit log v1 for immutable activity timeline | `python3 -m pytest -q backend/app/test_pots_workspace_api.py` -> `10 passed`; `python3 -m pytest -q backend/app/test_pots_workspace_api.py backend/app/test_pots_response_contract.py backend/app/test_pots_conversation_regression.py` -> `15 passed` | |
| | T-093 | P1 | Execute Phase 6 of 40-phase POTS roadmap: intake progress model and completion scoring | Engineering | DONE | Start Phase 7 delegation skeleton (internal assignment ownership) | `python3 -m pytest -q backend/app/test_pots_workspace_api.py` -> `9 passed`; `python3 -m pytest -q backend/app/test_pots_workspace_api.py backend/app/test_pots_response_contract.py backend/app/test_pots_conversation_regression.py` -> `14 passed` | |
| | T-092 | P1 | Execute Phase 5 of 40-phase POTS roadmap: workspace-home UX (mode-first start + next action guidance) | Engineering | IN_REVIEW | Complete manual desktop/tablet/mobile QA checklist in `docs/dev/pots_workspace_phase5_home_ux.md` and close remaining layout nits | `npm --prefix frontend run build` -> success; `npm --prefix frontend run test` -> `19 files / 62 passed` | |
| | T-091 | P1 | Execute Phase 4 of 40-phase POTS roadmap: tenant/user isolation hardening | Engineering | DONE | Start Phase 5 workspace-home UX refinement and next-action card design | `python3 -m pytest -q backend/app/test_pots_workspace_api.py` -> `8 passed`; `python3 -m pytest -q backend/app/test_pots_workspace_api.py backend/app/test_pots_response_contract.py backend/app/test_pots_conversation_regression.py` -> `13 passed` | |
| | T-090 | P1 | Execute Phase 3 of 40-phase POTS roadmap: lifecycle state machine with guarded transitions | Engineering | DONE | Start Phase 4 tenant/user isolation hardening and fallback handling checks | `python3 -m pytest -q backend/app/test_pots_workspace_api.py` -> `7 passed`; `python3 -m pytest -q backend/app/test_pots_response_contract.py backend/app/test_pots_conversation_regression.py` -> `5 passed` | |
| | T-089 | P1 | Execute Phase 2 of 40-phase POTS roadmap: role/collaboration model (internal-first) | Engineering | DONE | Use published role matrix to drive delegation/audit implementation phases; external customer contribution remains deferred | `rg -n "Role Matrix\|Collaboration Boundaries\|External Contribution Decision\|\\[x\\]" docs/dev/pots_workspace_phase2_roles_collaboration.md` -> required sections/checklist confirmed | |
| | T-088 | P1 | Execute Phase 1 of the new 40-phase POTS workspace roadmap (scoped project API + triage + workspace shell) | Engineering | DONE | Begin Phase 2 role/collaboration model and external-contribution boundary decisioning (kept deferred from Phase 1 per user direction) | `python3 -m pytest -q backend/app/test_pots_workspace_api.py backend/app/test_pots_response_contract.py backend/app/test_pots_conversation_regression.py` -> `9 passed`; `npm --prefix frontend run build` -> success; `npm --prefix frontend run test` -> `19 files / 62 passed` | |
| | T-087 | P1 | Publish detailed 40-phase project map for enterprise POTS workspace execution | Engineering | DONE | Use this map as execution baseline; track phase-by-phase completion status in this task table and session handoff | `docs/dev/pots_workspace_40_phase_project_map.md` created with phase definitions, verification gates, and exit criteria | |
| | T-086 | P1 | Execute saved cross-workstream gameplan for remaining fixes/enhancements (phased next-thread plan) | Engineering | IN_PROGRESS | Keep parser backlog item deferred; auth blocker is cleared; next focus is broader hosted/manual sign-off items plus optional `150` stability push from `94.7%` to `>=95%` | Phase 0 auth refresh: `cd frontend && npx vitest run src/auth/config.test.ts src/auth/errorUtils.test.ts src/components/HealthStatusModal.test.tsx` -> `16 passed`; `python3 -m pytest -q backend/app/test_auth.py backend/app/test_startup_rate_limit.py` -> `31 passed`; `cd frontend && npx playwright test e2e/auth.spec.ts --reporter=line` -> `6 passed`; `cd frontend && npx playwright test e2e/auth.full-flow.spec.ts --reporter=line` -> `1 passed`; Phase 1 gate run: frontend build success + frontend tests `59 passed` + Rapid Router/API pytest `49 passed`; Phase 2 gate re-run: frontend build success + frontend tests `59 passed` + consolidation pytest `68 passed`; Phase 3 gate run: `150` `142/150` (`94.7%`) failed `[24,36,88,98,99,104,112,129]` (`docs/evals/20260305T013817_phase3_gate150_final/unified_kb_eval150_shards10_summary.json`), `75` `74/75` (`98.7%`) failed `[3]` (`docs/evals/20260305T015614_phase3_gate75_final/unified_kb_eval150_shards10_summary.json`), `50` `50/50` (`100.0%`) failed `[]` (`docs/evals/20260305T020530_phase3_gate50_final/unified_kb_eval150_shards10_summary.json`); additional `150` target attempt `141/150` (`94.0%`) failed `[48,55,78,89,99,107,110,112,118]` (`docs/evals/20260305T021154_phase3_gate150_rerun2_final/unified_kb_eval150_shards10_summary.json`); Phase 4 gate run: `python3 -m pytest -q backend/app/test_unified_kb_core.py backend/app/test_knowledgebase_api.py backend/app/rapid_router/test_rapid_router_core.py backend/app/test_rapid_router_api_shell.py` -> `151 passed`; Phase 5 targeted runs: `cd backend && python3 -m pytest -q app/test_unified_kb_core.py app/test_pots_conversation_regression.py app/test_unified_kb_eval150_script.py` -> `102 passed` | |
| | T-085 | P1 | Validate new Smart Profile + Customer Memory + resume/repeat flows across KB, POTS, and Rapid Router in hosted runtime | Engineering | IN_REVIEW | Seed `frontend/.env.e2e` or `frontend/.env.e2e.local` with two Auth0 test users, then run `cd frontend && npx playwright test e2e/rapid-router.memory-isolation.spec.ts` plus the manual same-browser two-user swap: (1) save/apply Rapid Router profile as user A, (2) log out/in as user B and confirm no customer-profile carryover appears, (3) switch back to user A and confirm scoped memory is still present, (4) repeat KB/POTS handoff checks | `npm --prefix frontend run build` -> success; `cd frontend && npx vitest run src/utils/customerMemory.test.ts --pool=threads --maxWorkers=1` -> `4 passed`; `cd frontend && npx playwright test e2e/rapid-router.memory-isolation.spec.ts --list` -> `1 test listed`; `cd frontend && npx vitest run src/components/BrandHeader.test.tsx --pool=threads --maxWorkers=1` -> `4 passed` | |
| | T-084 | P1 | Validate header Slack-chip responsiveness and spacing with command/status toggles on narrow widths | Engineering | IN_REVIEW | Manually check header controls at mobile/tablet/desktop and with command-palette/system-status hidden; ensure Slack chip remains accessible without wrapping collisions | `npm --prefix frontend run build` -> success; `cd frontend && npx vitest run src/components/BrandHeader.test.tsx --pool=threads --maxWorkers=1` -> `4 passed` | |
| | T-083 | P1 | Validate global floating support launcher UX in hosted runtime (desktop + mobile overlap with router helper) | Engineering | IN_REVIEW | Capture hosted screenshots and confirm Slack/email/phone links open correctly from each tab; tune spacing/z-index if mobile overlaps with bottom-page controls | `npm --prefix frontend run build` -> success; manual hosted check from all enabled tabs | |
| | T-082 | P1 | Validate hosted UX for new Rapid Router split-shipping flow (single-model orders only) across desktop/tablet/mobile | Engineering | IN_REVIEW | Run manual hosted pass covering: default single-address flow, enabling split shipping on one selected model, cap enforcement (`locations <= qty`), and mixed-model disabled state; capture screenshots and any copy/layout nits | `npm --prefix frontend run build` -> success; `python3 -m pytest -q backend/app/rapid_router/test_rapid_router_core.py` -> `25 passed`; `python3 -m pytest -q backend/app/test_rapid_router_api_shell.py` -> `24 passed` | |
| | T-080 | P0 | Remove legacy `masters-toolkit-api` audience assumptions from auth-required deployments and verify hosted login without a custom API audience | Engineering | DONE | Legacy audience placeholder is now ignored in active auth code and hosted login passed with credentialed Playwright runs; keep deployment env clean by leaving `VITE_AUTH0_AUDIENCE` / `AUTH0_AUDIENCE` unset unless a real Auth0 API Identifier is introduced later | `cd frontend && npx vitest run src/auth/config.test.ts src/auth/errorUtils.test.ts src/components/HealthStatusModal.test.tsx` -> `16 passed`; `python3 -m pytest -q backend/app/test_auth.py backend/app/test_startup_rate_limit.py` -> `31 passed`; `npm --prefix frontend run build` -> success; `cd frontend && npx playwright test e2e/auth.spec.ts --reporter=line` -> `6 passed`; `cd frontend && npx playwright test e2e/auth.full-flow.spec.ts --reporter=line` -> `1 passed` | |
| | T-081 | P1 | Fill missing Crown (`ASKNCM1100E`) WAN/LAN detail fields in deterministic router fact CSV for cleaner Dragon-vs-Crown compares | Engineering | DONE | Added source-backed Crown interface counts to deterministic CSV and covered fast-path behavior with regression assertions | `python3 -m pytest -q backend/app/test_unified_kb_core.py backend/app/test_knowledgebase_api.py backend/app/rapid_router/test_rapid_router_core.py backend/app/test_rapid_router_api_shell.py` -> `151 passed` | |
| | T-079 | P1 | Recover mixed-domain shard regressions from OpenAI 2026-02-27 run (`150` suite failed IDs) | Engineering | IN_REVIEW | Latest best Phase-3 `150` gate remains `142/150` (`94.7%`) with failed IDs `[24,36,88,98,99,104,112,129]`; additional rerun showed semantic variance (`141/150`, failed `[48,55,78,89,99,107,110,112,118]`); focus next on masters/pots long-form semantic stability to hold `>=95%` target | Re-run `cd backend && CHUNK_SIZE=15 START_ID=1 END_ID=150 SEMANTIC_POLICY=all OUT_DIR=../docs/evals/<stamp> CASES_PATH=../docs/evals/unified_kb_eval150_cases.json ./scripts/run_unified_kb_eval150_chunks.sh`; artifacts: `docs/evals/20260305T013817_phase3_gate150_final/` and `docs/evals/20260305T021154_phase3_gate150_rerun2_final/`; maintain `>=92%`, target `>=95%` | |
| | T-078 | P1 | Raise router-helper quality for generated 50-question conceptual set (currently `23/50`) | Engineering | DONE | Completed targeted conceptual-intent/routing fixes in `backend/app/knowledgebase/core.py`; raised generated-50 suite from `23/50` to `47/50` (`94.0%`) with no stage-budget exits in latest run | `cd backend && CHUNK_SIZE=5 START_ID=1 END_ID=50 SEMANTIC_POLICY=all OUT_DIR=../docs/evals/shards10_eval50_openai_all_20260227_fix7_full CASES_PATH=../docs/evals/unified_kb_eval50_new_questions_router_helper_cases.json ./scripts/run_unified_kb_eval150_chunks.sh` | |
| | T-077 | P1 | Consolidate Routers tab capabilities into Master’s Telecom AI Knowledgebase as single source tab (no duplicate tool surfaces) | Engineering | IN_REVIEW | Hosted parity sign-off remains: capture credentialed runtime proof for KB-first router journey and final tab-retirement readiness notes | `npm --prefix frontend run build` -> success; `npm --prefix frontend run test` -> `19 files / 59 passed`; `python3 -m pytest -q backend/app/test_knowledgebase_api.py backend/app/routers/router_tab_smoke_test.py backend/app/test_tab_final_pass_matrix.py backend/app/test_pots_response_contract.py backend/app/test_pots_conversation_regression.py` -> `68 passed` | |
| | T-076 | P1 | Merge `POTS Savings Estimator` + `POTS Replacement Intake` into one guided tab with estimator-to-intake handoff | Engineering | IN_REVIEW | Hosted guided-flow sign-off remains: run credentialed journey for estimator->intake carryover and confirm user-facing prefill clarity | `npm --prefix frontend run build` -> success; `npm --prefix frontend run test` -> `19 files / 59 passed`; `python3 -m pytest -q backend/app/test_knowledgebase_api.py backend/app/routers/router_tab_smoke_test.py backend/app/test_tab_final_pass_matrix.py backend/app/test_pots_response_contract.py backend/app/test_pots_conversation_regression.py` -> `68 passed` | |
| | T-075 | P1 | Run credentialed hosted browser E2E for full tab journeys (real page-to-page progression) | Engineering | IN_REVIEW | Auth-only hosted runs are green; next expand into the new POTS workspace workflow panel plus existing KB/Rapid Router flows and capture screenshots/artifacts | `cd frontend && npx playwright test e2e/auth.spec.ts --reporter=line` -> `6 passed`; `cd frontend && npx playwright test e2e/auth.full-flow.spec.ts --reporter=line` -> `1 passed` | |
| | T-074 | P1 | Implement non-Rapid tab UI polish pack from cross-tab advisory review | Engineering | IN_REVIEW | Phase-1 quick wins and automated deep-dive visual QA are complete (`21` viewport-tab runs, `0` visual issues); execute remaining phase-2/phase-3 structural interactions | `npm --prefix frontend run build` -> success; `npm --prefix frontend run test` -> `18 files / 54 tests passed`; visual audit `frontend/frontend/tmp/visual_audit/visual_audit_results.json` shows `failedRuns=0`, `totalVisualIssues=0` | |
| | T-073 | P1 | Simplify helper comparison-table UX to table-first output with clearer CTA | Engineering | DONE | Added table-detection/simplification in global helper and bypassed long-answer preview/details for table responses; aligned CTA wording across helper table renderers and published checkpoint commit | `npm --prefix frontend run build` -> success; commit `1014b78`; `git push origin main` + `git push hf-fourtab main` | |
| | T-072 | P1 | Publish router-ingestion checkpoint to required remotes | Engineering | DONE | Committed and pushed current router RAG mapping/report/doc updates to both required remotes | commit `8050c76`; `git push origin main` -> `21c3962..8050c76`; `git push hf-fourtab main` -> `21c3962..8050c76` | |
| | T-071 | P1 | Ingest new router knowledgebase corpus batch (EX400, RX400, ER815, IR624, Balance 310X) | Engineering | DONE | Added deterministic intake mappings and executed full intake pipeline on staged batch source; verified manifest/chunk inclusion and recall smoke pass | `bash backend/scripts/router_rag_intake_pipeline.sh ../tmp/router_rag_intake_2026-02-27_batch` -> `included=7`, `skipped=0`; `python3 backend/scripts/router_rag_smoke.py --query ...` -> `5 queries, 0 failures` | |
| | T-068 | P1 | Add basic CAPTCHA gate before Rapid Router order submit and first Knowledgebase/POTS/helper request | Engineering | DONE | Completed backend challenge/verify APIs + scoped token enforcement + frontend one-time gate cards + regression coverage | `python3 -m pytest -q backend/app/test_rapid_router_api_shell.py backend/app/test_knowledgebase_api.py backend/app/test_chat_guidance_api.py backend/app/rapid_router/test_rapid_router_core.py` -> `57 passed`; `npm --prefix frontend run build` -> success | |
| | T-059 | P1 | Add Rapid Router CSV ingestion validator + dry-run import path (schema/lint + duplicate/SKU checks + preview) | Engineering | DONE | Completed core CSV validator + duplicate checks + dry-run/apply path + admin API endpoint + regression tests | `python3 -m pytest -q backend/app/rapid_router/test_rapid_router_core.py backend/app/test_rapid_router_api_shell.py` -> `39 passed` | |
| | T-067 | P1 | Execute Rapid Router 10-point readability/simplicity cleanup pass | Engineering | IN_REVIEW | Core 3-phase refactor is implemented in `RapidRouter.tsx`; run desktop/mobile visual QA and capture any spacing/copy nits before marking done | `npm --prefix frontend run build` -> success; visual QA checklist pass for step header, staged actions, fix-list-only validation, helper readability, and admin modal flow | |
| | T-069 | P1 | Implement user-requested 12-point Rapid Router + global UI visibility overhaul | Engineering | IN_REVIEW | Deep-dive compliance pass applied final cleanup patches (remove leftover column-focus/copy controls and unify helper compare label to `Device details`); perform hosted browser QA/deploy smoke | `npm --prefix frontend run build` -> success; `python3 -m pytest -q backend/app/rapid_router/test_rapid_router_core.py backend/app/test_rapid_router_api_shell.py` -> `45 passed, 9 warnings`; `python3 -m pytest -q backend/app/test_unified_kb_core.py backend/app/test_knowledgebase_api.py` -> `88 passed, 9 warnings` | |
| | T-070 | P1 | Run a targeted visual polish sprint for Rapid Router and shared rails/cards | Engineering | IN_REVIEW | Complete hosted desktop/tablet/mobile screenshot QA for the newly shipped polish pass and capture any residual spacing/copy nits before publish | `npm --prefix frontend run build` -> success; `python3 -m pytest -q backend/app/rapid_router/test_rapid_router_core.py backend/app/test_rapid_router_api_shell.py` -> `45 passed, 9 warnings` | |
| | T-066 | P1 | Profile HF Space startup/wake latency with real runtime timings and recommend env tuning | Engineering | IN_REVIEW | Capture startup stage timings from runtime logs/health (`bootstrap`, `csv_sanity`, `preload`, `integrity`) and decide preload policy (`light` vs `none`) for production | HF boot logs include per-stage timing and restart median improves without regressions | |
| | T-063 | P2 | Clean up third-party deprecation warning noise in Rapid Router test runs (`reportlab` + SWIG/PyMuPDF) | Engineering | DONE | Added narrowly scoped warning filters/containment around vetted external noise while preserving real exception visibility | `python3 -m pytest -q backend/app/test_unified_kb_core.py backend/app/test_knowledgebase_api.py backend/app/rapid_router/test_rapid_router_core.py backend/app/test_rapid_router_api_shell.py` -> `151 passed` with warning noise contained | |
| | T-065 | P2 | Contain known benign MuPDF startup font-warning noise from seed-doc setup-note extraction | Engineering | DONE | Wrapped setup-note extraction in targeted stderr containment to suppress known benign font spam only | Startup probe `python3 - <<'PY' ... RapidRouterCore(...) ... PY` now prints clean `startup_ok 12` without repeated MuPDF font warning | |
| | T-060 | P1 | Add Rapid Router <-> Knowledgebase catalog sync contract checks | Engineering | DONE | Added contract test asserting seeded Rapid Router catalog remains queryable via KB fast paths and provider wiring | `python3 -m pytest -q backend/app/test_unified_kb_core.py backend/app/test_knowledgebase_api.py backend/app/rapid_router/test_rapid_router_core.py backend/app/test_rapid_router_api_shell.py` -> `151 passed` | |
| | T-061 | P1 | Add per-stage latency instrumentation and SLO guardrails for KB helper paths | Engineering | DONE | Added per-stage timing fields + stage SLO evaluation in eval script and shard aggregate summary output | `cd backend && python3 -m pytest -q app/test_unified_kb_eval150_script.py` -> `6 passed` | |
| | T-062 | P1 | Strengthen store schema-version migration tests and strict validation for Rapid Router store JSON | Engineering | DONE | Hardened migration/load paths for malformed versions/products/prices and added regression coverage | `python3 -m pytest -q backend/app/rapid_router/test_rapid_router_core.py` (included in Phase 4 gate `151 passed`) | |
| | T-058 | P1 | Integrate Rapid Router store data into Unified Knowledgebase router-docs answers | Engineering | DONE | Completed provider injection + deterministic Rapid Router fast paths + fallback coverage tests | `cd backend && python3 -m pytest -q app/test_unified_kb_core.py app/test_knowledgebase_api.py app/rapid_router/test_rapid_router_core.py` -> `92 passed`; manual API check of `/api/knowledgebase/message` with `mode=router_docs` returned `deterministic_rapid_router_catalog_list_fast` with `rapid_router_store.json` sources | |
| | T-057 | P1 | Validate first-login/re-login with real Auth0 credentials in auth-required runtime | Engineering | DONE | Credentialed hosted login/logout verification passed after audience-optional + legacy-placeholder-ignore auth fixes | `cd frontend && npx playwright test e2e/auth.full-flow.spec.ts --reporter=line` -> `1 passed`; `cd frontend && npx playwright test e2e/auth.spec.ts --reporter=line` -> `6 passed` | |
| | T-056 | P1 | Run a focused UX cleanup pass for Rapid Router/toolbox (progressive disclosure + clearer hierarchy) | Engineering | DONE | Completed full 10-item UX pass in one batch (summary rail, completion chips, jump links, table view, review modal, mobile sticky CTA, and helper readability controls) | `cd frontend && npm run build`; `python3 -m pytest -q backend/app/rapid_router/test_rapid_router_core.py`; `python3 -m pytest -q backend/app/test_rapid_router_api_shell.py` | |
| | T-055 | P0 | Implement MSRP + Masters contact dropdown + configuration-options pricing in Rapid Router | Engineering | DONE | Completed and pushed in commit `176ff8f` | `cd backend && python3 -m pytest app/rapid_router/test_rapid_router_core.py app/test_rapid_router_api_shell.py app/test_tab_final_pass_matrix.py -q` -> `31 passed`; `cd frontend && npm run build` passed | |
| | T-054 | P2 | Track Rapid Router file-size growth during helper rollout | Engineering | IN_REVIEW | Monitor `RapidRouter.tsx` line growth and decide if helper should be split into dedicated component/module | `frontend/src/pages/RapidRouter.tsx` line count trend captured across releases | |
| | T-053 | P1 | Add Rapid Router in-page helper chatbot for router selection + rep Q&A | Engineering | IN_REVIEW | Run targeted rep-prompt QA, add feature flag, and add focused frontend tests for helper interactions | Helper responses remain within timeout budget and do not regress Rapid Router submit flow | |
| | T-043 | P1 | Recover local access to `backend/app/test_unified_kb_core.py` in Dropbox workspace and commit pending local delta | Engineering | DONE | Confirmed file readability and successful targeted pytest execution from Dropbox workspace | `wc -l backend/app/test_unified_kb_core.py` read succeeds; `cd backend && python3 -m pytest -q app/test_unified_kb_core.py` -> `93 passed` | |
| | T-042 | P1 | Reduce long-tail latency on top 10 slow cases while preserving `150/150` pass rate | Engineering | TODO | Profile and trim delegate/web-fallback on `66,111,86,91,88,85,92,82,93,99` with per-phase budgets and cache hits | Re-run `CHUNK_SIZE=10 START_ID=1 END_ID=150` and target `p95 < 7000ms`, `pass=150` | |
| | T-037 | P1 | Post-commit stabilization for residual 75-case failure (`ID 75`) | Engineering | TODO | Reproduce case `75` with profiler traces and patch mixed Verizon gateway + POTS end-to-end response path | `docs/evals/shards5_eval75/unified_kb_eval150_shards10_summary.json` shows no failed IDs | |
| | T-064 | P2 | Stabilize Rapid Router 25-case suite residual semantic miss (`ID 3`) | Engineering | TODO | Inspect `shards5_rapidrouter25` case `3` response wording and tighten quote-clarification template for W1850 ambiguity without relaxing guardrails | Re-run `CHUNK_SIZE=5 START_ID=1 END_ID=25 CASES_PATH=../docs/evals/unified_kb_eval25_rapid_router_cases.json OUT_DIR=../docs/evals/shards5_rapidrouter25` and target `25/25` | |
| | T-029 | P1 | Eliminate 75-case p95 regression versus legacy baseline (`318.1ms`) while preserving pass rate | Engineering | TODO | Profile slow 75 shards (`58-64` cluster) and reduce tail in POTS compare/assumption paths | `p95 <= 318.1ms` target vs `docs/evals/shards5_eval75/unified_kb_eval75_shards5_summary.json` | |
| | T-030 | P1 | Finalize commit policy for root `docs/faq/FAQ_ongoing_candidates.csv` churn | Engineering | DONE | Adopted pytest-time isolation policy via backend `conftest.py` so local regressions default to temp FAQ candidate path unless explicitly overridden | Repeated test runs preserve root FAQ hash (`sha256` unchanged before/after targeted pytest) | |
| | T-031 | P2 | Add focused tests for `_parallel_index_search` budget behavior under slow index calls | Engineering | DONE | Added deterministic slow-stub tests for bounded in-flight submission and shared executor reuse | `cd backend && python3 -m pytest -q app/test_unified_kb_core.py` -> `93 passed` | |
| | T-034 | P2 | Add dedicated latency guard tests for long-form POTS rewrite path in conversation regression suite | Engineering | DONE | Added long-form single-turn and cumulative-turn latency guard tests in POTS regression suite | `cd backend && python3 -m pytest -q app/test_pots_conversation_regression.py` -> `3 passed` | |
| | T-036 | P1 | Clear remaining 75-case failure (`ID 75`) without degrading other MSRP/Verizon intents | Engineering | TODO | Reproduce case `75`, adjust mixed Verizon gateway + POTS synthesis response to satisfy semantic scorer while preserving guardrails | `docs/evals/shards5_eval75/unified_kb_eval150_shards10_summary.json` shows no failed IDs | |
| | T-038 | P2 | Prevent test-run churn in root `docs/faq/FAQ_ongoing_candidates.csv` during local regressions | Engineering | DONE | Added session-level fixture that routes FAQ candidate writes to temp path by default in test runs | Root FAQ candidate file hash remains stable after repeat pytest runs under default test env | |
|
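The hash-stability verification recorded for T-030 and T-038 can be sketched as below. This is a minimal illustration, not the repo's actual check: the helper names `sha256_of` and `is_unchanged` are hypothetical, and the pytest invocation mentioned in the note afterwards is an assumed example of a "targeted pytest" run.

```python
import hashlib
from pathlib import Path
from typing import Callable


def sha256_of(path: Path) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_unchanged(path: Path, run: Callable[[], object]) -> bool:
    """Hash the file, call run(), hash again, and report whether the digests match."""
    before = sha256_of(path)
    run()  # hypothetical hook: whatever command or test run is being checked
    return before == sha256_of(path)
```

One possible use is to pass `Path("docs/faq/FAQ_ongoing_candidates.csv")` together with a callable that shells out to the targeted pytest command via `subprocess.run`; a `True` result corresponds to the "hash unchanged before/after" evidence in the table.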
|
| ## Backlog |
|
|
| | ID | Priority | Task | Owner | Notes | |
| |---|---|---|---|---| |
| | B-007 | P2 | Deferred by instruction: add Paste order lines parser (`5 CR602, 2 RX60`) to auto-fill quantities/models | Engineering | Explicitly excluded from current execution cycle; revisit only on direct user re-approval | |
| | B-005 | P2 | Add optional cleanup hook for shared search executor in long-lived workers | Engineering | Defensive hardening for unusual shutdown environments | |
| | B-006 | P2 | Add script-level tests for shard runner `TREND_FILE`/FAQ out-dir isolation defaults | Engineering | Guard against regressions in eval tooling | |
|
|
| ## Done (Recent) |
|
|
| | ID | Task | Completed On | Evidence | |
| |---|---|---|---| |
| | D-240 | Added four new required Rapid Router approval attestations (180-day commitment, quote approval before IMEI release, active MDN before shipment, and truth/correctness) with matching frontend + backend validation and persistence | 2026-03-07 | `frontend/src/pages/RapidRouter.tsx`, `frontend/src/pages/RapidRouter.test.tsx`, `backend/app/rapid_router/core.py`, `backend/app/rapid_router/test_rapid_router_core.py`; `cd backend && .venv/bin/python -m pytest -q app/rapid_router/test_rapid_router_core.py app/test_rapid_router_api_shell.py` -> `53 passed`; `cd frontend && npx vitest run src/pages/RapidRouter.test.tsx --reporter=dot` -> `5 passed`; `cd frontend && npm run build` -> success | |
| | D-226 | Rapid Router review validation links now open the target accordion chain and focus the exact invalid field, fixing closed-accordion jumps in customer/order sections | 2026-03-07 | `frontend/src/pages/RapidRouter.tsx`, `frontend/src/pages/RapidRouter.test.tsx`; `cd frontend && npx vitest run src/pages/RapidRouter.test.tsx --reporter=dot` -> `8 passed`; `cd frontend && npm run build` -> success | |
| | D-251 | Pruned non-canonical timestamped eval history and local workspace clutter while preserving the canonical eval baselines, runner assets, and cleanup policy docs | 2026-03-07 | `backend/scripts/cleanup_repo_artifacts.py`, `docs/evals/README.md`; `python3 backend/scripts/cleanup_repo_artifacts.py --no-backup` -> `removed_dirs=75`, `removed_files=62`; `python3 backend/scripts/cleanup_repo_artifacts.py --dry-run --no-backup` -> `0` pending removals; canonical probes for `latest_eval25_guarded_gpt_check`, `latest_eval50_guarded_gpt_check`, `latest_eval6_concept_check`, `release_gate`, `shards10`, and `shards5_eval75` all returned OK | |
| | D-233 | Split the remaining dirty worktree into auditable cleanup batches, refreshed the stale Rapid Router final-pass matrix fixture, reran the full backend suite clean, normalized visible-copy casing on remaining active frontend surfaces, and archived timestamped eval reruns outside the repo | 2026-03-07 | `backend/app/knowledgebase/core.py`, `backend/app/router_rag/core.py`, `backend/app/test_router_rag_module.py`, `backend/app/test_tab_final_pass_matrix.py`, `backend/app/test_unified_kb_core.py`, `frontend/src/components/PromptCoach.tsx`, `frontend/src/pages/MastersAI.tsx`, `frontend/src/pages/PotsIntake.tsx`, `frontend/src/pages/PotsSavingsEstimator.tsx`, `frontend/src/pages/RouterKnowledgebase.tsx`, `frontend/src/pages/RoutersAssistant.tsx`, `frontend/src/pages/TelcoCalculator.tsx`, `frontend/src/pages/TelcoCalculator.test.tsx`; `cd backend && .venv/bin/python -m pytest -q app/test_tab_final_pass_matrix.py -k rapid_router_final_pass_30_case_matrix` -> `1 passed`; `cd backend && bash scripts/test_backend.sh --full` -> `523 passed`; `cd frontend && npx vitest run src/pages/TelcoCalculator.test.tsx --reporter=dot` -> `2 passed`; `cd frontend && npm run build` -> success; timestamped eval reruns archived at `/Users/petedunn/Desktop/codex_eval_archives/cleanup_eval_artifacts_20260307.tar.gz` | |
| | D-241 | Fixed Rapid Router order-options completion so advanced notes are optional when at least one advanced checkbox is selected, matching backend validation and removing the false review blocker | 2026-03-07 | `frontend/src/pages/RapidRouter.tsx`, `frontend/src/pages/RapidRouter.test.tsx`; `cd frontend && npx vitest run src/pages/RapidRouter.test.tsx --reporter=dot` -> `7 passed`; `cd frontend && npm run build` -> success; `cd backend && .venv/bin/python -m pytest -q app/rapid_router/test_rapid_router_core.py app/test_rapid_router_api_shell.py` -> `54 passed` | |
| | D-242 | Normalized visible capitalization across active frontend surfaces so form labels/actions use sentence case by default and title case is reserved for structural headings/proper nouns | 2026-03-07 | `frontend/src/components/PromptCoach.tsx`, `frontend/src/pages/RapidRouter.tsx`, `frontend/src/pages/TelcoCalculator.tsx`, `frontend/src/pages/PotsSavingsEstimator.tsx`, `frontend/src/pages/PotsIntake.tsx`, `frontend/src/pages/UnifiedKnowledgebase.tsx`, `frontend/src/pages/RouterKnowledgebase.tsx`, `frontend/src/pages/MastersAI.tsx`, `frontend/src/pages/PotsAssistant.tsx`, `frontend/src/pages/RoutersAssistant.tsx`; `cd frontend && npx vitest run src/pages/RapidRouter.test.tsx src/pages/TelcoCalculator.test.tsx src/components/FloatingRouterHelper.test.tsx --reporter=dot` -> `13 passed`; `cd frontend && npm run build` -> success | |
| | D-225 | Rapid Router review validation links now open the target accordion and focus the exact invalid field, fixing the closed-accordion navigation bug in customer/order sections | 2026-03-07 | `frontend/src/pages/RapidRouter.tsx`, `frontend/src/pages/RapidRouter.test.tsx`; `cd frontend && npx vitest run src/pages/RapidRouter.test.tsx --reporter=dot` -> `7 passed`; `cd frontend && npm run build` -> success | |
| | D-243 | Rapid Router advanced configuration notes now become optional when any advanced task checkbox is selected; notes remain required only for advanced requests with no selected checkbox option | 2026-03-07 | `frontend/src/pages/RapidRouter.tsx`, `frontend/src/pages/RapidRouter.test.tsx`, `backend/app/rapid_router/core.py`, `backend/app/rapid_router/test_rapid_router_core.py`; `cd backend && .venv/bin/python -m pytest -q app/rapid_router/test_rapid_router_core.py app/test_rapid_router_api_shell.py` -> `53 passed`; `cd frontend && npx vitest run src/pages/RapidRouter.test.tsx --reporter=dot` -> `6 passed`; `cd frontend && npm run build` -> success | |
| | D-250 | Updated Rapid Router so the flow starts on `Browse`, defaults payment to `BoBo`, requires a 7-digit BoBo Bill-to phone, and requires customer-information authorization/communication consent plus an authorization-provider name before submit; synced backend order normalization/PDF/email output to persist those fields | 2026-03-07 | `frontend/src/pages/RapidRouter.tsx`, `frontend/src/pages/RapidRouter.test.tsx`, `backend/app/rapid_router/core.py`, `backend/app/rapid_router/test_rapid_router_core.py`; `cd backend && .venv/bin/python -m pytest -x -vv app/rapid_router/test_rapid_router_core.py` -> `28 passed`; `cd backend && .venv/bin/python -m pytest -q app/test_rapid_router_api_shell.py` -> `24 passed`; frontend `tsc` passed but frontend Vitest/build stalled after startup under the current unified-exec saturation | |
| | D-223 | Added shared preferred-public-source guidance to every active server-side web-assisted assistant path so web fallback now explicitly prefers `opendevelopment.verizonwireless.com`, `masterstelecom.com`, and `5gstore.com` when relevant | 2026-03-07 | `backend/app/assistant_fallback.py`, `backend/app/knowledgebase/core.py`, `backend/app/router_rag/core.py`, `backend/app/masters_ai/core.py`, `backend/app/pots_ai/core.py`; `python3 -m py_compile backend/app/assistant_fallback.py backend/app/router_rag/core.py backend/app/masters_ai/core.py backend/app/pots_ai/core.py backend/app/knowledgebase/core.py backend/app/test_router_rag_module.py backend/app/test_unified_kb_core.py backend/app/test_masters_conversation_regression.py backend/app/test_pots_conversation_regression.py` -> success; direct smoke -> `SMOKE_OK` | |
| | D-238 | Optimized the three dominant broad-suite latency buckets enough to restore `75/75` and `150/150` accuracy with zero stage-budget exits, while leaving a smaller deterministic tail-latency cleanup open for a follow-up pass | 2026-03-07 | `backend/app/assistant_fallback.py`, `backend/app/knowledgebase/core.py`, `backend/app/test_unified_kb_core.py`; `cd backend && .venv/bin/python -m pytest -q app/test_unified_kb_core.py -k 'masters_contact_center_doc_lookup_prefers_filename_match_without_search or masters_pots_materials_overview_prefers_doc_fast_without_search or pots_use_case_compare_prefers_cached_provider_fast or pots_provider_emphasis_summary_routes_fast or pots_generic_objection_prompt_skips_deep_search or pots_discovery_first_routes_to_concept_fast or router_inventory_audit_skips_concept_preflight or router_gateway_device_type_skips_concept_preflight'` -> `8 passed`; `bash backend/scripts/test_backend.sh --full` -> `510 passed`; `docs/evals/20260307_020040_eval75_guarded_gpt_rerun/unified_kb_eval150_shards10_summary.json` -> `75 / 75 passed`, `stage_budget_exit_rate_pct=0.0`; `docs/evals/20260307_020040_eval150_guarded_gpt_rerun/unified_kb_eval150_shards10_summary.json` -> `150 / 150 passed`, `stage_budget_exit_rate_pct=0.0` | |
| | D-239 | Added a TTL-backed keyed title cache for Masters mention lookups, proved the cache works under refresh suppression, and confirmed that the remaining `31/32/35/37` latency tail still lives in the delegate path rather than in title rescans | 2026-03-07 | `backend/app/knowledgebase/core.py`, `backend/app/test_unified_kb_core.py`; `cd backend && .venv/bin/python -m pytest -q app/test_unified_kb_core.py -k 'masters_securefax_doc_lookup_prefers_file_discovery or masters_securefax_doc_lookup_uses_cached_title_rows_within_refresh_ttl or masters_contact_center_doc_lookup_prefers_filename_match_without_search or masters_pots_materials_overview_prefers_doc_fast_without_search'` -> `4 passed`; `bash backend/scripts/test_backend.sh --full` -> `511 passed`; `docs/evals/20260307_023133_eval150_masters_lookup_slice/unified_kb_eval150_31_37.json` -> `7 / 7 passed`, `avg_latency_ms=2499.04`, `p95_ms=4383.97` | |
| | D-237 | Fixed the `150` case-133 overblock by narrowing the code-adjudication regex to require code/inspection/AHJ context around `approved`/`approval`, and profiled the remaining broad-suite latency clusters before the next `75`/`150` rerun | 2026-03-07 | `backend/app/knowledgebase/core.py`, `backend/app/test_unified_kb_core.py`; `cd backend && .venv/bin/python -m pytest -q app/test_unified_kb_core.py -k 'allows_approved_masters_references_outline_request or blocks_code_adjudication_globally or blocks_exact_current_lead_times_globally or blocks_exact_current_band_support_globally or blocks_exact_current_certification_status_globally or blocks_exact_current_lifecycle_date_globally or blocks_exact_current_availability_globally'` -> `7 passed`; direct case-133 spot-check returned `domain='masters'`, `retrieval_mode='masters_outline_fast'`, `timing_ms.total=4.37`; `bash backend/scripts/test_backend.sh --full` -> `502 passed`; Router RAG smoke -> `10 queries / 0 failures` | |
| | D-236 | Reran guarded-GPT `25`, `50`, `75`, and `150` against the current baselines, confirming `25` and `50` are stable while exposing new broad-suite latency tails and one `150` overblock (`ID 133`) that now define the next cleanup target | 2026-03-07 | `docs/evals/20260307_010031_eval25_guarded_gpt_rerun/unified_kb_eval150_shards10_summary.json` -> `25 / 25 passed`, `p95=499.50ms`; `docs/evals/20260307_010031_eval50_guarded_gpt_rerun/unified_kb_eval150_shards10_summary.json` -> `50 / 50 passed`, `p95=381.53ms`; `docs/evals/20260307_010031_eval75_guarded_gpt_rerun/unified_kb_eval150_shards10_summary.json` -> `75 / 75 passed`, `p95=3645.73ms`, `ab_gate.p95_non_regression=False`; `docs/evals/20260307_010031_eval150_guarded_gpt_rerun/unified_kb_eval150_shards10_summary.json` -> `149 / 150 passed`, `failed_ids=[133]`, `stage_budget_exit_rate_pct=1.33` | |
| | D-235 | Expanded the guarded-GPT concept pack to `50` reusable questions in `5`-question shards, added global early refusals for exact/current risky asks, and validated the broader pack at `50 / 50 passed` without latency regression | 2026-03-07 | `backend/scripts/run_unified_kb_eval50_guarded_gpt_chunks.sh`, `docs/evals/unified_kb_eval50_guarded_gpt_cases.json`, `backend/app/knowledgebase/core.py`, `backend/app/assistant_fallback.py`, `backend/app/test_assistant_fallback.py`, `backend/app/test_unified_kb_core.py`, `backend/app/test_masters_conversation_regression.py`; `cd backend && .venv/bin/python -m pytest -q app/test_assistant_fallback.py app/test_unified_kb_core.py -k 'contact_center or exact_current or code_adjudication or high_risk_code_compliance or lead_times_globally or band_support_globally or certification_status_globally or lifecycle_date_globally or availability_globally'` -> `11 passed`; `set -a && source .env.codex && set +a && cd backend && OUT_DIR=../docs/evals/latest_eval50_guarded_gpt_check ./scripts/run_unified_kb_eval50_guarded_gpt_chunks.sh` -> `50 / 50 passed`; `bash backend/scripts/test_backend.sh --full` -> `501 passed` | |
| | D-234 | Implemented Phase 1 and Phase 2 together by hardening blocked-case tests, fixing false-positive regulatory matching, narrowing strict-citation gating for safe concept explainers, and expanding deterministic concept preflight so the reusable `25`-case guarded-GPT pack now runs `25 / 25 passed` with the POTS concept shard in low-millisecond latency | 2026-03-06 | `backend/app/assistant_fallback.py`, `backend/app/knowledgebase/core.py`, `backend/app/pots_ai/core.py`, `backend/app/test_assistant_fallback.py`, `backend/app/test_pots_conversation_regression.py`, `backend/app/test_unified_kb_core.py`; `cd backend && .venv/bin/python -m pytest -q app/test_assistant_fallback.py app/test_pots_conversation_regression.py app/test_unified_kb_core.py -k 'ul_substring or real_ul_compliance or replacement_plain_english or multisite_stays_internal_fast or dual_pathway or copper_sunset'` -> `7 passed`; `cd backend && .venv/bin/python -m pytest -q app/test_assistant_fallback.py app/test_router_rag_module.py app/test_masters_conversation_regression.py app/test_pots_conversation_regression.py app/test_unified_kb_core.py` -> `205 passed`; `set -a && source .env.codex && set +a && backend/scripts/run_unified_kb_eval25_guarded_gpt_chunks.sh` -> `25 / 25 passed`; `bash backend/scripts/test_backend.sh --full` -> `493 passed` | |
| | D-244 | Standardized all active backend/runtime LLM defaults, current env examples, and local repo env pins on `gpt-5-mini`, fixed the POTS GPT-5 temperature incompatibility, and revalidated the guarded-GPT acceptance pack at `25 / 25 passed` | 2026-03-06 | `README.md`, `backend/.env.test.example`, `.env.codex`, `backend/.env.codex`, `backend/app/main.py`, `backend/app/chat_nlu.py`, `backend/app/knowledgebase/core.py`, `backend/app/router_rag/core.py`, `backend/app/masters_ai/core.py`, `backend/app/pots_ai/core.py`, `backend/app/routers/router_core.py`, `backend/app/test_pots_conversation_regression.py`, `backend/scripts/unified_kb_eval150.py`, `backend/scripts/router_rag_eval50.py`, `backend/scripts/router_rag_smoke.py`, `backend/scripts/run_unified_kb_eval150_chunks.sh`; `cd backend && .venv/bin/python -m pytest -q app/test_pots_conversation_regression.py -k 'concept_fallback_for_generic_pots_question or llm_synthesis_omits_temperature_for_gpt5_models'` -> `2 passed`; `bash backend/scripts/test_backend.sh --full` -> `478 passed`; `cd frontend && npx tsc -p tsconfig.json --noEmit` -> success; `cd frontend && npm run build` -> success; `cd frontend && npm run test` -> `31 files / 111 passed`; `docs/evals/20260306_230403_eval25_gpt5mini_default/unified_kb_eval150_shards10_summary.json` -> `25 / 25 passed` | |
| | D-245 | Added a canonical reusable 25-question guarded-GPT eval pack (`5` shards of `5`) plus a dedicated shard runner, then stabilized the suite to `24 / 25 passed` (`96.0%`) with only residual case `13` left open | 2026-03-06 | `docs/evals/unified_kb_eval25_guarded_gpt_cases.json`, `backend/scripts/run_unified_kb_eval25_guarded_gpt_chunks.sh`, `docs/evals/README.md`, `docs/evals/latest_eval25_guarded_gpt_check/unified_kb_eval150_shards10_summary.json`; targeted reruns for IDs `6,7,8,11,15`; final aggregate `24 / 25 passed`, `failed_ids=[13]` | |
| | D-246 | Added a shared assistant-family concept-fallback module with allow/deny gates, `gpt-5-mini` model-only fallback, explicit provenance labels, fallback-only `+4s` budget extension, and GPT+web refinement only when the concept answer still needed current/public information | 2026-03-06 | `backend/app/assistant_fallback.py`, `backend/app/knowledgebase/core.py`, `backend/app/router_rag/core.py`, `backend/app/masters_ai/core.py`, `backend/app/pots_ai/core.py`, `backend/app/main.py`, `frontend/src/utils/chatProvenance.ts`, `frontend/src/components/chat/ConversationHeader.tsx`, `frontend/src/pages/UnifiedKnowledgebase.tsx`, `frontend/src/pages/RouterKnowledgebase.tsx`, `frontend/src/pages/MastersAI.tsx`, `frontend/src/pages/PotsAssistant.tsx`, `frontend/src/pages/RoutersAssistant.tsx`; `cd backend && .venv/bin/python -m pytest -q app/test_assistant_fallback.py app/test_unified_kb_core.py app/test_router_rag_module.py app/test_masters_conversation_regression.py app/test_pots_conversation_regression.py app/test_chat_guidance_api.py app/test_knowledgebase_api.py` -> `202 passed`; `bash backend/scripts/test_backend.sh --full` -> `477 passed`; `cd frontend && npm run test` -> `31 files / 111 passed`; `docs/evals/latest_eval6_concept_check/unified_kb_eval150_shards10_summary.json` -> `6 / 6 passed` | |
| | D-247 | Completed the requested full validation sweep: backend full suite green, frontend full suite green, live Playwright reduced to one hosted POTS provider-coverage miss, and OpenAI shard suites landed at `146/150 (97.3%)`, `73/75 (97.3%)`, and `50/50 (100%)`; also patched local provider-card building to backfill missing providers such as `MetTel` from indexed evidence mapped back to known files | 2026-03-06 | `backend/app/knowledgebase/core.py`, `backend/app/test_unified_kb_core.py`; `bash backend/scripts/test_backend.sh --full` -> `459 passed` + Router RAG smoke `10/10`; `cd frontend && npx tsc -p tsconfig.json --noEmit` -> success; `cd frontend && npm run build` -> success; `cd frontend && npm run test` -> `30 files / 106 passed`; `cd frontend && npx playwright test --config=playwright.config.ts` -> `9 passed / 1 failed / 4 skipped`; `docs/evals/20260306_190557_eval150_rerun/unified_kb_eval150_shards10_summary.json`; `docs/evals/20260306_192259_eval75_rerun/unified_kb_eval150_shards10_summary.json`; `docs/evals/20260306_193023_eval50_rerun/unified_kb_eval150_shards10_summary.json`; focused regressions `2 passed` + `2 passed` | |
| | D-229 | Enforced the current UI-lock scan rules by removing collapsed-state banners, hiding the default header status button, demoting coach/browse actions that competed with the page primary CTA, and consolidating Rapid Router stage progression under the sticky cart | 2026-03-06 | `frontend/src/components/AssistantWorkspace.tsx`, `frontend/src/components/ConversationalSidePanel.tsx`, `frontend/src/components/PromptCoach.tsx`, `frontend/src/components/BrandHeader.tsx`, `frontend/src/pages/RapidRouter.tsx`; `cd frontend && npx tsc -p tsconfig.json --noEmit` -> success; `cd frontend && npx vitest run src/components/AssistantWorkspace.test.tsx src/components/PromptCoach.test.tsx src/components/BrandHeader.test.tsx src/pages/RapidRouter.test.tsx --reporter=dot` -> `11 passed`; `cd frontend && npm run build` -> success; `cd frontend && npm run test` -> `30 files / 105 passed`; `git diff --check` -> success | |
| | D-228 | Standardized `UnifiedKnowledgebase`, `RouterKnowledgebase`, `RoutersAssistant`, `MastersAI`, and `PotsAssistant` on one assistant shell with shared auto-collapsing setup, then added focused setup-panel regression coverage | 2026-03-06 | `frontend/src/components/AssistantWorkspace.tsx`, `frontend/src/components/AssistantWorkspace.test.tsx`, `frontend/src/pages/UnifiedKnowledgebase.tsx`, `frontend/src/pages/RouterKnowledgebase.tsx`, `frontend/src/pages/RoutersAssistant.tsx`, `frontend/src/pages/MastersAI.tsx`, `frontend/src/pages/PotsAssistant.tsx`; `cd frontend && npx tsc -p tsconfig.json --noEmit` -> success; `cd frontend && npx vitest run src/components/AssistantWorkspace.test.tsx src/components/PageArchetypes.test.tsx --reporter=dot` -> `4 passed`; `cd frontend && npm run build` -> success; `cd frontend && npm run test` -> `30 files / 105 passed` | |
| | D-227 | Rebuilt `RapidRouter` into a staged commerce flow with one active step at a time, a sticky cart rail, and collapsed `Commerce tools`, then added focused regression coverage for the new behavior | 2026-03-06 | `frontend/src/pages/RapidRouter.tsx`, `frontend/src/pages/RapidRouter.test.tsx`; `cd frontend && npx tsc -p tsconfig.json --noEmit` -> success; `cd frontend && npx vitest run src/pages/RapidRouter.test.tsx --reporter=dot` -> `2 passed`; `cd frontend && npm run build` -> success; `cd frontend && npm run test` -> `29 files / 103 passed` | |
| | D-226 | Collapsed Telco assumptions, what-if mode, diagnostics, quote helpers, scenario JSON/CSV, and assistant coaching into one shared `Advanced` drawer so the default calculator surface stays on the business flow | 2026-03-06 | `frontend/src/pages/TelcoCalculator.tsx`, `frontend/src/pages/TelcoCalculator.test.tsx`; `cd frontend && npx tsc -p tsconfig.json --noEmit` -> success; `cd frontend && npx vitest run src/pages/TelcoCalculator.test.tsx --reporter=dot` -> `2 passed`; `cd frontend && npm run build` -> success; `cd frontend && npm run test` -> `28 files / 101 passed` | |
| | D-248 | Rebuilt `TelcoCalculator` as a four-step sequence (`Locations`, `Pricing`, `Results`, `Export`) and added regression coverage for the new step flow | 2026-03-06 | `frontend/src/pages/TelcoCalculator.tsx`, `frontend/src/pages/TelcoCalculator.test.tsx`; `cd frontend && npx tsc -p tsconfig.json --noEmit` -> success; `cd frontend && npx vitest run src/pages/TelcoCalculator.test.tsx --reporter=dot` -> `1 passed`; `cd frontend && npm run build` -> success; `cd frontend && npm run test` -> `28 files / 100 passed` | |
| | D-224 | Replaced paragraph-style POTS instructions with a shared three-line `StepGuide` pattern across the merged estimate/intake flow so each step now states what it does, what is needed now, and what happens next | 2026-03-06 | `frontend/src/components/ui.tsx`, `frontend/src/pages/PotsEstimateIntake.tsx`, `frontend/src/pages/PotsSavingsEstimator.tsx`, `frontend/src/pages/PotsIntake.tsx`, `frontend/src/pages/PotsEstimateIntake.test.tsx`, `frontend/src/pages/PotsSavingsEstimator.test.tsx`, `frontend/src/pages/PotsIntake.test.tsx`; `cd frontend && npx tsc -p tsconfig.json --noEmit` -> success; `cd frontend && npx vitest run src/pages/PotsEstimateIntake.test.tsx src/pages/PotsSavingsEstimator.test.tsx src/pages/PotsIntake.test.tsx src/pages/PotsWorkspace.test.tsx --reporter=dot` -> `23 passed`; `cd frontend && npm run build` -> success; `cd frontend && npm run test` -> `27 files / 99 passed` | |
| | D-249 | Flattened the embedded `PotsEstimateIntake` shell by adding explicit embedded-mode rendering to the merged wrapper, estimator, and intake so the combined flow no longer feels like nested full-page cards | 2026-03-06 | `frontend/src/pages/PotsEstimateIntake.tsx`, `frontend/src/pages/PotsSavingsEstimator.tsx`, `frontend/src/pages/PotsIntake.tsx`; `cd frontend && npx tsc -p tsconfig.json --noEmit` -> success; `cd frontend && npx vitest run src/pages/PotsEstimateIntake.test.tsx src/pages/PotsSavingsEstimator.test.tsx src/pages/PotsIntake.test.tsx src/pages/PotsWorkspace.test.tsx --reporter=dot` -> `23 passed`; `cd frontend && npm run build` -> success; `cd frontend && npm run test` -> `27 files / 99 passed` | |
| | D-222 | Converted `PotsWorkspace` routing into a one-question-at-a-time conversation with answer cards, compact `Why this matters` disclosure, and a final review/edit step while preserving the existing triage payload | 2026-03-06 | `frontend/src/pages/PotsWorkspace.tsx`, `frontend/src/pages/PotsWorkspace.test.tsx`; `cd frontend && npx tsc -p tsconfig.json --noEmit` -> success; `cd frontend && npx vitest run src/pages/PotsWorkspace.test.tsx --reporter=dot` -> `10 passed`; `cd frontend && npm run build` -> success; `cd frontend && npm run test` -> `27 files / 99 passed` | |
| | D-221 | Moved active-project creation and saved-project switching into the `Project tools` drawer so `PotsWorkspace` no longer shows setup UI in the main wizard by default | 2026-03-06 | `frontend/src/pages/PotsWorkspace.tsx`, `frontend/src/pages/PotsWorkspace.test.tsx`; `cd frontend && npx tsc -p tsconfig.json --noEmit` -> success; `cd frontend && npx vitest run src/pages/PotsWorkspace.test.tsx --reporter=dot` -> `9 passed`; `cd frontend && npm run build` -> success | |
| | D-220 | Converted `PotsWorkspace` from a stacked dashboard into a true wizard shell with one active step card plus a focused utilities drawer for routing/intake | 2026-03-06 | `frontend/src/pages/PotsWorkspace.tsx`, `frontend/src/pages/PotsWorkspace.test.tsx`; `cd frontend && npx tsc -p tsconfig.json --noEmit` -> success; `cd frontend && npx vitest run src/pages/PotsWorkspace.test.tsx --reporter=dot` -> `8 passed`; `cd frontend && npm run build` -> success; `cd frontend && npm run test` -> `27 files / 97 passed`; `git diff --check` -> success | |
| | D-219 | Completed the app-wide destructive-action confirmation sweep, added shared confirmation helper + cancel-aware slash resets, and covered the highest-risk cancel paths with focused frontend regression tests | 2026-03-06 | `frontend/src/utils/confirmAction.ts`, `frontend/src/utils/chatCommands.ts`, `frontend/src/pages/PotsEstimateIntake.tsx`, `frontend/src/pages/PotsIntake.tsx`, `frontend/src/pages/PotsWorkspace.tsx`, `frontend/src/pages/TelcoCalculator.tsx`, `frontend/src/pages/RapidRouter.tsx`, `frontend/src/pages/UnifiedKnowledgebase.tsx`, `frontend/src/pages/RouterKnowledgebase.tsx`, `frontend/src/pages/MastersAI.tsx`, `frontend/src/pages/PotsAssistant.tsx`, `frontend/src/pages/RoutersAssistant.tsx`, `frontend/src/components/FloatingRouterHelper.tsx`; `cd frontend && npx tsc -p tsconfig.json --noEmit` -> success; `cd frontend && npx vitest run src/utils/chatCommands.test.ts src/utils/confirmAction.test.ts src/pages/PotsSavingsEstimator.test.tsx src/pages/PotsEstimateIntake.test.tsx src/pages/PotsIntake.test.tsx src/pages/PotsWorkspace.test.tsx --reporter=dot` -> `27 passed`; `cd frontend && npm run build` -> success; `cd frontend && npm run test` -> `24 files / 86 passed` | |
| | D-218 | Finished the remaining POTS intake density pass, added focused intake/workspace regression coverage, and used desktop/mobile browser QA to justify a true single-open workspace accordion plus closed-by-default intake scope disclosures | 2026-03-06 | `frontend/src/pages/PotsIntake.tsx`, `frontend/src/pages/PotsIntake.test.tsx`, `frontend/src/pages/PotsWorkspace.tsx`, `frontend/src/pages/PotsWorkspace.test.tsx`; `cd frontend && npx vitest run src/pages/PotsIntake.test.tsx src/pages/PotsEstimateIntake.test.tsx src/pages/PotsWorkspace.test.tsx --reporter=dot` -> `13 passed`; `cd frontend && npm run build` -> success; `cd frontend && npm run test` -> `23 files / 79 passed`; local Playwright/browser QA at `1440x1024` and `390x844` confirmed the final disclosure defaults | |
| | D-217 | Simplified the active POTS estimate/intake experience by hiding support chrome behind disclosures, gating estimate inputs behind customer basics, and collapsing the full estimate math until results are requested | 2026-03-06 | `frontend/src/pages/PotsSavingsEstimator.tsx`, `frontend/src/pages/PotsEstimateIntake.tsx`, `frontend/src/pages/PotsIntake.tsx`, `frontend/src/pages/PotsSavingsEstimator.test.tsx`; `cd frontend && npx vitest run src/pages/PotsSavingsEstimator.test.tsx src/pages/PotsEstimateIntake.test.tsx --reporter=dot` -> `6 passed`; `cd frontend && npm run build` -> success; `cd frontend && npm run test` -> `22 files / 73 passed` | |
| | D-216 | Clarified POTS estimator start paths with an explicit three-mode chooser and made intake seeding follow the selected mode (`quick estimate`, `totals now`, `site-by-site now`) | 2026-03-06 | `frontend/src/pages/PotsSavingsEstimator.tsx`, `frontend/src/pages/PotsSavingsEstimator.test.tsx`, `frontend/src/pages/PotsEstimateIntake.tsx`, `frontend/src/pages/PotsEstimateIntake.test.tsx`; `cd frontend && npx vitest run src/pages/PotsSavingsEstimator.test.tsx src/pages/PotsEstimateIntake.test.tsx --reporter=dot` -> `5 passed`; `cd frontend && npm run build` -> success; `cd frontend && npm run test` -> `22 files / 72 passed` | |
| | D-215 | Verified hosted auth-required runtime after removing legacy `masters-toolkit-api` audience dependency: smoke suite and full credentialed login/logout flow both passed | 2026-03-06 | `cd frontend && npx playwright test e2e/auth.spec.ts --reporter=line` -> `6 passed`; `cd frontend && npx playwright test e2e/auth.full-flow.spec.ts --reporter=line` -> `1 passed` | |
| | D-213 | Added dedicated Playwright coverage for Rapid Router two-user saved-profile isolation and enabled ignored local `frontend/.env.e2e(.local)` loading for repeatable credentialed hosted runs | 2026-03-05 | `frontend/e2e/rapid-router.memory-isolation.spec.ts`, `frontend/playwright.config.ts`, `frontend/e2e.env.template`, `frontend/package.json`; `npm --prefix frontend run build` -> success; `cd frontend && npx playwright test e2e/rapid-router.memory-isolation.spec.ts --list` -> `1 test listed` | |
| | D-214 | Removed active reliance on legacy Auth0 audience `https://masters-toolkit-api` by treating it as invalid/ignored in frontend and backend auth config, and added user-facing callback guidance for the exact `Service not found` error | 2026-03-06 | `frontend/src/auth/config.ts`, `frontend/src/auth/errorUtils.ts`, `frontend/src/auth/config.test.ts`, `frontend/src/auth/errorUtils.test.ts`, `backend/app/auth.py`, `backend/app/test_auth.py`; `cd frontend && npx vitest run src/auth/config.test.ts src/auth/errorUtils.test.ts src/components/HealthStatusModal.test.tsx` -> `16 passed`; `python3 -m pytest -q backend/app/test_auth.py backend/app/test_startup_rate_limit.py` -> `31 passed`; `npm --prefix frontend run build` -> success | |
| | D-212 | Scoped shared Smart Profile, resume cards, POTS carryover, and Rapid Router repeat-draft memory per authenticated user so customer data is no longer browser-global across logins | 2026-03-05 | `frontend/src/utils/customerMemory.ts`, `frontend/src/utils/customerMemory.test.ts`, `frontend/src/auth/AuthGate.tsx`, `frontend/src/main.tsx`, `frontend/src/pages/RapidRouter.tsx`; `npm --prefix frontend run build` -> success; `cd frontend && npx vitest run src/utils/customerMemory.test.ts --pool=threads --maxWorkers=1` -> `4 passed` | |
| | D-211 | Fixed battery-router shortlist omission so removable option (`CR202-Lite`) is preserved for `best routers with batteries` and added regression coverage | 2026-03-05 | `backend/app/knowledgebase/core.py`, `backend/app/test_unified_kb_core.py`; `PYTHONPATH=backend python3 -m pytest -q backend/app/test_unified_kb_core.py -k "battery_best_list_keeps_removable_option"` -> `1 passed`; runtime harness probe now returns `CR202-Lite` in battery options table | |
| | D-210 | Executed an additional `150` rerun attempt to reach the `>=95%` target and logged the stochastic-variance outcome for follow-up (`T-079`) | 2026-03-05 | `cd backend && CHUNK_SIZE=15 START_ID=1 END_ID=150 SEMANTIC_POLICY=all OUT_DIR=../docs/evals/20260305T021154_phase3_gate150_rerun2_final CASES_PATH=../docs/evals/unified_kb_eval150_cases.json ./scripts/run_unified_kb_eval150_chunks.sh` -> `141/150 (94.0%)`, failed IDs `[48,55,78,89,99,107,110,112,118]`, `stage_budget_exits=0` | |
| | D-209 | Completed gameplan Phase 3 evaluation verification gate command set (`150/75/50`) with quality floor maintained and docs/eval artifacts published | 2026-03-05 | `150`: `142/150 (94.7%)`, failed `[24,36,88,98,99,104,112,129]` (`docs/evals/20260305T013817_phase3_gate150_final/unified_kb_eval150_shards10_summary.json`); `75`: `74/75 (98.7%)`, failed `[3]` (`docs/evals/20260305T015614_phase3_gate75_final/unified_kb_eval150_shards10_summary.json`); `50`: `50/50 (100.0%)`, failed `[]` (`docs/evals/20260305T020530_phase3_gate50_final/unified_kb_eval150_shards10_summary.json`) | |
| | D-208 | Completed gameplan Phase 2 consolidation verification gate and moved `T-076`/`T-077` to hosted sign-off (`IN_REVIEW`) | 2026-03-05 | `npm --prefix frontend run build` -> success; `npm --prefix frontend run test` -> `19 files / 59 tests passed`; `python3 -m pytest -q backend/app/test_knowledgebase_api.py backend/app/routers/router_tab_smoke_test.py backend/app/test_tab_final_pass_matrix.py backend/app/test_pots_response_contract.py backend/app/test_pots_conversation_regression.py` -> `68 passed` | |
| | D-207 | Completed gameplan Phase 5 repo/tooling hygiene hardening (`T-030`, `T-038`, `T-031`, `T-034`, `T-043`) with verification reruns | 2026-03-05 | Added `backend/app/conftest.py` fixture isolation + warning filters; added `_parallel_index_search` budget tests and POTS long-form latency guard tests; verified `cd backend && python3 -m pytest -q app/test_unified_kb_core.py app/test_pots_conversation_regression.py app/test_unified_kb_eval150_script.py` -> `102 passed`; root FAQ hash unchanged across repeat run | |
| | D-206 | Completed gameplan Phase 4 data/contract/migration hardening (`T-081`, `T-060`, `T-061`, `T-062`, `T-063`, `T-065`) | 2026-03-05 | Updated Crown deterministic facts in `feb2026routers.csv`; added Rapid Router/KB contract tests and store migration hardening/tests; added stage-timing SLO outputs in eval scripts; contained MuPDF/reportlab noise; phase gate `python3 -m pytest -q backend/app/test_unified_kb_core.py backend/app/test_knowledgebase_api.py backend/app/rapid_router/test_rapid_router_core.py backend/app/test_rapid_router_api_shell.py` -> `151 passed` | |
| | D-205 | Completed Phase 1 verification gate from the saved gameplan (frontend build/test + Rapid Router backend regression command set) | 2026-03-05 | `npm --prefix frontend run build` -> success; `npm --prefix frontend run test` -> `19 files / 59 tests passed`; `python3 -m pytest -q backend/app/rapid_router/test_rapid_router_core.py backend/app/test_rapid_router_api_shell.py` -> `49 passed, 9 warnings` | |
| | D-204 | Completed Phase 0 auth verification gate from the saved gameplan with hosted URL substitution and explicit credential-blocker capture | 2026-03-05 | `cd frontend && npx vitest run src/auth/config.test.ts src/auth/errorUtils.test.ts` -> `13 passed`; `python3 -m pytest -q backend/app/test_auth.py` -> `21 passed`; `cd frontend && E2E_DISABLE_WEBSERVER=true E2E_BASE_URL=https://crazycrazypete-masters-four-tab-openai.hf.space npx playwright test e2e/auth.full-flow.spec.ts` -> `1 skipped` | |
| | D-203 | Saved next-thread phased execution gameplan for remaining fixes/enhancements and recorded explicit parser deferral scope | 2026-03-04 | `docs/dev/next_thread_remaining_fixes_enhancements_gameplan.md`; `T-086` added; `B-007` parser deferred | |
| | D-202 | Re-ran post-edit verification gate for Smart Profile/customer-memory + resume/carryover + KB action-chip rollout before handoff | 2026-03-04 | `git status --short`; `npm --prefix frontend run build` -> success; `cd frontend && npx vitest run src/utils/customerMemory.test.ts --pool=threads --maxWorkers=1` -> `3 passed` | |
| | D-201 | Implemented shared Smart Profile/Customer Memory utility + resume/repeat work cards, hardened estimate->intake carryover replay, and Knowledgebase action chips to Router Helper / Rapid Router order draft | 2026-03-04 | `frontend/src/utils/customerMemory.ts`, `frontend/src/utils/customerMemory.test.ts`, `frontend/src/pages/PotsSavingsEstimator.tsx`, `frontend/src/pages/PotsEstimateIntake.tsx`, `frontend/src/pages/UnifiedKnowledgebase.tsx`, `frontend/src/pages/RapidRouter.tsx`, `frontend/src/App.tsx`; `npm --prefix frontend run build` -> success; `cd frontend && npx vitest run src/utils/customerMemory.test.ts --pool=threads --maxWorkers=1` -> `3 passed` | |
| | D-200 | Published consolidated checkpoint commit to both required remotes (`origin`, `hf-fourtab`) | 2026-03-04 | commit `fcd2934`; `git push origin main` -> `e1ec24c..fcd2934`; `git push hf-fourtab main` -> `e1ec24c..fcd2934` | |
| | D-199 | Added persistent header one-click Slack support chip in shared BrandHeader (global across tabs) | 2026-03-04 | `frontend/src/components/BrandHeader.tsx`, `frontend/src/components/BrandHeader.test.tsx`; `npm --prefix frontend run build` -> success; `cd frontend && npx vitest run src/components/BrandHeader.test.tsx --pool=threads --maxWorkers=1` -> `4 passed` | |
| | D-198 | Added global floating support launcher with Slack-first CTA plus email/phone one-click fallback and command-palette open action | 2026-03-04 | `frontend/src/components/FloatingSupportLauncher.tsx`, `frontend/src/App.tsx`; `npm --prefix frontend run build` -> success; `cd frontend && npx vitest run src/components/BrandHeader.test.tsx src/components/PromptCoach.test.tsx --pool=threads --maxWorkers=1` -> `5 passed` | |
| | D-197 | Implemented Rapid Router split-shipping locations for single-model orders with frontend + backend validation and order PDF/email location breakdown | 2026-03-04 | `frontend/src/pages/RapidRouter.tsx`; `backend/app/rapid_router/core.py`; `backend/app/rapid_router/test_rapid_router_core.py`; `python3 -m pytest -q backend/app/rapid_router/test_rapid_router_core.py` -> `25 passed, 6 warnings`; `python3 -m pytest -q backend/app/test_rapid_router_api_shell.py` -> `24 passed, 9 warnings`; `npm --prefix frontend run build` -> success | |
| | D-196 | Imported Dragon quick guide + Spark/Kadet docs into canonical Router RAG corpus and added deterministic Dragon/M106/M519/K500A/K300NB router-fact coverage with alias normalization | 2026-03-04 | Intake run `backend/scripts/router_rag_intake_pipeline.sh /tmp/router_rag_intake_2026-03-04_dragon_spark_kadet` -> `included=6`; canonical files under `_RAG_Ready_KB_Organized/01_documents/routers/connect_csg/` and `/routers/verizon/`; updated `feb2026routers.csv`; alias/routing updates in `backend/app/knowledgebase/core.py`; regression `python3 -m pytest -q backend/app/test_unified_kb_core.py -k dragon_and_katalyst_phrase_aliases` -> `1 passed` | |
| | D-195 | Extracted and reported exact failed-question lists (ID + query text) for recovered `150/75/50` suites from shard artifacts | 2026-02-28 | Parsed `results[]` where `pass=false` from `docs/evals/shards15_eval150_openai_all_20260227_fix12/`, `docs/evals/shards10_eval75_openai_all_20260227_fix8/`, `docs/evals/shards10_eval50_openai_all_20260227_fix7_full/`; counts `8`, `2`, `3` respectively | |
| | D-194 | Restored eval quality gate above 92% across all requested OpenAI suites and validated with targeted KB regressions | 2026-02-27 | `python3 -m pytest -q backend/app/test_unified_kb_core.py backend/app/test_knowledgebase_api.py` -> `96 passed, 9 warnings`; `docs/evals/shards15_eval150_openai_all_20260227_fix12/unified_kb_eval150_shards10_summary.json` -> `142/150 (94.7%)`; `docs/evals/shards10_eval75_openai_all_20260227_fix8/unified_kb_eval150_shards10_summary.json` -> `73/75 (97.3%)`; `docs/evals/shards10_eval50_openai_all_20260227_fix7_full/unified_kb_eval150_shards10_summary.json` -> `47/50 (94.0%)` | |
| | D-193 | Patched Auth0 audience normalization to prefer non-trailing-slash identifier first, fixing login callback failures caused by `https://masters-toolkit-api/` audience selection | 2026-02-27 | `frontend/src/auth/config.ts`, `frontend/src/auth/config.test.ts`, `backend/app/auth.py`, `backend/app/test_auth.py`; `cd frontend && npx vitest run src/auth/config.test.ts src/auth/errorUtils.test.ts` -> `13 passed`; `python3 -m pytest -q backend/app/test_auth.py` -> `21 passed`; `npm --prefix frontend run build` -> success | |
| | D-192 | Executed requested OpenAI shard evaluation batch (`150 + 75 + new 50`) in 10 groups each and published aggregate summaries + failed-ID lists | 2026-02-27 | `docs/evals/shards10_eval150_openai_all_20260227/unified_kb_eval150_shards10_summary.json` -> `119/150`; `docs/evals/shards10_eval75_openai_all_20260227/unified_kb_eval150_shards10_summary.json` -> `73/75`; `docs/evals/shards10_eval50_openai_all_20260227/unified_kb_eval150_shards10_summary.json` -> `23/50`; generated case pack `docs/evals/unified_kb_eval50_new_questions_router_helper_cases.json` | |
| | D-191 | Implemented first-pass single-workspace convergence: added unified `POTS Estimates + Intake` tab flow with estimator handoff + fresh-start session behavior, plus Knowledgebase action/global command to launch floating router helper from any page | 2026-02-27 | `frontend/src/pages/PotsEstimateIntake.tsx`, `frontend/src/pages/PotsSavingsEstimator.tsx`, `frontend/src/App.tsx`, `frontend/src/pages/UnifiedKnowledgebase.tsx`, `frontend/src/components/FloatingRouterHelper.tsx`; `npm --prefix frontend run build` -> success; `npm --prefix frontend run test` -> `18 files / 54 tests passed`; `python3 -m pytest -q backend/app/test_tab_final_pass_matrix.py backend/app/test_knowledgebase_api.py backend/app/routers/router_tab_smoke_test.py` -> `63 passed, 9 warnings` | |
| | D-190 | Completed local cross-tab validation sweep and fixed two test roadblocks (router compare fallback scenario stability + fast E2E skip on non-frontend base URLs) | 2026-02-27 | `python3 -m pytest -q backend/app` -> `357 passed`; `python3 -m pytest -q backend/app/test_tab_final_pass_matrix.py` -> `4 passed`; `npm --prefix frontend run test` -> `18 files / 54 tests`; `BASE_URL=http://127.0.0.1:4173/ node frontend/tmp/visual_audit/run_visual_audit.mjs` -> `21 runs, 0 issues`; `backend/app/routers/router_tab_smoke_test.py`; `frontend/e2e/upload.features.spec.ts` | |
| | D-189 | Removed `recommended` wording from Knowledgebase `Mode options` copy (`Auto` line) while preserving mode behavior text | 2026-02-27 | `frontend/src/pages/UnifiedKnowledgebase.tsx`; `npm --prefix frontend run build` -> success; `npm --prefix frontend run test` -> `18 files / 54 tests passed` | |
| | D-188 | Consolidated Knowledgebase answer metadata (`Why`, `Next action`, `Files`, `Sources`) into a single collapsed `Response details` accordion | 2026-02-27 | `frontend/src/pages/UnifiedKnowledgebase.tsx`; `npm --prefix frontend run build` -> success; `npm --prefix frontend run test` -> `18 files / 54 tests passed` | |
| | D-187 | Imported IR302 doc batch (quick guide/user manual/spec), rebuilt chunks, and added deterministic IR302 fact row with MSRP `$179.00` | 2026-02-27 | `backend/scripts/router_rag_import_corpus.py` mapping additions; `bash backend/scripts/router_rag_intake_pipeline.sh tmp/router_rag_intake_2026-02-27_ir302` -> `included=3`; reports `docs/reports/router_rag_intake_ir302_20260227TIR302.csv/.md`; `feb2026routers.csv` IR302 row; API probe (`router_docs`) returned `deterministic_router_fact_index` with `MSRP $179.00` | |
| | D-186 | Verified RV50X Feb2022 `-F` upload is already in canonical corpus via duplicate-hash mapping and added deterministic RV50X host-interface fact coverage (single Ethernet + RS-232 serial) with regression test | 2026-02-27 | `python3 backend/scripts/router_rag_import_corpus.py --source-dir /tmp/rv50x-intake-* --data-dir _RAG_Ready_KB_Organized ...` -> `skipped=1 (duplicate_hash -> Semtech-RV50X-Data Sheet-Feb2022.pdf)`; `feb2026routers.csv` RV50X row added; `backend/app/test_unified_kb_core.py` new RV50X host-interface test; `python3 -m pytest -q backend/app/test_unified_kb_core.py -k "router_fact_fast_path_from_csv or rv50x_host_interfaces_include_single_ethernet_and_serial"` -> `2 passed`; `python3 -m pytest -q backend/app/test_knowledgebase_api.py` -> `7 passed, 9 warnings` | |
| | D-185 | Generated and executed an ungraded 50-question Knowledgebase batch and captured full raw responses for manual review/scoring | 2026-02-27 | `docs/evals/kb_50_new_questions_results_2026-02-27.json`, `docs/evals/kb_50_new_questions_results_2026-02-27.md`; `python3 - <<'PY' ... FastAPI TestClient batch ... PY` -> `50/50` HTTP 200 in `~16.0s` | |
| | D-184 | Replaced Rapid Router primary logo asset with user-provided arrow-logo variant (asset-only swap; no layout logic changes) | 2026-02-27 | `frontend/public/rapid-router-primary-logo.png`; source `/Users/petedunn/Library/Containers/com.apple.Preview/Data/tmp/PreviewTemp-QpJOdK/Untitled Image 3.png`; `npm --prefix frontend run build` -> success | |
| | D-183 | Completed deep-dive multi-viewport render verification and patched residual overflow hotspots (BrandHeader mobile wrapping, Rapid setup-note URL wrapping, Rapid signature block overflow containment) | 2026-02-27 | `frontend/src/components/BrandHeader.tsx`, `frontend/src/pages/RapidRouter.tsx`, `frontend/src/pages/UnifiedKnowledgebase.tsx`, `frontend/src/pages/RouterKnowledgebase.tsx`, `frontend/src/pages/RoutersAssistant.tsx`; `npm --prefix frontend run build` -> success; `npm --prefix frontend run test` -> `54 passed`; visual audit `frontend/frontend/tmp/visual_audit/visual_audit_results.json` -> `21 runs, 0 failures, 0 visual issues` | |
| | D-182 | Executed phase-1 non-Rapid cross-tab UI polish pass (shared chat table renderer + sticky composers + Telco table readability + POTS side-rail/flow quick wins) | 2026-02-27 | `frontend/src/components/chat/markdownTableComponents.tsx`, `frontend/src/components/chat/ChatComposer.tsx`, `frontend/src/pages/UnifiedKnowledgebase.tsx`, `frontend/src/pages/RouterKnowledgebase.tsx`, `frontend/src/pages/RoutersAssistant.tsx`, `frontend/src/pages/TelcoCalculator.tsx`, `frontend/src/pages/PotsSavingsEstimator.tsx`, `frontend/src/pages/PotsIntake.tsx`; `npm --prefix frontend run build` -> success | |
| | D-181 | Added centered Rapid Router header logo block using new public asset and responsive framed hero treatment | 2026-02-27 | `frontend/src/pages/RapidRouter.tsx`, `frontend/public/rapid-router-primary-logo.png`; `npm --prefix frontend run build` -> success | |
| | D-180 | Completed non-Rapid tab UI/visual advisory sweep and produced per-tab advanced suggestion pack (no-code) | 2026-02-27 | Reviewed `frontend/src/App.tsx` + non-Rapid page components; recommendations delivered in chat; no runtime code changes | |
| | D-179 | Published helper comparison-table UX simplification checkpoint to both required remotes | 2026-02-27 | commit `1014b78`; `git push origin main` -> `087d265..1014b78`; `git push hf-fourtab main` -> `087d265..1014b78` | |
| | D-178 | Simplified helper comparison output to table-first UI and made CTA consistently explicit (`Click here for comparison table`) | 2026-02-27 | `frontend/src/components/FloatingRouterHelper.tsx`, `frontend/src/pages/RapidRouter.tsx`; `npm --prefix frontend run build` -> success | |
| | D-177 | Published router-ingestion checkpoint commit to both required remotes | 2026-02-27 | commit `8050c76`; `git push origin main` -> `21c3962..8050c76`; `git push hf-fourtab main` -> `21c3962..8050c76` | |
| | D-176 | Processed and ingested 7 new router PDFs into canonical Router RAG corpus with deterministic mapping, parse/chunk rebuild, and recall verification | 2026-02-27 | `backend/scripts/router_rag_import_corpus.py`; `docs/reports/router_rag_intake_2026-02-27_batch_import_report_20260227T005515Z.csv` (`included=7`, `skipped=0`); `python3 backend/scripts/router_rag_smoke.py --query ...` -> `5/5 pass` | |
| | D-175 | Published Rapid Router UI polish/readability checkpoint to both required remotes | 2026-02-27 | commit `ac92a10`; `git push origin main` -> `9897015..ac92a10`; `git push hf-fourtab main` -> `9897015..ac92a10` | |
| | D-174 | Executed full Rapid Router/floating-helper visual polish batch from advisory list (density toggle, staged submit CTA hierarchy, compact right rail, expandable fix list, helper long-answer details) | 2026-02-27 | `frontend/src/pages/RapidRouter.tsx`, `frontend/src/components/FloatingRouterHelper.tsx`; `npm --prefix frontend run build` -> success; `python3 -m pytest -q backend/app/rapid_router/test_rapid_router_core.py backend/app/test_rapid_router_api_shell.py` -> `45 passed, 9 warnings` | |
| | D-173 | Published generalized Ericsson/CradlePoint `...50` non-WiFi alias mapping checkpoint to both required remotes | 2026-02-26 | commit `b3420ef`; `git push origin main` -> `aa0ddb8..b3420ef`; `git push hf-fourtab main` -> `aa0ddb8..b3420ef` | |
| | D-172 | Generalized Ericsson/CradlePoint `...50` model alias rule so non-WiFi variants map to `...00` base models (for example `AER2250` -> `AER2200`) with matching variant notes/Wi-Fi override behavior | 2026-02-26 | `backend/app/routers/router_core.py`, `backend/app/routers/router_tab_smoke_test.py`; `python3 -m pytest -q backend/app/routers/router_tab_smoke_test.py` -> `52 passed, 9 warnings` | |
| | D-171 | Published Rapid Router rail-width + currency-alignment patch checkpoint to both required remotes | 2026-02-26 | commit `00ea9d8`; `git push origin main` -> `70f3a5c..00ea9d8`; `git push hf-fourtab main` -> `70f3a5c..00ea9d8` | |
| | D-170 | Tightened Rapid Router right-rail width and hardened per-card currency rendering with split `$` + amount columns for stable symbol alignment in both pricing and unit/subtotal blocks | 2026-02-26 | `frontend/src/pages/RapidRouter.tsx`; `npm --prefix frontend run build` -> success | |
| | D-169 | Unblocked POTS Intake line-inventory spreadsheet flow by enabling `Keep number / port needed?` toggles while keeping requirement enforcement | 2026-02-26 | `frontend/src/pages/PotsIntake.tsx`; `npm --prefix frontend run build` -> success | |
| | D-168 | Fixed Routers typo/parse reliability for inventory pastes: `12 RX60` no longer misparses as `12 R x60`, and likely transposed typo models now trigger confirmation (`RX60` -> `XR60`) before snapshot execution | 2026-02-26 | `backend/app/routers/router_core.py`, `backend/app/routers/router_tab_smoke_test.py`; `python3 -m pytest -q backend/app/routers/router_tab_smoke_test.py` -> `50 passed, 9 warnings` | |
| | D-167 | Fixed Routers inventory customer ownership carry-forward for `Customer has qty model, qty model...` syntax and added regression coverage for `Hoover has 200 IBR650, 12 AER2200, 16 MG51` | 2026-02-26 | `backend/app/routers/router_core.py`, `backend/app/routers/router_tab_smoke_test.py`; `python3 -m pytest -q backend/app/routers/router_tab_smoke_test.py` -> `47 passed, 9 warnings` | |
| | D-166 | Rebalanced Rapid Router column layout (narrower right rail, wider left router cards) and aligned dollar signs in both top pricing and `Unit/Subtotal` blocks | 2026-02-26 | `frontend/src/pages/RapidRouter.tsx`; `npm --prefix frontend run build` -> success | |
| | D-165 | Committed and pushed dollar-sign alignment fix for Rapid Router pricing rows to both required remotes | 2026-02-26 | commit `ae70744`; `git push origin main` -> `8584959..ae70744`; `git push hf-fourtab main` -> `8584959..ae70744` | |
| | D-164 | Implemented explicit dollar-sign vertical alignment in product-card pricing by using a shared fixed-width value column and left-aligned currency strings | 2026-02-26 | `frontend/src/pages/RapidRouter.tsx`; `npm --prefix frontend run build` -> success | |
| | D-163 | Committed and pushed follow-up laptop-width pricing-readability hardening to both required remotes | 2026-02-26 | commit `6312e7d`; `git push origin main` -> `fa21c6f..6312e7d`; `git push hf-fourtab main` -> `fa21c6f..6312e7d` | |
| | D-162 | Added follow-up pricing readability hardening: `xl` card density reduced to 3 columns and price rows use fixed value-column width for cleaner label/value separation | 2026-02-26 | `frontend/src/pages/RapidRouter.tsx`; `npm --prefix frontend run build` -> success | |
| | D-161 | Committed and pushed pricing-overlap readability hotfix to both required remotes | 2026-02-26 | commit `dfd9f34`; `git push origin main` -> `07fc56e..dfd9f34`; `git push hf-fourtab main` -> `07fc56e..dfd9f34` | |
| | D-160 | Fixed Rapid Router product-card pricing readability by replacing overlapping compact pricing grid with row-based flex layout (`MSRP`, `Standard FWA`, `Backup / Pooled`) | 2026-02-26 | `frontend/src/pages/RapidRouter.tsx`; `npm --prefix frontend run build` -> success | |
| | D-159 | Committed and pushed deep-dive Rapid Router/helper compliance checkpoint to both required remotes | 2026-02-26 | commit `2f4082e`; `git push origin main` -> `a957b5c..2f4082e`; `git push hf-fourtab main` -> `a957b5c..2f4082e` | |
| | D-158 | Ran deep-dive visual-compliance code audit for prior Rapid Router/helper requests and patched two remaining inconsistencies: removed legacy `Column focus`/`Copy CSV` controls from remaining table-reader path and unified generic compare label to `Device details` | 2026-02-26 | `frontend/src/pages/RapidRouter.tsx`, `backend/app/knowledgebase/core.py`; `npm --prefix frontend run build` -> success; `python3 -m pytest -q backend/app/rapid_router/test_rapid_router_core.py backend/app/test_rapid_router_api_shell.py` -> `45 passed`; `python3 -m pytest -q backend/app/test_unified_kb_core.py backend/app/test_knowledgebase_api.py` -> `88 passed` | |
| | D-157 | Added global floating Router helper across all tabs, moved/kept Rapid Router Find+Filter in right rail above Order status, enforced runtime HF visibility flags for admin/command-palette/system-status, and preserved `Activation verification` as top/default configuration option | 2026-02-26 | `frontend/src/components/FloatingRouterHelper.tsx`, `frontend/src/App.tsx`, `frontend/src/components/BrandHeader.tsx`, `frontend/src/pages/RapidRouter.tsx`, `backend/app/main.py`; `npm --prefix frontend run build` -> success; `python3 -m pytest -q backend/app/rapid_router/test_rapid_router_core.py backend/app/test_rapid_router_api_shell.py` -> `45 passed`; `python3 -m pytest -q backend/app/test_unified_kb_core.py backend/app/test_knowledgebase_api.py` -> `88 passed` | |
| | D-156 | Hardened Auth0 token finalization flow to prevent premature timeout/config errors during active token setup; added preferred-audience persistence/rotation and timeout-specific guidance | 2026-02-26 | `frontend/src/auth/AuthGate.tsx`; `npm --prefix frontend run build` -> success; `cd frontend && npx vitest run src/auth/config.test.ts src/auth/errorUtils.test.ts` -> `12 passed`; `python3 -m pytest -q backend/app/test_auth.py` -> `20 passed` | |
| | D-155 | Published requested checkpoint commit/push for current CAPTCHA + Rapid Router UX simplification workspace state to both remotes | 2026-02-26 | Commit/push executed on `main` after green `npm --prefix frontend run build` and targeted `python3 -m pytest -q backend/app/rapid_router/test_rapid_router_core.py backend/app/test_rapid_router_api_shell.py` | |
| | D-154 | Executed Rapid Router 10-point UX refactor in 3 phases: compact 5-step header, staged review/submit actions, selection-first cards, reduced action noise, single persistent fix list, helper readability upgrades, admin moved to modal, and UX acceptance targets | 2026-02-26 | `frontend/src/pages/RapidRouter.tsx`; `npm --prefix frontend run build` -> success; `python3 -m pytest -q backend/app/rapid_router/test_rapid_router_core.py backend/app/test_rapid_router_api_shell.py` -> `44 passed, 9 warnings` | |
| | D-153 | Implemented scoped CAPTCHA challenge/verify flow and enforced it on Rapid Router submit + Knowledgebase/POTS message endpoints, with frontend one-time gate UI and token-expiry recovery | 2026-02-26 | `backend/app/main.py`, `frontend/src/utils/captchaGate.ts`, `frontend/src/components/CaptchaGateCard.tsx`, `frontend/src/pages/UnifiedKnowledgebase.tsx`, `frontend/src/pages/PotsAssistant.tsx`, `frontend/src/pages/RapidRouter.tsx`; `python3 -m pytest -q backend/app/test_rapid_router_api_shell.py backend/app/test_knowledgebase_api.py backend/app/test_chat_guidance_api.py backend/app/rapid_router/test_rapid_router_core.py` -> `57 passed, 9 warnings`; `npm --prefix frontend run build` -> success | |
| | D-152 | Produced critical UX/readability audit for Rapid Router and defined a prioritized 10-point simplification game plan | 2026-02-26 | Audit reviewed key dense zones in `frontend/src/pages/RapidRouter.tsx` (catalog controls, product cards, right rail, order/options/validation); plan captured in current turn response and tracked as `T-067` | |
| | D-151 | Aligned Rapid Router product-card quantity and subtotal controls to a shared bottom baseline using flex anchoring + stabilized placeholder spacing | 2026-02-26 | `frontend/src/pages/RapidRouter.tsx` (card `h-full` flex-column, `mt-auto` quantity/pricing region, backup plan-code placeholder, shipping-note `min-h`); `cd frontend && npm run build` -> success | |
| | D-150 | Investigated Space boot/wake latency and removed avoidable Rapid Router restart overhead by skipping seed-product rebuild when no seeded-ID backfill is needed | 2026-02-25 | `backend/app/rapid_router/core.py` (missing-ID gate around `_seed_products()`), `backend/app/rapid_router/test_rapid_router_core.py` (new guard test); `python3 -m pytest -q backend/app/rapid_router/test_rapid_router_core.py` -> `20 passed`; `python3 -m pytest -q backend/app/test_rapid_router_api_shell.py` -> `23 passed`; startup probes show second-run startup `2882ms` vs first-run `6367ms` in persisted-storage scenario | |
| | D-149 | Prepared/published FAQ helper routing fix checkpoint to both required remotes on request | 2026-02-25 | Publish scope: `backend/app/knowledgebase/core.py`, `backend/app/test_unified_kb_core.py`, `docs/dev/*`, `docs/faq/FAQ_ongoing_candidates.csv`; pre-publish validation: `python3 -m pytest -q backend/app/test_unified_kb_core.py` (`81 passed`), `python3 -m pytest -q backend/app/test_knowledgebase_api.py` (`7 passed, 9 warnings`) | |
| | D-148 | Fixed Rapid Router helper FAQ access so generic concept asks (e.g., network slicing) prioritize FAQ fast-lane with FAQ citation while preserving selected-model compare flows | 2026-02-25 | `backend/app/knowledgebase/core.py` (FAQ query context stripping + router-doc FAQ-first branch for RR helper generic asks), `backend/app/test_unified_kb_core.py` (stronger regression assertions for FAQ route/citation); `python3 -m pytest -q backend/app/test_unified_kb_core.py` -> `81 passed`; `python3 -m pytest -q backend/app/test_knowledgebase_api.py` -> `7 passed, 9 warnings` | |
| | D-147 | Converted `Shipping / Configuration / Payment` order-options columns into separate bubble cards for consistent visual grouping | 2026-02-25 | `frontend/src/pages/RapidRouter.tsx` (`rr-order-options` three-column wrappers updated to rounded bordered panels); `cd frontend && npm run build` -> success | |
| | D-146 | Normalized Rapid Router card alignment by reserving fixed document and setup-note spacing when optional docs are missing | 2026-02-25 | `frontend/src/pages/RapidRouter.tsx` (two fixed doc slots with invisible placeholders; setup-note placeholder block for absent notes); `cd frontend && npm run build` -> success | |
| | D-145 | Fixed Rapid Router address-validation suggestion truncation by using full Census matched street line with structured fallback | 2026-02-25 | `backend/app/rapid_router/core.py` (`_street_from_census_match`, `validate_us_address` mapping), `backend/app/rapid_router/test_rapid_router_core.py` (2 new regression tests); `python3 -m pytest -q backend/app/rapid_router/test_rapid_router_core.py backend/app/test_rapid_router_api_shell.py` -> `42 passed, 9 warnings` | |
| | D-144 | Removed duplicate build timestamp display from header (kept single title-area build label) | 2026-02-25 | `frontend/src/components/BrandHeader.tsx` (deleted sticky-toolbar build badge); `cd frontend && npm run build` -> success | |
| | D-143 | Expanded Rapid Router helper readability (wider rail, larger typography, fuller message/table rendering) while preserving helper logic | 2026-02-25 | `frontend/src/pages/RapidRouter.tsx` (grid rail width, helper card/chat sizing, assistant full-width bubbles, larger inline comparison-table preview and CTA); `cd frontend && npm run build` -> success | |
| | D-142 | Added PRM lead mode radios (`enter_now` vs `masters_reverse`) with conditional validation and mode-aware order outputs | 2026-02-25 | `frontend/src/pages/RapidRouter.tsx` (radio controls, conditional PRM input/validation, payload + draft updates), `backend/app/rapid_router/core.py` (mode normalization/validation + PDF/email label rendering), `backend/app/rapid_router/test_rapid_router_core.py` (new reverse-PRM success test); `python3 -m pytest -q backend/app/rapid_router/test_rapid_router_core.py` -> `17 passed`; `python3 -m pytest -q backend/app/test_rapid_router_api_shell.py` -> `23 passed`; `cd frontend && npm run build` -> success | |
| | D-141 | Simplified helper comparison-table controls to a single prominent `Open table reader` CTA for better visibility/selection | 2026-02-25 | `frontend/src/pages/RapidRouter.tsx` (`HelperMarkdownTable`: removed inline expand/copy strip actions, upgraded single CTA styling, retained modal CSV copy); `cd frontend && npm run build` -> success | |
| | D-140 | Collapsed Rapid Router catalog search/filter toolbar under a default-closed accordion and kept command-focus compatibility | 2026-02-25 | `frontend/src/pages/RapidRouter.tsx` (`catalogFiltersOpen` state, `details/summary` wrapper, `rapid_router:focus_search` auto-open); `cd frontend && npm run build` -> success | |
| | D-139 | Fixed Rapid Router helper context-intent regression so generic FAQ/concept asks are not forced into catalog fast-path tables | 2026-02-25 | `backend/app/knowledgebase/core.py` (intent detection now uses primary message; selected-context matching remains explicit), `backend/app/test_unified_kb_core.py` (new regression tests); `python3 -m pytest -q backend/app/test_unified_kb_core.py -k "rapid_router_helper_context"` -> `4 passed`; `python3 -m pytest -q backend/app/test_unified_kb_core.py` -> `81 passed`; `python3 -m pytest -q backend/app/test_knowledgebase_api.py` -> `7 passed` | |
| | D-138 | Triaged restart-time MuPDF `FT_New_Memory_Face` warning as non-blocking and localized source PDF (`atel_re600_manual.pdf`) | 2026-02-25 | Repro command: `python3` seed-PDF scan with PyMuPDF over `backend/app/rapid_router/seed/assets/*.pdf`; warning only on `atel_re600_manual.pdf`; extraction remained successful (`ok pages=5 chars=4261`) | |
| | D-137 | Committed and pushed all outstanding workspace changes to both required remotes on user request | 2026-02-25 | Modified-set publish including `frontend/src/App.tsx`, `backend/app/rapid_router/seed/assets/atel_w01_u.png`, `docs/dev/*`, `docs/faq/FAQ_ongoing_candidates.csv`; remotes `origin/main`, `hf-fourtab/main` | |
| | D-136 | Triaged HF env `Missing` badges and confirmed listed variables are largely optional defaults/presence diagnostics (not immediate runtime failures) | 2026-02-25 | Code review: `frontend/src/components/HealthStatusModal.tsx`, `backend/app/main.py` (`/api/health`, tab/env defaults), `backend/app/router_rag/core.py`; guidance delivered with must-set vs optional list | |
| | D-135 | Set `Rapid Router` as default landing tab by updating initial tab state, storage-key version, and default flag visibility | 2026-02-25 | `frontend/src/App.tsx`; `cd frontend && npm run build`; `cd frontend && npx vitest run --pool=threads --maxWorkers=1` (`18 files`, `54 tests`, all passed) | |
| | D-134 | Hid `Master’s AI` and `POTS Replacement Q&A` toolbox tabs from UI while keeping underlying code paths intact | 2026-02-25 | `frontend/src/App.tsx`; `cd frontend && npm run build`; `cd frontend && npx vitest run --pool=threads --maxWorkers=1` (`18 files`, `54 tests`, all passed) | |
| | D-133 | Replaced incorrect `ATEL W01-U` seed photo with corrected device image and verified Rapid Router core regression suite | 2026-02-25 | `backend/app/rapid_router/seed/assets/atel_w01_u.png`; `python3 -m pytest -q backend/app/rapid_router/test_rapid_router_core.py` -> `16 passed, 6 warnings` | |
| | D-132 | Prepared and published Rapid Router helper accessibility/table-reader fix bundle checkpoint to both required remotes | 2026-02-25 | `frontend/src/pages/RapidRouter.tsx`, `docs/dev/*`; push targets: `origin/main`, `hf-fourtab/main` | |
| | D-131 | Fixed helper rail accessibility by activating two-column/sticky behavior at `lg` and ordering rail before main form on single-column layouts | 2026-02-25 | `frontend/src/pages/RapidRouter.tsx`; `cd frontend && npm run build`; `cd frontend && npx vitest run --pool=threads --maxWorkers=1` (`18 files`, `54 tests`, all passed) | |
| | D-130 | Added helper table-reader `Column focus` dropdown with per-column show/hide (first column pinned) for wide comparison-table readability | 2026-02-25 | `frontend/src/pages/RapidRouter.tsx`; `cd frontend && npm run build`; `cd frontend && npx vitest run --pool=threads --maxWorkers=1` (`18 files`, `54 tests`, all passed) | |
| | D-129 | Fixed Rapid Router helper comparison-table usability: always-visible `Open table reader`, stronger inline expand behavior, and sticky first-column/header context | 2026-02-25 | `frontend/src/pages/RapidRouter.tsx`; `cd frontend && npm run build`; `cd frontend && npx vitest run --pool=threads --maxWorkers=1` (`18 files`, `54 tests`, all passed) | |
| | D-128 | Reordered Rapid Router right rail (`Router helper` above `Order status`) and de-cluttered both cards with shorter copy/chips/actions | 2026-02-25 | `frontend/src/pages/RapidRouter.tsx`; `cd frontend && npm run build` passed | |
| | D-127 | Committed and pushed startup integrity FAQ/router CSV fix bundle to both required remotes | 2026-02-25 | commit `914699f`; `git push origin main` -> `13886dc..914699f`; `git push hf-fourtab main` -> `13886dc..914699f` | |
| | D-126 | Published operator runbook for Docker rebuild/redeploy and post-deploy browser cache reset after hashed-asset 404 | 2026-02-25 | Guidance delivered for commit/push to `origin` + `hf-fourtab`, wait for HF rebuild complete, then hard refresh/private window to clear stale bundle references | |
| | D-125 | Fixed startup integrity false alarms in Docker by hardening repo/app path resolution and packaging FAQ corpus in image | 2026-02-25 | `backend/app/knowledgebase/core.py` (root/app resolver), `Dockerfile` (`COPY docs/faq /app/docs/faq`), `backend/app/test_unified_kb_core.py` (new resolver tests); `python3 -m pytest -q backend/app/test_unified_kb_core.py` -> `79 passed`; startup integrity probe: `faq_entries=551`, `router_fact_csv_count=3`, `warnings=[]` | |
| | D-124 | Hardened Auth0 login finalization against silent token timeout and added explicit `offline_access` scopes | 2026-02-25 | `frontend/src/main.tsx`, `frontend/src/auth/AuthGate.tsx`; `cd frontend && npm run build` succeeded; `cd frontend && npx vitest run --pool=threads --maxWorkers=1` -> `18 passed` | |
| | D-123 | Delivered transfer-oriented one-to-two-page architecture/stack summary for incoming project owner | 2026-02-25 | Summary prepared from current repo state and ops docs (`README.md`, `backend/app/main.py`, `docs/dev/open_tasks.md`, workflow files) | |
| | D-122 | Committed and pushed Rapid Router eval25 suite + run-record docs checkpoint to both required remotes | 2026-02-25 | commit `ce1860a`; `git push origin main` -> `7cbce22..ce1860a`; `git push hf-fourtab main` -> `7cbce22..ce1860a` | |
| | D-121 | Diagnosed shard `1-5` failure in Rapid Router eval25 as MSRP-omission semantic miss on W1850 clarify prompt | 2026-02-25 | `jq '.results[] \| select(.id==3)' docs/evals/shards5_rapidrouter25/unified_kb_eval150_1_5.json` -> `pass=false`, `final_score=81.8`, `retrieval_mode=deterministic_router_price_clarify_fast`; semantic issues show clarification-only response without requested MSRP payload | |
| | D-120 | Created Rapid Router-focused 25-case eval suite and executed shard-5 run | 2026-02-25 | Added `docs/evals/unified_kb_eval25_rapid_router_cases.json`; `cd backend && CHUNK_SIZE=5 START_ID=1 END_ID=25 CASE_TIMEOUT_S=30 OPENAI_MODEL=gpt-5.2 CASES_PATH=../docs/evals/unified_kb_eval25_rapid_router_cases.json OUT_DIR=../docs/evals/shards5_rapidrouter25 TREND_FILE=../docs/evals/shards5_rapidrouter25/unified_kb_eval25_rapidrouter_trend.json ./scripts/run_unified_kb_eval150_chunks.sh` -> `24/25`, failed IDs `[3]`, avg `23.31ms`, p95 `30.33ms` | |
| | D-119 | Re-ran full sharded suites on demand and refreshed live metrics for current baseline reporting | 2026-02-25 | `cd backend && CHUNK_SIZE=10 START_ID=1 END_ID=150 OPENAI_MODEL=gpt-5.2 ./scripts/run_unified_kb_eval150_chunks.sh` -> `150/150`, avg `900.47ms`, p95 `6316.81ms`; `cd backend && CHUNK_SIZE=5 START_ID=1 END_ID=75 CASE_TIMEOUT_S=30 OPENAI_MODEL=gpt-5.2 CASES_PATH=../docs/evals/unified_kb_eval75_msrp_verizon_cases.json OUT_DIR=../docs/evals/shards5_eval75 TREND_FILE=../docs/evals/shards5_eval75/unified_kb_eval75_trend.json ./scripts/run_unified_kb_eval150_chunks.sh` -> `74/75` (failed IDs `[75]`), avg `200.59ms`, p95 `465.47ms` | |
| | D-118 | Re-ran all sharded Unified KB suites (150 + 75) and captured updated aggregate baselines | 2026-02-25 | `cd backend && CHUNK_SIZE=10 START_ID=1 END_ID=150 OPENAI_MODEL=gpt-5.2 ./scripts/run_unified_kb_eval150_chunks.sh` -> `150/150`; `cd backend && CHUNK_SIZE=5 START_ID=1 END_ID=75 CASE_TIMEOUT_S=30 OPENAI_MODEL=gpt-5.2 CASES_PATH=../docs/evals/unified_kb_eval75_msrp_verizon_cases.json OUT_DIR=../docs/evals/shards5_eval75 TREND_FILE=../docs/evals/shards5_eval75/unified_kb_eval75_trend.json ./scripts/run_unified_kb_eval150_chunks.sh` -> `74/75` (failed IDs `[75]`) | |
| | D-117 | Triaged Rapid Router test warnings as non-blocking and captured warning-hygiene follow-up task | 2026-02-25 | `python3 -m pytest -q backend/app/rapid_router/test_rapid_router_core.py backend/app/test_rapid_router_api_shell.py` -> `39 passed, 9 warnings`; sources: `reportlab` deprecation + SWIG/PyMuPDF import warnings | |
| | D-116 | Committed and pushed current CR602 + T-059 + router alias-normalization batch to both required remotes | 2026-02-25 | commit `b87d5d7`; `git push origin main` -> `8d77217..b87d5d7`; `git push hf-fourtab main` -> `8d77217..b87d5d7` | |
| | D-115 | Added deterministic router model alias normalization for hyphen/punctuation variants (`MAX-BR1-PRO-5G`, `XR_60`) and regression coverage | 2026-02-25 | `backend/app/knowledgebase/core.py`, `backend/app/test_unified_kb_core.py`; `python3 -m pytest -q backend/app/test_unified_kb_core.py` -> `77 passed` | |
| | D-114 | Implemented `T-059` Rapid Router CSV ingestion validator + dry-run preview/apply admin path with schema/lint checks and duplicate ID/SKU protection | 2026-02-25 | `backend/app/rapid_router/core.py`, `backend/app/main.py`, `backend/app/rapid_router/test_rapid_router_core.py`, `backend/app/test_rapid_router_api_shell.py`; `python3 -m pytest -q backend/app/rapid_router/test_rapid_router_core.py backend/app/test_rapid_router_api_shell.py` -> `39 passed` | |
| | D-113 | Prepared detailed new-thread bootstrap prompt aligned to AGENTS + required dev docs + live working tree state | 2026-02-25 | Prompt package delivered in chat; continuity anchors: `docs/dev/session_handoff.md`, `docs/dev/decisions.md`, `docs/dev/open_tasks.md` | |
| | D-112 | Produced ranked 20-item update backlog with complexity/value/risk scoring and identified top 5 implementation targets | 2026-02-25 | Planning output delivered in chat; promoted top-5 into `T-057`, `T-059`, `T-060`, `T-061`, `T-062` | |
| | D-111 | Added `InHand Networks CR602` to Rapid Router seeded catalog with bundled datasheet/manual/image assets and regression assertions | 2026-02-25 | `backend/app/rapid_router/core.py`, `backend/app/rapid_router/test_rapid_router_core.py`, `backend/app/rapid_router/seed/assets/inhand_cr602.png`, `backend/app/rapid_router/seed/assets/inhand_cr602_datasheet.pdf`, `backend/app/rapid_router/seed/assets/inhand_cr602_user_manual.pdf`; `python3 -m pytest -q backend/app/rapid_router/test_rapid_router_core.py` -> `13 passed` | |
| | D-110 | Committed and pushed helper non-store fallback fix checkpoint to both remotes | 2026-02-24 | commit `df60837`; `git push origin main` -> `8f805fb..df60837`; `git push hf-fourtab main` -> `8f805fb..df60837` | |
| | D-109 | Fixed Rapid Router helper model-compare fallback: explicit non-store model asks now bypass store fast path and fall back to standard router-doc compare/spec logic with non-orderable notice | 2026-02-24 | `backend/app/knowledgebase/core.py`, `backend/app/test_unified_kb_core.py`; `cd backend && python3 -m pytest -q app/test_unified_kb_core.py app/test_knowledgebase_api.py app/rapid_router/test_rapid_router_core.py app/test_rapid_router_api_shell.py` -> `117 passed` | |
| | D-108 | Committed and pushed T-058 + Rapid Router BoBo/PRM validation hardening checkpoint to both remotes | 2026-02-24 | commit `7a884c8`; `git push origin main` -> `7215527..7a884c8`; `git push hf-fourtab main` -> `7215527..7a884c8` | |
| | D-107 | Enforced strict PRM Lead format (`EL-` + exactly 7 digits) with fixed-prefix UI control, backend validation, admin-config validation, and store migration | 2026-02-24 | `frontend/src/pages/RapidRouter.tsx`, `backend/app/rapid_router/core.py`, `backend/app/rapid_router/test_rapid_router_core.py`, `backend/app/test_rapid_router_api_shell.py`, `backend/app/test_tab_final_pass_matrix.py`; `python3 -m py_compile backend/app/rapid_router/core.py`; `cd frontend && npm run build`; `cd backend && python3 -m pytest -q app/rapid_router/test_rapid_router_core.py app/test_rapid_router_api_shell.py app/test_tab_final_pass_matrix.py` -> `38 passed` | |
| | D-106 | Added BoBo-conditional required payment fields (`Company Name`, `SPOC`, `ECPD/VZ Account Number`) across Rapid Router UI + submit validation + persisted order/email/PDF outputs | 2026-02-24 | `frontend/src/pages/RapidRouter.tsx`, `backend/app/rapid_router/core.py`, `backend/app/rapid_router/test_rapid_router_core.py`, `backend/app/test_tab_final_pass_matrix.py`; `cd backend && python3 -m pytest -q app/rapid_router/test_rapid_router_core.py app/test_tab_final_pass_matrix.py` -> `16 passed`; `cd backend && python3 -m pytest -q app/test_rapid_router_api_shell.py` -> `21 passed`; `cd frontend && npm run build` passed | |
| | D-105 | Implemented `T-058` Rapid Router catalog integration into Unified Knowledgebase (`router_docs`) with provider injection, deterministic catalog fast paths (list/price/feature/compare), cache-fingerprint wiring, and fallback-to-router-fact behavior | 2026-02-24 | `backend/app/knowledgebase/core.py`, `backend/app/main.py`, `backend/app/test_unified_kb_core.py`, `backend/app/test_knowledgebase_api.py`; `cd backend && python3 -m pytest -q app/test_unified_kb_core.py app/test_knowledgebase_api.py app/rapid_router/test_rapid_router_core.py` -> `92 passed`; manual API check via `TestClient` confirmed retrieval mode `deterministic_rapid_router_catalog_list_fast` | |
| | D-104 | Added full-screen comparison-table reader for Rapid Router helper messages (with sticky table headers, improved cell readability, and CSV copy retained) | 2026-02-24 | `frontend/src/pages/RapidRouter.tsx`; `cd frontend && npm run build` passed | |
| | D-103 | Committed and pushed current Rapid Router + auth stabilization working tree to both remotes | 2026-02-24 | commit `44c021b`; `git push origin main` and `git push hf-fourtab main` succeeded | |
| | D-102 | Updated auth smoke Playwright helper to skip quickly in non-auth local runtime (eliminated false 60s timeouts) | 2026-02-24 | `frontend/e2e/auth.spec.ts`; `cd frontend && E2E_DISABLE_WEBSERVER=true E2E_BASE_URL=http://127.0.0.1:7860 npx playwright test e2e/auth.spec.ts` -> `6 skipped` | |
| | D-101 | Fixed AuthGate refresh-token recovery-flag lifecycle for deterministic re-login/consent recovery behavior | 2026-02-24 | `frontend/src/auth/AuthGate.tsx`; `cd frontend && npm run build`; `cd frontend && npx vitest run --pool=threads --maxWorkers=1` -> `18 passed` | |
| | D-100 | Hardened AuthGate timeout env parsing (`VITE_AUTH_FINALIZING_WATCHDOG_MS`, `VITE_AUTH_SILENT_TIMEOUT_MS`) against quoted/malformed values | 2026-02-24 | `frontend/src/auth/AuthGate.tsx`; `cd frontend && npm run build`; `python3 -m pytest -q backend/app/test_auth.py backend/app/test_rapid_router_api_shell.py backend/app/rapid_router/test_rapid_router_core.py` -> `52 passed` | |
| | D-099 | Implemented full Rapid Router UX cleanup bundle (compact order rail, completion chips, jump-to-error links, card/table toggle, shipping indicators, collapsed sections, review modal, helper CSV copy/spacing, session-draft badges, mobile sticky CTA) | 2026-02-24 | `frontend/src/pages/RapidRouter.tsx`; `cd frontend && npm run build` passed; `python3 -m pytest -q backend/app/rapid_router/test_rapid_router_core.py` -> `11 passed`; `python3 -m pytest -q backend/app/test_rapid_router_api_shell.py` -> `21 passed` | |
| | D-098 | Prepared prioritized UI/UX improvement recommendations for Rapid Router/toolbox cleanup | 2026-02-24 | Recommendation package delivered in chat; implementation task tracked as `T-056` | |
| | D-097 | Added search-driven auto-expand behavior for collapsed Support Toolbox | 2026-02-24 | `frontend/src/App.tsx`; `cd frontend && npm run build` passed | |
| | D-096 | Collapsed Support Toolbox cards behind a closed-by-default accordion toggle (`Open toolbox` / `Hide toolbox`) | 2026-02-24 | `frontend/src/App.tsx`; `cd frontend && npm run build` passed | |
| | D-095 | Made `Ordering assistant` and `Router selection helper` follow together as a single sticky right-column block | 2026-02-24 | `frontend/src/pages/RapidRouter.tsx`; `cd frontend && npm run build` passed | |
| | D-094 | Updated ground shipping policy to `$9.99` with automatic waiver for Standard FWA items; added migration and order/PDF/email shipping breakdown fields | 2026-02-24 | `backend/app/rapid_router/core.py`, `frontend/src/pages/RapidRouter.tsx`; `python3 -m pytest -q backend/app/rapid_router/test_rapid_router_core.py` -> `11 passed`; `python3 -m pytest -q backend/app/test_rapid_router_api_shell.py` -> `21 passed`; `cd frontend && npm run build` passed | |
| | D-093 | Updated `Peplink MAX BR1 Pro 5G` MSRP to `$999.00` and added startup migration to correct stale/null persisted MSRP values | 2026-02-24 | `backend/app/rapid_router/core.py`, `backend/app/rapid_router/test_rapid_router_core.py`; `python3 -m pytest -q backend/app/rapid_router/test_rapid_router_core.py` -> `10 passed` | |
| | D-092 | Fixed Router selection helper table rendering by switching assistant bubbles to markdown + added per-table expand/collapse UI | 2026-02-24 | `frontend/src/pages/RapidRouter.tsx`; `cd frontend && npm run build` passed | |
| | D-091 | Sorted routers by `price_primary` ascending within each `4G` and `5G` section on Rapid Router | 2026-02-24 | `frontend/src/pages/RapidRouter.tsx`; `cd frontend && npm run build` passed | |
| | D-090 | Grouped Rapid Router product catalog into distinct `4G` then `5G` sections with visual differentiation | 2026-02-24 | `frontend/src/pages/RapidRouter.tsx`; `cd frontend && npm run build` passed | |
| | D-089 | Committed and pushed Rapid Router reload-reset behavior to both required remotes | 2026-02-24 | commit `a469363`; pushed `origin/main` and `hf-fourtab/main` | |
| | D-088 | Changed Rapid Router draft persistence to in-memory only so full reload clears quantities/details while in-app tab switches retain state | 2026-02-24 | `frontend/src/pages/RapidRouter.tsx`; `cd frontend && npm run build` passed | |
| | D-087 | Verified `ATEL RE600 (Black)` image was already correct (no-op fix) | 2026-02-24 | `backend/app/rapid_router/seed/assets/atel_re600_black.png`; SHA-256 matched `Screenshot 2026-02-24 at 11.13.41 AM.png` | |
| | D-086 | Corrected `Inseego Wavemaker FX4210` card image (replaced mismatched seed asset with proper Inseego device art) | 2026-02-24 | `backend/app/rapid_router/seed/assets/inseego_wavemaker_fx4210.png`; `python3 -m pytest -q backend/app/rapid_router/test_rapid_router_core.py` -> `9 passed` | |
| | D-085 | Corrected swapped ATEL image assignments: `V810AD` now uses single-antenna tabletop image and `RE600` uses multi-antenna image | 2026-02-24 | `backend/app/rapid_router/seed/assets/atel_v810ad.png`, `backend/app/rapid_router/seed/assets/atel_re600_black.png`; `python3 -m pytest -q backend/app/rapid_router/test_rapid_router_core.py` -> `9 passed` | |
| | D-084 | Applied ATEL W01-U image hotfix as explicit seed-asset rewrite to corrected user-provided image | 2026-02-24 | `backend/app/rapid_router/seed/assets/atel_w01_u.png`; `python3 -m pytest -q backend/app/rapid_router/test_rapid_router_core.py` -> `9 passed` | |
| | D-083 | Prepared single-commit deployment package for Rapid Router new-device expansion (catalog + assets + migration + tests + template) and executed commit/push workflow | 2026-02-24 | `backend/app/rapid_router/core.py`, `backend/app/rapid_router/test_rapid_router_core.py`, `backend/app/rapid_router/seed/assets/*`, `docs/templates/rapid_router_new_devices_upload_template.csv` | |
| | D-082 | Replaced new Rapid Router device photos with exact user-supplied attachment images and enabled startup refresh for existing runtime copies | 2026-02-24 | `backend/app/rapid_router/seed/assets/peplink_b_one_5g.png`, `backend/app/rapid_router/seed/assets/atel_w01_u.png`, `backend/app/rapid_router/seed/assets/atel_pw550.png`, `backend/app/rapid_router/seed/assets/atel_re600_black.png`, `backend/app/rapid_router/seed/assets/atel_v810ad.png`, `backend/app/rapid_router/seed/assets/atel_v810vd_bp.png`, `backend/app/rapid_router/seed/assets/inseego_wavemaker_fx4210.png`, `backend/app/rapid_router/core.py`; `python3 -m pytest -q backend/app/rapid_router/test_rapid_router_core.py` -> `9 passed` | |
| | D-081 | Added 7 new Rapid Router devices (Peplink B One 5G, ATEL W01-U/PW550/RE600/V810AD/V810VD-BP, Inseego Wavemaker FX4210) with seeded assets and automatic backfill for existing stores | 2026-02-24 | `backend/app/rapid_router/core.py`, `backend/app/rapid_router/seed/assets/*`, `backend/app/rapid_router/test_rapid_router_core.py`; `python3 -m pytest -q backend/app/rapid_router/test_rapid_router_core.py` -> `9 passed` | |
| | D-073 | Simplified Rapid Router Ordering Assistant to compact status card with fewer actions and reduced visual complexity | 2026-02-24 | `frontend/src/pages/RapidRouter.tsx`; `cd frontend && npm run build` passed | |
| | D-074 | Added Rapid Router MSRP support (store schema + product cards + admin add-product MSRP input) | 2026-02-24 | `backend/app/rapid_router/core.py`, `backend/app/main.py`, `frontend/src/pages/RapidRouter.tsx`; `cd frontend && npm run build` passed | |
| | D-075 | Added workbook-backed required Masters contact dropdown and routed selected contact into order-email `To` recipients | 2026-02-24 | `backend/app/rapid_router/seed/masters_contacts.xlsx`, `backend/app/rapid_router/core.py`, `frontend/src/pages/RapidRouter.tsx`; `cd backend && python3 -m pytest app/rapid_router/test_rapid_router_core.py app/test_rapid_router_api_shell.py app/test_tab_final_pass_matrix.py -q` -> `29 passed` | |
| | D-076 | Added required Configuration Options with advanced-task validation and per-router pricing rolled into totals/PDF/email | 2026-02-24 | `backend/app/rapid_router/core.py`, `frontend/src/pages/RapidRouter.tsx`; `cd backend && python3 -m pytest app/rapid_router/test_rapid_router_core.py app/test_rapid_router_api_shell.py app/test_tab_final_pass_matrix.py -q` -> `29 passed` | |
| | D-077 | Committed and pushed Rapid Router MSRP/contact/configuration expansion to both required remotes | 2026-02-24 | commit `176ff8f`; pushed `origin/main` and `hf-fourtab/main` | |
| | D-078 | Remapped `Peplink MAX BR1 Pro 5G` to use the current `MAX BR1 Mini (Wi-Fi)` photo and added startup migration for persisted stores | 2026-02-24 | `backend/app/rapid_router/core.py`, `backend/app/rapid_router/test_rapid_router_core.py`; `cd backend && python3 -m pytest app/rapid_router/test_rapid_router_core.py -q` -> `7 passed` | |
| | D-079 | Replaced `MAX BR1 Mini (Wi-Fi)` product image with requested photo and enforced startup refresh for existing runtime assets | 2026-02-24 | `backend/app/rapid_router/seed/assets/peplink_br1_mini_5g_wifi.png`, `backend/app/rapid_router/core.py`, `backend/app/rapid_router/test_rapid_router_core.py`; `cd backend && python3 -m pytest app/rapid_router/test_rapid_router_core.py -q` -> `8 passed` | |
| | D-080 | Added reusable CSV template for Rapid Router new-device intake with MSRP and pricing columns | 2026-02-24 | `docs/templates/rapid_router_new_devices_upload_template.csv` | |
| | D-072 | Fixed mobile overlap where `Ordering Assistant` covered `Router selection helper` by making side panel sticky only at `lg+` breakpoints | 2026-02-24 | `frontend/src/components/ConversationalSidePanel.tsx`; `cd frontend && npm run build` passed | |
| | D-071 | Committed and pushed Rapid Router helper chatbot fast-path to both required remotes | 2026-02-24 | commit `6c6f7dc`; pushed `origin/main` and `hf-fourtab/main` | |
| | D-070 | Implemented Rapid Router in-page helper chatbot using existing knowledgebase endpoint in `router_docs` mode | 2026-02-24 | `frontend/src/pages/RapidRouter.tsx`; `cd frontend && npm run build` passed | |
| | D-063 | Committed and pushed item 1-5 changes + eval150 rerun results to both required remotes | 2026-02-24 | commit `54a654c`; pushed `origin/main` and `hf-fourtab/main` | |
| | D-062 | Re-ran full unified eval150 (shards10, OpenAI semantic) after implementing items 1-5; achieved zero failures | 2026-02-24 | `docs/evals/shards10/unified_kb_eval150_shards10_summary.json` (`150/150`, `100.0%`, failed IDs `[]`) | |
| | D-061 | Added resilient local `/tmp` corpus staging in shard runner with manifest fallback generation from chunks | 2026-02-24 | `backend/scripts/run_unified_kb_eval150_chunks.sh` (`ROUTER_RAG_DATA_DIR=/tmp/router_rag_eval_stage/...` confirmed in run logs) | |
| | D-060 | Added Router RAG fingerprint modes (`strict`/`hybrid`/`metadata`) with timeout-safe hash fallback | 2026-02-24 | `backend/app/router_rag/index.py`, `backend/app/test_router_rag_module.py` (`47 passed`) | |
| | D-059 | Added safe `.env.codex` loading fallback and optional single-process shard execution mode | 2026-02-24 | `backend/scripts/run_unified_kb_eval150_chunks.sh` (`load_env_file_safe`, `SINGLE_PROCESS_SHARDS`) | |
| | D-058 | Re-ran unified eval150 in 10-question shards with semantic grading and published updated summary | 2026-02-24 | `docs/evals/shards10/unified_kb_eval150_shards10_summary.json` (`126/150`, `84.0%`, failed IDs include `2,3,39-58,116,118`) | |
| | D-055 | Fixed `CBA850` token-only routing from weak router-docs path to deterministic lifecycle output | 2026-02-20 | `backend/app/knowledgebase/core.py` (`_single_lifecycle_only_model_token`, auto-mode routing + router-docs bridge), `backend/app/test_unified_kb_core.py` | |
| | D-056 | Added regression tests for lifecycle-only single-token routing in `router_docs` and `auto` | 2026-02-20 | `cd backend && python3 -m pytest -q app/test_unified_kb_core.py` -> `70 passed` | |
| | D-057 | Full backend regression passed after CBA850 routing fix | 2026-02-20 | `cd backend && python3 -m pytest -q` -> `316 passed, 9 warnings` | |
| | D-054 | Committed and pushed deep-analysis hardening patch to GitHub and HF four-tab remote | 2026-02-20 | commit `f1e0811`; pushed `origin/main` and `hf-fourtab/main` | |
| | D-051 | Patched web fallback timeout budgeting to respect remaining request budget | 2026-02-20 | `backend/app/knowledgebase/core.py` (`_web_fallback` remaining budget guard + timeout cap) | |
| | D-052 | Hardened parallel index search against stale/shutdown shared executor and added recovery test | 2026-02-20 | `backend/app/knowledgebase/core.py`, `backend/app/test_unified_kb_core.py` | |
| | D-053 | Deep-analysis verification cycle passed after hardening patches | 2026-02-20 | `cd backend && python3 -m pytest -q` -> `314 passed, 9 warnings`; `cd backend && python3 -m pytest -q app/test_unified_kb_core.py` -> `68 passed` | |
| | D-050 | Finalized and pushed enhancement batch commit to both remotes | 2026-02-20 | commit `925b963`; pushed `origin/main` and `hf-fourtab/main` | |
| | D-044 | Implemented targeted fail-ID fixes for masters FAQ clarify over-trigger (`102`,`108`) and POTS `top-10` objection parsing (`63`) | 2026-02-20 | `backend/app/knowledgebase/core.py`; targeted reruns in `docs/evals/shards1_target_102_108/` and `docs/evals/shards1_target_75_id63/` | |
| | D-045 | Added stage-budget-exit telemetry and retrieval-mode tracking to eval payloads/summaries | 2026-02-20 | `backend/scripts/unified_kb_eval150.py` | |
| | D-046 | Added runner profile toggle + explicit commit-gate fields (`no_new_failed_ids`, `p95_non_regression`) and non-persistent FAQ churn policy by default | 2026-02-20 | `backend/scripts/run_unified_kb_eval150_chunks.sh` | |
| | D-047 | Added regression tests for FAQ medium-confidence bypass and hyphenated `top-10` objection handling | 2026-02-20 | `backend/app/test_unified_kb_core.py`, `backend/app/test_unified_kb_eval150_script.py` | |
| | D-048 | Full backend regression passed after enhancement batch | 2026-02-20 | `cd backend && python3 -m pytest -q` -> `312 passed, 9 warnings` | |
| | D-049 | Full OpenAI shard reruns completed (v3) | 2026-02-20 | `docs/evals/shards5_150_balanced_v3/unified_kb_eval150_shards10_summary.json` (`150/150`), `docs/evals/shards5_75_balanced_v3/unified_kb_eval150_shards10_summary.json` (`74/75`, fail `3`) | |
| | D-043 | Logged pre-commit low-risk enhancement shortlist and translated into active tasks | 2026-02-20 | `docs/dev/decisions.md`, `docs/dev/open_tasks.md` | |
| | D-037 | Implemented balanced-profile token/perf caps in router web fallback + POTS synthesis + semantic grader defaults | 2026-02-20 | `backend/app/router_rag/core.py`, `backend/app/pots_ai/core.py`, `backend/scripts/unified_kb_eval150.py`, `backend/scripts/run_unified_kb_eval150_chunks.sh` | |
| | D-038 | Applied OpenAI compatibility fix for POTS completions cap (`max_completion_tokens`) | 2026-02-20 | `backend/app/pots_ai/core.py` | |
| | D-039 | Clean 150-case rerun (balanced-v2) completed | 2026-02-20 | `docs/evals/shards5_150_balanced_v2/unified_kb_eval150_shards10_summary.json` (`148/150`, fails `102,108`) | |
| | D-040 | Clean 75-case rerun (balanced-v2) completed | 2026-02-20 | `docs/evals/shards5_75_balanced_v2/unified_kb_eval150_shards10_summary.json` (`74/75`, fails `63`) | |
| | D-041 | Full backend regression passed after balanced-v2 changes | 2026-02-20 | `cd backend && python3 -m pytest -q` -> `308 passed, 9 warnings` | |
| | D-042 | Before/after comparison package prepared for commit-gate decision | 2026-02-20 | `docs/dev/session_handoff.md`, `docs/dev/decisions.md` | |
| | D-019 | Full backend deep-dive regression run passed | 2026-02-20 | `python3 -m pytest -q` -> `299 passed` | |
| | D-020 | Patched shared bounded retrieval executor path in unified KB | 2026-02-20 | `backend/app/knowledgebase/core.py` | |
| | D-021 | Added runtime health assertion for parallel-search executor flags | 2026-02-20 | `backend/app/test_unified_kb_core.py` | |
| | D-022 | Shard runner now defaults trend and FAQ ongoing paths to `OUT_DIR` | 2026-02-20 | `backend/scripts/run_unified_kb_eval150_chunks.sh` | |
| | D-023 | Post-patch shard smoke run passed and wrote trend in smoke out-dir | 2026-02-20 | `docs/evals/shards10_deepdive_smoke/unified_kb_eval150_shards10_summary.json` | |
| | D-024 | 150-case semantic rerun (shard-5, 30s timeout) completed | 2026-02-20 | `docs/evals/shards5_150_rerun/unified_kb_eval150_shards10_summary.json` (`146/150`) | |
| | D-025 | 75-case MSRP/Verizon semantic rerun (shard-5, 30s timeout) completed | 2026-02-20 | `docs/evals/shards5_75_rerun/unified_kb_eval150_shards10_summary.json` (`74/75`) | |
| | D-026 | Post-rerun full backend regression passed | 2026-02-20 | `cd backend && python3 -m pytest -q` -> `299 passed` | |
| | D-027 | Device comparison table schema updated to user-locked format with hidden evidence column | 2026-02-20 | `backend/app/knowledgebase/core.py` | |
| | D-028 | Comparison schema regression tests added and passing | 2026-02-20 | `cd backend && python3 -m pytest -q app/test_unified_kb_core.py` -> `56 passed` | |
| | D-029 | Implemented full guarded 10-suggestion patch set (core + eval tooling + tests) | 2026-02-20 | `backend/app/knowledgebase/core.py`, `backend/scripts/unified_kb_eval150.py`, `backend/scripts/run_unified_kb_eval150_chunks.sh`, `backend/app/test_unified_kb_core.py` | |
| | D-030 | Full backend regression passed after patch set | 2026-02-20 | `cd backend && python3 -m pytest -q` -> `308 passed` | |
| | D-031 | 150-case shard-5 semantic rerun completed post-patch | 2026-02-20 | `docs/evals/shards10/unified_kb_eval150_shards10_summary.json` (`144/150`, fails `7,86,90,102,108,129`) | |
| | D-032 | 75-case MSRP/Verizon shard-5 semantic rerun completed post-patch | 2026-02-20 | `docs/evals/shards5_eval75/unified_kb_eval75_shards5_summary.json` (`74/75`, fails `3`) | |
| | D-033 | Current batch committed and pushed to GitHub + HF four-tab remote | 2026-02-20 | commit `9e5a3bd`; pushed `origin/main` and `hf-fourtab/main` | |
| | D-034 | OpenAI token-usage hotspot analysis completed (no-code step) | 2026-02-20 | Reviewed `backend/scripts/unified_kb_eval150.py`, `backend/app/pots_ai/core.py`, `backend/app/router_rag/core.py` | |
| | D-035 | Token-optimization actions ranked by difficulty/perf/token impact and rollout priority | 2026-02-20 | User-facing ranked matrix prepared; order captured in `docs/dev/decisions.md` | |
| | D-036 | Balanced profile recommendation published (performance vs quality) | 2026-02-20 | Decision logged in `docs/dev/decisions.md`; task added as `T-026` | |
|
|
| ## Standard Verification Commands |
| ```bash |
| # Full backend regression |
| (cd backend && python3 -m pytest -q) |
| |
| # Deep-dive runner smoke |
| (cd backend && \ |
| CHUNK_SIZE=1 START_ID=1 END_ID=1 OUT_DIR=../docs/evals/shards10_deepdive_smoke \ |
| SHARD_WORKERS=1 OPENAI_MODEL=gpt-5.2 ./scripts/run_unified_kb_eval150_chunks.sh) |
| ``` |
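When a shard run reports failed IDs, the per-case triage done with `jq` in D-121 can also be scripted. The following is a minimal sketch, not part of the repo: the helper name `failed_cases` is hypothetical, and it assumes a shard result file is a JSON object with a top-level `results` array whose entries carry `id`, `pass`, `final_score`, and `retrieval_mode`, as the D-121 output suggests.

```python
import json
from pathlib import Path


def failed_cases(result_path):
    """List (id, final_score, retrieval_mode) for every case with pass == false.

    Assumed schema (inferred from the D-121 jq output, not verified against
    the eval script): {"results": [{"id": ..., "pass": ..., ...}, ...]}
    """
    payload = json.loads(Path(result_path).read_text(encoding="utf-8"))
    return [
        (case.get("id"), case.get("final_score"), case.get("retrieval_mode"))
        for case in payload.get("results", [])
        # Only an explicit false counts as a failure; missing keys are skipped.
        if case.get("pass") is False
    ]
```

Calling `failed_cases("docs/evals/shards5_rapidrouter25/unified_kb_eval150_1_5.json")` from the repository root would, under the schema assumption above, return one tuple per failing case in that shard.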
|
|