rohitsar567 and Claude Opus 4.7 (1M context) committed
Commit dfaa4d6 · 1 Parent(s): 4bb66dd

feat(upload): wait-for-extraction flow + status endpoint + extended voice guard

User directives integrated:
1. "generate the card inline ONLY after full data extraction" → card
no longer renders on the partial heuristic record; we wait for
the LLM pass to complete first.
2. "ensure re-grading actually happens" → new
/api/upload/extraction-status/{policy_id} exposes real-time
in-memory state of the background LLM extraction.
3. "kill the unprompted 'Could you please upload' message" → voice
auto-submit now blocked for the ENTIRE extraction window
(extractionInFlight state), not just the 8s upload-status window.
4. "same level of usability as the 148 catalogued, no less" → no
user-upload__ branches in PolicyPremiumWidget /
PerPolicyPremiumEstimator — parity by construction once LLM
extraction lands and HealthPolicy fields populate.

BACKEND
─────────────────────────────────────────────────────────────────────
- _UPLOAD_EXTRACTION_STATUS dict (in-process) tracks per-upload
extraction lifecycle: pending → running → complete | failed.
- extract_one_for_upload writes status at every phase + the final
completeness_pct + overall_grade it computed.
- New GET /api/upload/extraction-status/{policy_id} returns the
ExtractionStatusResponse for the frontend's poll loop. Unknown
policy_id returns status="unknown" so the client can stop polling.
- Upload endpoint pre-stamps "pending" BEFORE firing
asyncio.create_task so a fast frontend poll never races.
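
A minimal sketch of the lifecycle described above, assuming a plain
in-process dict guarded by an asyncio lock. StatusStore, fake_extract,
upload and demo_lifecycle are hypothetical names used only for
illustration; the commit's real helpers are _set_extraction_status and
get_extraction_status in backend/uploaded_docs.py.

```python
import asyncio
from typing import Optional


class StatusStore:
    """In-process status map: pending -> running -> complete | failed."""

    def __init__(self) -> None:
        self._data: dict[str, dict] = {}
        self._lock = asyncio.Lock()

    async def set(self, policy_id: str, **fields) -> None:
        # Merge only the supplied fields so earlier ones survive.
        async with self._lock:
            cur = self._data.get(policy_id, {})
            cur.update(fields)
            cur["policy_id"] = policy_id
            self._data[policy_id] = cur

    def get(self, policy_id: str) -> Optional[dict]:
        return self._data.get(policy_id)


async def fake_extract(store: StatusStore, policy_id: str, ok: bool) -> bool:
    """Hypothetical stand-in for the background LLM pass."""
    await store.set(policy_id, status="running")
    await asyncio.sleep(0)  # placeholder for the real 30-60s of work
    if ok:
        await store.set(policy_id, status="complete", overall_grade="B")
    else:
        await store.set(policy_id, status="failed", error="simulated failure")
    return ok


async def upload(store: StatusStore, policy_id: str, ok: bool = True) -> asyncio.Task:
    # Pre-stamp BEFORE scheduling the task, so a poll that arrives
    # before the task first runs still finds a known state.
    await store.set(policy_id, status="pending")
    return asyncio.create_task(fake_extract(store, policy_id, ok))


async def demo_lifecycle() -> list:
    store = StatusStore()
    seen = []
    good = await upload(store, "user-upload__good")
    seen.append(store.get("user-upload__good")["status"])  # task has not run yet
    await good
    seen.append(store.get("user-upload__good")["status"])
    bad = await upload(store, "user-upload__bad", ok=False)
    await bad
    seen.append(store.get("user-upload__bad")["status"])
    seen.append(store.get("never-uploaded"))  # no entry at all
    return seen


if __name__ == "__main__":
    print(asyncio.run(demo_lifecycle()))
```

Because the store lives in one process, a restart or a second worker
would see no entry for the id; that is the case the endpoint reports as
status="unknown".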

FRONTEND
─────────────────────────────────────────────────────────────────────
- New extractionInFlight state — true from upload start until
status="complete"|"failed" (or 120s hard timeout).
- Voice guard now: `if (uploadStatus || extractionInFlight) return`
(was: uploadStatus only — cleared in 8s while extraction kept
running 30-60s more, letting ambient sound fire an unprompted
"please upload" chat turn).
- handleFile restructured into 6 phases:
1. POST /api/upload-policy
2. push ack "Got it — I've received X. Give me a moment to read
through it fully (~30-60s) and I'll bring back a complete
picture" (NO citations → NO card render yet)
3. push choice prompt (finish profile / dive into PDF)
4. poll /api/upload/extraction-status/{id} every 3s for up to 120s
5. on status="complete" → push assistant msg WITH citations →
card renders inline at THIS point with full LLM-extracted data
6. on failed / timeout → push fallback message (no broken-state
card; user can still chat about the PDF since text is indexed)
- 3 new i18n keys (en + hi): upload.chat_ack_reading,
upload.chat_card_ready, upload.chat_extraction_failed.
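
The bounded poll in phase 4 can be sketched as follows. The real loop
lives in the TypeScript handleFile; this Python version only mirrors its
control flow (fixed interval, capped tries, tolerant of transient fetch
errors). poll_until_terminal, fetch_status and demo_poll are hypothetical
names, and the fetcher is injected so the sketch runs without a server.

```python
import asyncio
from typing import Awaitable, Callable, Optional

TERMINAL = {"complete", "failed"}


async def poll_until_terminal(
    fetch_status: Callable[[], Awaitable[dict]],
    interval_s: float = 3.0,
    max_tries: int = 40,  # 40 x 3s = 120s hard timeout
) -> Optional[dict]:
    """Return the first terminal status payload, or None on timeout."""
    for _ in range(max_tries):
        try:
            state = await fetch_status()
            if state.get("status") in TERMINAL:
                return state
        except Exception:
            # Transient fetch error: tolerate it and keep polling.
            pass
        await asyncio.sleep(interval_s)
    return None


async def demo_poll() -> tuple:
    replies = iter([
        {"status": "pending"},
        {"status": "running"},
        {"status": "complete", "overall_grade": "B"},
    ])

    async def scripted_fetch() -> dict:
        return next(replies)

    async def always_running() -> dict:
        return {"status": "running"}

    landed = await poll_until_terminal(scripted_fetch, interval_s=0)
    timed_out = await poll_until_terminal(always_running, interval_s=0, max_tries=3)
    return landed, timed_out


if __name__ == "__main__":
    print(asyncio.run(demo_poll()))
```

A caller would push the card-bearing message only when the returned
payload has status "complete", and the fallback message when it is
"failed" or None.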

VERIFY
─────────────────────────────────────────────────────────────────────
- py_compile clean
- npx tsc --noEmit clean
- Live audit deferred to deploy commit

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

backend/main.py CHANGED
@@ -1978,8 +1978,9 @@ async def upload_policy(
1978
 
1979
  # ── Fire LLM-assisted extraction in background (ADR-044) ─────────
1980
  # Same extractor as the catalogued 148. Runs ~30-60s; the upload
1981
- # HTTP response returns now and the frontend polls the scorecard
1982
- # endpoint to refresh the card in place when extraction lands.
 
1983
  # Fail-silent: a failed LLM pass leaves the heuristic record
1984
  # intact, so the card still has SOMETHING to show — never blocks
1985
  # the user. NEVER blocks this request.
@@ -1992,6 +1993,20 @@ async def upload_policy(
1992
  if _resolved_insurer_slug != _udocs.UPLOAD_INSURER_SLUG
1993
  else _udocs.UPLOAD_INSURER_NAME,
1994
  )
1995
  asyncio.create_task(
1996
  _udocs.extract_one_for_upload(
1997
  policy_id=policy_id,
@@ -2028,6 +2043,58 @@ async def upload_policy(
2028
  )
2029
 
2030
 
2031
  class ScorecardSubScore(BaseModel):
2032
  name: str
2033
  score: int
 
1978
 
1979
  # ── Fire LLM-assisted extraction in background (ADR-044) ─────────
1980
  # Same extractor as the catalogued 148. Runs ~30-60s; the upload
1981
+ # HTTP response returns now and the frontend polls
1982
+ # /api/upload/extraction-status/{policy_id} (see below) to know
1983
+ # when the card-bearing chat message should be pushed.
1984
  # Fail-silent: a failed LLM pass leaves the heuristic record
1985
  # intact, so the card still has SOMETHING to show — never blocks
1986
  # the user. NEVER blocks this request.
 
1993
  if _resolved_insurer_slug != _udocs.UPLOAD_INSURER_SLUG
1994
  else _udocs.UPLOAD_INSURER_NAME,
1995
  )
1996
+ # Pre-stamp "pending" so a frontend poll that arrives BEFORE
1997
+ # extract_one_for_upload's first await still sees a known
1998
+ # state instead of status="unknown".
1999
+ await _udocs._set_extraction_status(
2000
+ policy_id,
2001
+ status="pending",
2002
+ policy_name=policy_name,
2003
+ insurer_slug=_resolved_insurer_slug,
2004
+ started_at=None,
2005
+ completed_at=None,
2006
+ completeness_pct=None,
2007
+ overall_grade=None,
2008
+ error=None,
2009
+ )
2010
  asyncio.create_task(
2011
  _udocs.extract_one_for_upload(
2012
  policy_id=policy_id,
 
2043
  )
2044
 
2045
 
2046
+ # ---------------------------------------------------------------------------
2047
+ # GET /api/upload/extraction-status/{policy_id} — frontend poll target
2048
+ # (ADR-044, 2026-05-27).
2049
+ #
2050
+ # After the upload endpoint returns, the chat flow needs to know when
2051
+ # the background LLM extraction completes so it can push the card-bearing
2052
+ # assistant message into chat with the FULL data (not the heuristic
2053
+ # stub). This endpoint exposes _UPLOAD_EXTRACTION_STATUS so the
2054
+ # frontend can poll every ~3s for up to ~120s.
2055
+ # ---------------------------------------------------------------------------
2056
+
2057
+
2058
+ class ExtractionStatusResponse(BaseModel):
2059
+ policy_id: str
2060
+ status: str # "pending" | "running" | "complete" | "failed" | "unknown"
2061
+ policy_name: Optional[str] = None
2062
+ insurer_slug: Optional[str] = None
2063
+ started_at: Optional[str] = None
2064
+ completed_at: Optional[str] = None
2065
+ completeness_pct: Optional[float] = None
2066
+ overall_grade: Optional[str] = None
2067
+ error: Optional[str] = None
2068
+
2069
+
2070
+ @app.get(
2071
+ "/api/upload/extraction-status/{policy_id}",
2072
+ response_model=ExtractionStatusResponse,
2073
+ )
2074
+ async def upload_extraction_status(policy_id: str):
2075
+ """Return the live status of a per-upload LLM-assisted extraction.
2076
+
2077
+ Returns `status="unknown"` for an unrecognised policy_id (e.g. the
2078
+ frontend polled a stale id or a policy that was uploaded on a prior
2079
+ container) so the client can stop polling without ambiguity.
2080
+ """
2081
+ from backend import uploaded_docs as _udocs
2082
+ state = _udocs.get_extraction_status(policy_id)
2083
+ if not state:
2084
+ return ExtractionStatusResponse(policy_id=policy_id, status="unknown")
2085
+ return ExtractionStatusResponse(
2086
+ policy_id=policy_id,
2087
+ status=state.get("status", "unknown"),
2088
+ policy_name=state.get("policy_name"),
2089
+ insurer_slug=state.get("insurer_slug"),
2090
+ started_at=state.get("started_at"),
2091
+ completed_at=state.get("completed_at"),
2092
+ completeness_pct=state.get("completeness_pct"),
2093
+ overall_grade=state.get("overall_grade"),
2094
+ error=state.get("error"),
2095
+ )
2096
+
2097
+
2098
  class ScorecardSubScore(BaseModel):
2099
  name: str
2100
  score: int
backend/uploaded_docs.py CHANGED
@@ -728,14 +728,47 @@ async def reingest_persisted_into_policies() -> dict:
728
  # 74% median for catalogued. After this change uploaded cards land in
729
  # the same completeness band by construction.
730
  #
731
- # Runs as a background asyncio task fired from the upload endpoint
732
- # the upload's HTTP response returns immediately with the heuristic
733
- # record (sub-second), and the LLM pass (~30-60s) lands in the
734
- # background. The frontend polls /api/policies/{id}/scorecard after
735
- # the upload and refreshes the card in place when completeness jumps.
 
736
  # ---------------------------------------------------------------------------
737
 
738
 
739
  async def extract_one_for_upload(
740
  policy_id: str,
741
  pdf_path: Path,
@@ -748,10 +781,25 @@ async def extract_one_for_upload(
748
  invalidates the marketplace grade cache so the next /api/policies/all
749
  + /api/policies/{id}/scorecard call returns the LLM-graded card.
750
 
751
  Returns True iff a HealthPolicy was successfully extracted and written.
752
  Swallows all errors (returns False) — a failed LLM pass must NEVER
753
  affect the upload's HTTP response, which has already returned.
754
  """
755
  try:
756
  # Lazy imports — these touch the LLM client + DuckDB; we don't
757
  # want to pay that cost at module import time.
@@ -780,6 +828,11 @@ async def extract_one_for_upload(
780
  "[upload-extract] read_full_text failed %s: %s: %s",
781
  policy_id, type(e).__name__, e,
782
  )
783
  return False
784
 
785
  prompt = build_extract_prompt(text, schema_excerpt(), policy_id)
@@ -828,6 +881,11 @@ async def extract_one_for_upload(
828
  "[upload-extract] no policy extracted for %s after retries; "
829
  "card stays on heuristic record", policy_id,
830
  )
831
  return False
832
 
833
  # Write rag/extracted/<policy_id>.json — same shape as catalogued.
@@ -866,8 +924,38 @@ async def extract_one_for_upload(
866
  "[upload-extract] OK %s (extraction_confidence_pct=%s)",
867
  policy_id, getattr(policy, "extraction_confidence_pct", "n/a"),
868
  )
869
  return True
870
  except Exception as e: # noqa: BLE001 — top-level catch-all
871
  _log.warning(
872
  "[upload-extract] unexpected failure for %s: %s: %s",
873
  policy_id, type(e).__name__, str(e)[:400],
 
728
  # 74% median for catalogued. After this change uploaded cards land in
729
  # the same completeness band by construction.
730
  #
731
+ # Runs as a background asyncio task fired from the upload endpoint.
732
+ # The upload's HTTP response returns immediately (sub-second) with the
733
+ # heuristic record; the LLM pass (~30-60s) lands in the background.
734
+ # A new GET /api/upload/extraction-status/{policy_id} endpoint exposes
735
+ # in-flight state to the frontend so the chat flow can wait for
736
+ # extraction → THEN render the card with full data (no partial render).
737
  # ---------------------------------------------------------------------------
738
 
739
 
740
+ # In-memory status dict — one entry per uploaded policy_id.
741
+ # Shape:
742
+ # {
743
+ # "status": "pending" | "running" | "complete" | "failed",
744
+ # "policy_id": str,
745
+ # "policy_name": str,
746
+ # "insurer_slug": str,
747
+ # "started_at": ISO-8601 UTC,
748
+ # "completed_at": ISO-8601 UTC | None,
749
+ # "completeness_pct": float | None, # populated on complete
750
+ # "overall_grade": str | None,
751
+ # "error": str | None,
752
+ # }
753
+ # Survives only the live process — fine for the UX use case (the
754
+ # frontend polls within ~120s of upload).
755
+ _UPLOAD_EXTRACTION_STATUS: dict[str, dict] = {}
756
+ _UPLOAD_EXTRACTION_LOCK = asyncio.Lock()
757
+
758
+
759
+ async def _set_extraction_status(policy_id: str, **fields) -> None:
760
+ async with _UPLOAD_EXTRACTION_LOCK:
761
+ cur = _UPLOAD_EXTRACTION_STATUS.get(policy_id, {})
762
+ cur.update(fields)
763
+ cur["policy_id"] = policy_id
764
+ _UPLOAD_EXTRACTION_STATUS[policy_id] = cur
765
+
766
+
767
+ def get_extraction_status(policy_id: str) -> Optional[dict]:
768
+ """Public read accessor used by the /api/upload/extraction-status endpoint."""
769
+ return _UPLOAD_EXTRACTION_STATUS.get(policy_id)
770
+
771
+
772
  async def extract_one_for_upload(
773
  policy_id: str,
774
  pdf_path: Path,
 
781
  invalidates the marketplace grade cache so the next /api/policies/all
782
  + /api/policies/{id}/scorecard call returns the LLM-graded card.
783
 
784
+ Status is mirrored to `_UPLOAD_EXTRACTION_STATUS[policy_id]` at every
785
+ phase change so the frontend's poll loop sees progress in real time.
786
+
787
  Returns True iff a HealthPolicy was successfully extracted and written.
788
  Swallows all errors (returns False) — a failed LLM pass must NEVER
789
  affect the upload's HTTP response, which has already returned.
790
  """
791
+ _now = lambda: time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime())
792
+ await _set_extraction_status(
793
+ policy_id,
794
+ status="running",
795
+ policy_name=policy_name,
796
+ insurer_slug=insurer_slug,
797
+ started_at=_now(),
798
+ completed_at=None,
799
+ completeness_pct=None,
800
+ overall_grade=None,
801
+ error=None,
802
+ )
803
  try:
804
  # Lazy imports — these touch the LLM client + DuckDB; we don't
805
  # want to pay that cost at module import time.
 
828
  "[upload-extract] read_full_text failed %s: %s: %s",
829
  policy_id, type(e).__name__, e,
830
  )
831
+ await _set_extraction_status(
832
+ policy_id, status="failed",
833
+ completed_at=_now(),
834
+ error=f"read_full_text: {type(e).__name__}: {str(e)[:160]}",
835
+ )
836
  return False
837
 
838
  prompt = build_extract_prompt(text, schema_excerpt(), policy_id)
 
881
  "[upload-extract] no policy extracted for %s after retries; "
882
  "card stays on heuristic record", policy_id,
883
  )
884
+ await _set_extraction_status(
885
+ policy_id, status="failed",
886
+ completed_at=_now(),
887
+ error="LLM returned no valid HealthPolicy after primary + fallback retries",
888
+ )
889
  return False
890
 
891
  # Write rag/extracted/<policy_id>.json — same shape as catalogued.
 
924
  "[upload-extract] OK %s (extraction_confidence_pct=%s)",
925
  policy_id, getattr(policy, "extraction_confidence_pct", "n/a"),
926
  )
927
+
928
+ # Resolve the freshly-graded card so the status can report
929
+ # the actual completeness + grade the chat card will show.
930
+ _final_completeness = None
931
+ _final_grade = None
932
+ try:
933
+ import backend.main as _bm2
934
+ from backend.scorecard import build_scorecard as _bs
935
+ _doc = policy.model_dump()
936
+ _sc = _bs(_doc, profile=None)
937
+ if _sc is not None:
938
+ _final_completeness = float(_sc.data_completeness_pct)
939
+ _final_grade = _sc.overall_grade
940
+ except Exception: # noqa: BLE001
941
+ pass
942
+
943
+ await _set_extraction_status(
944
+ policy_id, status="complete",
945
+ completed_at=_now(),
946
+ completeness_pct=_final_completeness,
947
+ overall_grade=_final_grade,
948
+ )
949
  return True
950
  except Exception as e: # noqa: BLE001 — top-level catch-all
951
+ try:
952
+ await _set_extraction_status(
953
+ policy_id, status="failed",
954
+ completed_at=_now(),
955
+ error=f"{type(e).__name__}: {str(e)[:200]}",
956
+ )
957
+ except Exception:
958
+ pass
959
  _log.warning(
960
  "[upload-extract] unexpected failure for %s: %s: %s",
961
  policy_id, type(e).__name__, str(e)[:400],
frontend/src/app/page.tsx CHANGED
@@ -196,6 +196,12 @@ export default function Page() {
196
  }
197
  }, [messages]);
198
  const [uploadStatus, setUploadStatus] = useState<string | null>(null);
199
  // KI-027 (2026-05-14) — voice UX simplification. The legacy `handsFree`
200
  // mode (its own VAD auto-cutoff + post-turn mic re-open loop) has been
201
  // removed. We now have exactly two voice paths, mutually exclusive:
@@ -971,14 +977,14 @@ export default function Page() {
971
  voiceSubmitRef.current = (text: string) => {
972
  const t = text.trim();
973
  if (t.length < 2) return;
974
- // Suppress voice auto-submit while a PDF upload is in flight or
975
- // just-completed (uploadStatus is non-null for ~8s after success
976
- // / failure). A long upload + active mic + bot's TTS playing
977
- // through speakers can otherwise auto-transcribe ambient sound
978
- // and fire an "unprompted analysis" chat turn that drowns the
979
- // upload-flow's choice prompt. Real user input still goes
980
- // through the typed-input path / explicit Push-to-talk press.
981
- if (uploadStatus) return;
982
  // V4 FIX 2 — dedup repeated finals within 500ms.
983
  const { text: prevText, at: prevAt } = lastFinalTextRef.current;
984
  const now = Date.now();
@@ -994,7 +1000,7 @@ export default function Page() {
994
  // send() reads `messages` / `sessionId` / `ttsLang` / view flags via
995
  // closure; rebind whenever they change so the latest values are used.
996
  // eslint-disable-next-line react-hooks/exhaustive-deps
997
- }, [messages, sessionId, ttsLang, openPolicy, showMarketplace, showProfile, showPremium, uploadStatus]);
998
 
999
  async function startRecording() {
1000
  // KI-222 FIX 1 — silence any prior bot TTS BEFORE PTT recording starts.
@@ -1448,57 +1454,106 @@ export default function Page() {
1448
  async function handleFile(ev: React.ChangeEvent<HTMLInputElement>) {
1449
  const f = ev.target.files?.[0];
1450
  if (!f) return;
1451
- // Earlier iteration also pushUser'd a "📎 Uploaded: <name>" breadcrumb
1452
- // into the transcript. That leaked into chat_history so a subsequent
1453
- // voice auto-fire (mic catching ambient sound during the long index
1454
- // wait) could trigger the brain to "analyse" the upload unprompted.
1455
- // Removed: the card rendered below the ack message is itself the
1456
- // user-visible breadcrumb that the upload happened. uploadStatus
1457
- // gates the voice-submit path during the indexing window.
1458
  setUploadStatus(t("upload.indexing", { name: f.name }));
 
1459
  try {
1460
  // Pass the live chat session so the backend scopes the uploaded doc
1461
  // to this user — the assistant can then answer questions about it
1462
  // for the rest of THIS conversation.
1463
  const r = await uploadPolicy(f, sessionId);
1464
  setUploadStatus(t("upload.success", { name: r.policy_name }));
1465
- // ── In-chat acknowledgment + inline scorecard card ──────────────
1466
- // Push two assistant messages:
1467
- // 1. The "got it, here's the card" ack with a `citations` array
1468
- // carrying the uploaded policy_id. The existing chat renderer
1469
- // reads citations from an assistant message and fires
1470
- // `getScorecard(policy_id, session_id)` per cited policy, so
1471
- // the scorecard card appears inline under the bubble — same
1472
- // treatment as a recommendation card.
1473
- // 2. The proceed-choice prompt — telling the user they can
1474
- // finish their profile OR dive into the PDF, and noting that
1475
- // a fuller profile makes the policy discussion more useful.
1476
- const ackText = t("upload.chat_ack", { name: r.policy_name });
1477
- pushAssistant(ackText, {
1478
- citations: [
1479
- {
1480
- policy_id: r.policy_id,
1481
- policy_name: r.policy_name,
1482
- insurer_slug: "user-upload",
1483
- page_start: 1,
1484
- page_end: r.pages_indexed,
1485
- source_url: "",
1486
- score: 1.0,
1487
- },
1488
- ],
1489
- });
1490
  pushAssistant(t("upload.chat_choice"));
1491
  // Refresh coverage so the uploaded doc shows up
1492
  getCoverage().then(setCoverage).catch(() => {});
1493
  } catch (e: unknown) {
1494
  const errMsg = e instanceof Error ? e.message : String(e);
1495
  setUploadStatus(t("upload.error", { err: errMsg }));
1496
- // Surface the failure in chat too — a transient banner alone is
1497
- // easy to miss (originally reported as "no acknowledgment").
1498
  pushAssistant(t("upload.error", { err: errMsg }));
1499
  } finally {
1500
  if (fileInputRef.current) fileInputRef.current.value = "";
1501
  setTimeout(() => setUploadStatus(null), 8000);
 
1502
  }
1503
  }
1504
 
 
196
  }
197
  }, [messages]);
198
  const [uploadStatus, setUploadStatus] = useState<string | null>(null);
199
+ // ADR-044 (2026-05-27) — extractionInFlight stays true from the moment
200
+ // a PDF starts uploading until the background LLM extraction either
201
+ // completes or hits its hard timeout. Voice auto-submit is gated on
202
+ // this so ambient noise / TTS playback during the 30-60s extraction
203
+ // window can no longer fire an unprompted chat turn.
204
+ const [extractionInFlight, setExtractionInFlight] = useState<boolean>(false);
205
  // KI-027 (2026-05-14) — voice UX simplification. The legacy `handsFree`
206
  // mode (its own VAD auto-cutoff + post-turn mic re-open loop) has been
207
  // removed. We now have exactly two voice paths, mutually exclusive:
 
977
  voiceSubmitRef.current = (text: string) => {
978
  const t = text.trim();
979
  if (t.length < 2) return;
980
+ // Suppress voice auto-submit while a PDF upload is in flight OR
981
+ // while the background LLM extraction is still running (ADR-044).
982
+ // A long upload + active mic + bot's TTS playing through speakers
983
+ // can otherwise auto-transcribe ambient sound and fire an
984
+ // "unprompted analysis" chat turn that drowns the upload-flow's
985
+ // choice prompt. Real user input still goes through the
986
+ // typed-input path / explicit Push-to-talk press.
987
+ if (uploadStatus || extractionInFlight) return;
988
  // V4 FIX 2 — dedup repeated finals within 500ms.
989
  const { text: prevText, at: prevAt } = lastFinalTextRef.current;
990
  const now = Date.now();
 
1000
  // send() reads `messages` / `sessionId` / `ttsLang` / view flags via
1001
  // closure; rebind whenever they change so the latest values are used.
1002
  // eslint-disable-next-line react-hooks/exhaustive-deps
1003
+ }, [messages, sessionId, ttsLang, openPolicy, showMarketplace, showProfile, showPremium, uploadStatus, extractionInFlight]);
1004
 
1005
  async function startRecording() {
1006
  // KI-222 FIX 1 — silence any prior bot TTS BEFORE PTT recording starts.
 
1454
  async function handleFile(ev: React.ChangeEvent<HTMLInputElement>) {
1455
  const f = ev.target.files?.[0];
1456
  if (!f) return;
1457
+ // ADR-044 (2026-05-27). New staged upload flow:
1458
+ // 1. POST /api/upload-policy: indexes + persists + kicks off the
1459
+ // background LLM extraction.
1460
+ // 2. push assistant ack (NO card yet: we don't render the card
1461
+ // on the partial heuristic record).
1462
+ // 3. push choice prompt (finish profile / dive into PDF).
1463
+ // 4. poll /api/upload/extraction-status/{id} every 3s for up to
1464
+ // 120s. While polling, `extractionInFlight=true` blocks voice
1465
+ // auto-submit so ambient sound during the wait can't trigger
1466
+ // an "unprompted please-upload" chat turn.
1467
+ // 5. when status === "complete", push a NEW assistant message
1468
+ // with the citations → card renders inline at that point
1469
+ // with FULL data (catalogued-grade depth).
1470
+ // 6. on "failed" / timeout, push a fallback message and render
1471
+ // no card; the user can still ask about the indexed PDF text.
1472
  setUploadStatus(t("upload.indexing", { name: f.name }));
1473
+ setExtractionInFlight(true);
1474
  try {
1475
  // Pass the live chat session so the backend scopes the uploaded doc
1476
  // to this user — the assistant can then answer questions about it
1477
  // for the rest of THIS conversation.
1478
  const r = await uploadPolicy(f, sessionId);
1479
  setUploadStatus(t("upload.success", { name: r.policy_name }));
1480
+ // Step 2 ack (NO citations yet → no card rendered)
1481
+ pushAssistant(t("upload.chat_ack_reading", { name: r.policy_name }));
1482
+ // Step 3 choice prompt
1483
  pushAssistant(t("upload.chat_choice"));
1484
  // Refresh coverage so the uploaded doc shows up
1485
  getCoverage().then(setCoverage).catch(() => {});
1486
+
1487
+ // Step 4 — poll extraction status
1488
+ const POLL_INTERVAL_MS = 3000;
1489
+ const MAX_TRIES = 40; // 40 × 3s = 120s
1490
+ let landed = false;
1491
+ let finalCompleteness: number | null = null;
1492
+ let finalGrade: string | null = null;
1493
+ let finalInsurerSlug: string = r.policy_id.startsWith("user-upload__") ? "user-upload" : "";
1494
+ for (let i = 0; i < MAX_TRIES; i++) {
1495
+ try {
1496
+ const resp = await fetch(
1497
+ `${BACKEND_URL}/api/upload/extraction-status/${encodeURIComponent(r.policy_id)}`,
1498
+ );
1499
+ if (resp.ok) {
1500
+ const s = await resp.json();
1501
+ if (s.status === "complete") {
1502
+ landed = true;
1503
+ finalCompleteness = s.completeness_pct ?? null;
1504
+ finalGrade = s.overall_grade ?? null;
1505
+ finalInsurerSlug = s.insurer_slug || finalInsurerSlug;
1506
+ break;
1507
+ }
1508
+ if (s.status === "failed") {
1509
+ break;
1510
+ }
1511
+ // pending / running / unknown — keep polling
1512
+ if (s.insurer_slug) finalInsurerSlug = s.insurer_slug;
1513
+ }
1514
+ } catch (_) {
1515
+ // tolerant of transient fetch errors; keep polling
1516
+ }
1517
+ await new Promise((res) => setTimeout(res, POLL_INTERVAL_MS));
1518
+ }
1519
+
1520
+ // Step 5 — push the card-bearing assistant message
1521
+ if (landed) {
1522
+ pushAssistant(
1523
+ t("upload.chat_card_ready", { name: r.policy_name }),
1524
+ {
1525
+ citations: [
1526
+ {
1527
+ policy_id: r.policy_id,
1528
+ policy_name: r.policy_name,
1529
+ insurer_slug: finalInsurerSlug || "user-upload",
1530
+ page_start: 1,
1531
+ page_end: r.pages_indexed,
1532
+ source_url: "",
1533
+ score: 1.0,
1534
+ },
1535
+ ],
1536
+ },
1537
+ );
1538
+ } else {
1539
+ // Step 6 — fallback. We DON'T render a card on the heuristic
1540
+ // stub (per user directive — "lets generate the card inline
1541
+ // ONLY after full data extraction"), so on timeout / failure
1542
+ // just tell the user the deep-analysis didn't complete and
1543
+ // they can ask questions about the PDF directly.
1544
+ pushAssistant(
1545
+ t("upload.chat_extraction_failed", { name: r.policy_name }),
1546
+ );
1547
+ }
1548
  } catch (e: unknown) {
1549
  const errMsg = e instanceof Error ? e.message : String(e);
1550
  setUploadStatus(t("upload.error", { err: errMsg }));
1551
+ // Surface the failure in chat too.
 
1552
  pushAssistant(t("upload.error", { err: errMsg }));
1553
  } finally {
1554
  if (fileInputRef.current) fileInputRef.current.value = "";
1555
  setTimeout(() => setUploadStatus(null), 8000);
1556
+ setExtractionInFlight(false);
1557
  }
1558
  }
1559
 
frontend/src/lib/i18n.ts CHANGED
@@ -43,6 +43,9 @@ export const UI_STRINGS = {
43
  "upload.error": "✗ Upload failed: ${err}",
44
  "upload.user_msg": "📎 Uploaded: ${name}",
45
  "upload.chat_ack": "Got it — I've read **${name}**. Here's how it grades against what we know about you so far:",
46
  "upload.chat_choice": "How would you like to proceed?\n\n• **Tell me more about yourself** — finish the short profile (age, family, location, budget, health) so I can speak to this policy more personally.\n• **Dive into the PDF first** — ask questions about coverage, waiting periods, exclusions, anything in the document.\n\nEither works. The more I know about you, the more useful the discussion of this policy will be.",
47
 
48
  // Marketplace panel
@@ -164,6 +167,9 @@ export const UI_STRINGS = {
164
  "upload.error": "✗ Upload विफल: ${err}",
165
  "upload.user_msg": "📎 Upload किया: ${name}",
166
  "upload.chat_ack": "मिल गया — **${name}** पढ़ ली। यह आपके profile के हिसाब से कैसी है:",
167
  "upload.chat_choice": "आगे कैसे बढ़ें?\n\n• **अपने बारे में बताएं** — short profile पूरा करें (उम्र, परिवार, location, बजट, health) ताकि मैं इस policy पर आपको personally बात कर सकूं।\n• **पहले PDF पर बात करें** — coverage, waiting periods, exclusions — कुछ भी पूछें।\n\nदोनों ठीक हैं। जितना मैं आपके बारे में जानूंगा, इस policy की चर्चा उतनी useful होगी।",
168
 
169
  "mp.heading": "स्वास्थ्य बीमा बाज़ार",
 
43
  "upload.error": "✗ Upload failed: ${err}",
44
  "upload.user_msg": "📎 Uploaded: ${name}",
45
  "upload.chat_ack": "Got it — I've read **${name}**. Here's how it grades against what we know about you so far:",
46
+ "upload.chat_ack_reading": "Got it — I've received **${name}**. Give me a moment to read it through fully (about 30–60 seconds) and I'll bring back a complete picture for you.",
47
+ "upload.chat_card_ready": "Here's the full picture of **${name}** — graded against what we know about you so far:",
48
+ "upload.chat_extraction_failed": "I couldn't pull a full analysis from this PDF this time. You can still ask me about anything inside the document — I have the full text indexed and can quote the exact wording.",
49
  "upload.chat_choice": "How would you like to proceed?\n\n• **Tell me more about yourself** — finish the short profile (age, family, location, budget, health) so I can speak to this policy more personally.\n• **Dive into the PDF first** — ask questions about coverage, waiting periods, exclusions, anything in the document.\n\nEither works. The more I know about you, the more useful the discussion of this policy will be.",
50
 
51
  // Marketplace panel
 
167
  "upload.error": "✗ Upload विफल: ${err}",
168
  "upload.user_msg": "📎 Upload किया: ${name}",
169
  "upload.chat_ack": "मिल गया — **${name}** पढ़ ली। यह आपके profile के हिसाब से कैसी है:",
170
+ "upload.chat_ack_reading": "मिल गया — **${name}** मिल गई। थोड़ी देर दें (~30-60 सेकंड), पूरी तरह पढ़कर पूरा analysis लाता हूँ।",
171
+ "upload.chat_card_ready": "**${name}** का पूरा analysis — आपके profile के हिसाब से:",
172
+ "upload.chat_extraction_failed": "इस PDF का पूरा analysis इस बार नहीं निकाल पाया। फिर भी आप document के बारे में कुछ भी पूछ सकते हैं — पूरा text indexed है, exact wording quote कर सकता हूँ।",
173
  "upload.chat_choice": "आगे कैसे बढ़ें?\n\n• **अपने बारे में बताएं** — short profile पूरा करें (उम्र, परिवार, location, बजट, health) ताकि मैं इस policy पर आपको personally बात कर सकूं।\n• **पहले PDF पर बात करें** — coverage, waiting periods, exclusions — कुछ भी पूछें।\n\nदोनों ठीक हैं। जितना मैं आपके बारे में जानूंगा, इस policy की चर्चा उतनी useful होगी।",
174
 
175
  "mp.heading": "स्वास्थ्य बीमा बाज़ार",