Spaces: ayushKishor / plutoV2_miniProject_3rd-yr


ayushKishor committed on May 1

Commit

23cdeed

1 Parent(s): 691e458

Add Pluto memory layer and pipeline fixes

Files changed (49)

.dockerignore +16 -0
.gitignore +46 -0
README.md +177 -6
app.py +0 -0
mp1/.env +0 -15
mp1/.env.example +2 -0
mp1/benchmark/compare.py +7 -7
mp1/corpus/.doc_index.json +0 -0
mp1/corpus/.extraction_cache.json +0 -0
mp1/frontend/app.js +36 -12
mp1/frontend/index.html +4 -4
mp1/main.py +2 -1
mp1/nvidia_models.json +0 -1133
mp1/pluto/__init__.py +1 -0
mp1/pluto/bus.py +1 -0
mp1/pluto/chunker.py +1 -0
mp1/pluto/db.py +66 -0
mp1/pluto/dispatcher.py +2 -1
mp1/pluto/doc_index.py +1 -0
mp1/pluto/doc_summary.py +173 -0
mp1/pluto/embedder.py +1 -0
mp1/pluto/extraction_cache.py +1 -0
mp1/pluto/ingest.py +23 -8
mp1/pluto/models.py +7 -131
mp1/pluto/modes.py +75 -3
mp1/pluto/pipeline.py +50 -19
mp1/pluto/server.py +163 -29
mp1/pluto/session_memory.py +230 -0
mp1/pluto/signal_logger.py +93 -0
mp1/pluto/stages/__init__.py +2 -1
mp1/pluto/stages/{verify.py → evidence_check.py} +48 -67
mp1/pluto/stages/extract.py +1 -0
mp1/pluto/stages/merge.py +7 -114
mp1/pluto/stages/route.py +1 -0
mp1/pluto/stages/understand.py +1 -0
mp1/pluto/tools.py +3 -0
mp1/pluto/tracer.py +1 -0
mp1/pluto/utils.py +1 -96
mp1/requirements.txt +2 -1
mp1/scripts/generate_app_summary_pdf.py +367 -0
mp1/test_doc_summary.py +75 -0
mp1/test_merge.py +1 -115
mp1/test_schema.py +0 -41
mp1/test_server.py +42 -18
mp1/test_session_memory.py +112 -0
mp1/test_signal_logger.py +56 -0
mp1/test_verify.py +27 -52
pytest.ini +7 -0
requirements.txt +2 -1

.dockerignore ADDED

	@@ -0,0 +1,16 @@

+__pycache__/
+*.pyc
+*.pyo
+*.pyd
+.venv/
+env/
+.env
+.git/
+.gitignore
+.pytest_cache/
+debug.txt
+output_log.txt
+verify_dump.txt
+mp1/debug.txt
+mp1/output_log.txt
+mp1/verify_dump.txt

.gitignore ADDED

	@@ -0,0 +1,46 @@

+# Secrets and local environments
+.env
+.env.*
+!.env.example
+!.env.sample
+.venv/
+venv/
+env/
+ENV/
+# Python bytecode and tool caches
+__pycache__/
+*.py[cod]
+*.pyo
+.pytest_cache/
+.mypy_cache/
+.ruff_cache/
+.coverage
+htmlcov/
+# Editor and OS noise
+.DS_Store
+Thumbs.db
+.idea/
+.vscode/
+# Runtime logs and debug dumps
+*.log
+*.out
+*.err
+debug.txt
+output_log.txt
+verify_dump.txt
+mp1/server_log*.txt
+mp1/server_ui_*.log
+mp1/test_out.json
+mp1/test_out.txt
+# Generated runtime data
+mp1/output/**
+mp1/tmp/
+mp1/tmp*/
+mp1/pytest-cache-files-*/
+mp1/corpus/.doc_index.json
+mp1/corpus/.extraction_cache.json
+mp1/nvidia_models.json

README.md CHANGED

@@ -1,11 +1,182 @@
 ---
-title: PlutoV2 MiniProject 3rd-yr
-emoji: 📉
-colorFrom: green
-colorTo: gray
+title: Pluto Pipeline
+emoji: "📄"
+colorFrom: gray
+colorTo: yellow
 sdk: docker
+app_port: 7860
 pinned: false
-short_description: pluto_v2
 ---
-Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
+# Pluto: Real Mode-Switching Extraction Pipeline
+Pluto is a document question-answering system built for research and technical documents. Instead of sending an entire paper to one model and hoping for the best, Pluto separates document understanding from query-time reasoning, routes only relevant chunks, extracts structured claims, merges them into an answer, and verifies support before returning the result.
+The project includes a FastAPI backend, a one-page dashboard, scoped corpus selection, live pipeline progress streaming, evidence-backed answers, confidence reporting, trace summaries, and a baseline comparison view.
+## Why Pluto
+Traditional one-shot PDF chat often struggles with long documents, tables, figures, and answer traceability. Pluto is designed to make that workflow more inspectable and more efficient for project-scale document QA.
+Key goals:
+- query only the relevant parts of a document corpus
+- switch model behavior by chunk type and task difficulty
+- keep document processing reusable across multiple questions
+- surface evidence, agent activity, and confidence to the user
+- support scoped queries to one selected corpus document or the full corpus
+## What The App Does
+- uploads `PDF`, `DOCX/DOC`, `TXT`, and `MD` files into a local corpus
+- converts uploaded files to Markdown and chunks them for retrieval
+- classifies chunks as text, table, figure, code, references, and more
+- runs a staged pipeline: `Route -> Extract -> Merge -> EvidenceCheck`
+- streams live status updates through Server-Sent Events
+- returns a final answer with sections, evidence, trace, confidence, and gaps
+- compares Pluto against a simpler single-model baseline in the benchmark panel
+## Architecture
+```mermaid
+flowchart LR
+    A["Frontend Dashboard"] --> B["FastAPI Server"]
+    B --> C["Upload + Corpus APIs"]
+    B --> D["PipelineRunner"]
+    D --> E["S0 Route"]
+    D --> F["S1 Extract"]
+    D --> G["S2 Merge"]
+    D --> H["S3 EvidenceCheck"]
+    C --> I["DocIndex"]
+    C --> J["Corpus Files"]
+    F --> K["ExtractionCache"]
+    D --> L["Tracer + MessageBus"]
+    B --> M["SSE Progress Stream"]
+```
+## Pipeline Overview
+Pluto operates in two broad phases:
+1. Document understanding
+2. Query-time extraction and answer synthesis
+At query time the main flow is:
+1. `S0 Route`
+   Picks relevant chunks, applies document scope, and assigns a processing mode.
+2. `S1 Extract`
+   Extracts structured claims from selected chunks and reuses cached extraction results when possible.
+3. `S2 Merge`
+   Combines claims into answer sections, open gaps, and key claims.
+4. `S3 EvidenceCheck`
+   Checks whether synthesized claims are present in retrieved chunk text using token overlap and an optional LLM confirmation call.
+## Tech Stack
+- Backend: `FastAPI`, `Uvicorn`, `Pydantic`
+- Frontend: custom `HTML + CSS + vanilla JavaScript`
+- Document parsing: `pdfplumber`, `python-docx`
+- Runtime config: `python-dotenv`
+- Testing: `pytest`
+- Providers: NVIDIA-hosted models when available, with Groq and Mistral fallback paths in the runtime
+## Repo Layout
+```text
+mini-project_3rd_yr-main/
+├─ Dockerfile
+├─ README.md
+├─ pytest.ini
+├─ hf_space/
+└─ mp1/
+   ├─ main.py
+   ├─ requirements.txt
+   ├─ frontend/
+   ├─ pluto/
+   ├─ benchmark/
+   ├─ scripts/
+   ├─ corpus/
+   └─ test_*.py
+```
+Important directories:
+- `mp1/frontend/`: dashboard UI
+- `mp1/pluto/`: backend server, pipeline, stages, routing, caching, tracing
+- `mp1/benchmark/`: Pluto vs baseline comparison logic
+- `mp1/corpus/`: local document corpus and generated corpus state
+- `mp1/scripts/`: utility scripts such as the one-page PDF generator
+## Quick Start
+### 1. Install dependencies
+```bash
+pip install -r mp1/requirements.txt
+```
+### 2. Create your environment file
+Use the example file in [`mp1/.env.example`](mp1/.env.example) and create `mp1/.env`.
+Minimum practical setup:
+- set `NVIDIA_API_KEY` for the NVIDIA-backed stack
+- or set `GROQ_API_KEY` for the fallback stack
+### 3. Run the dashboard
+```bash
+python mp1/main.py --serve --port 8000
+```
+Open `http://127.0.0.1:8000`.
+### 4. Optional CLI run
+```bash
+python mp1/main.py --query "What is this paper about?" --corpus mp1/corpus --output mp1/output
+```
+## Environment Variables
+Runtime code in the repo references these variables:
+- `NVIDIA_API_KEY`
+- `NVIDIA_API_KEY_NANO`
+- `NVIDIA_API_KEY_SUPER`
+- `NVIDIA_API_KEY_VL`
+- `NVIDIA_API_KEY_EMBED`
+- `NVIDIA_API_KEY_RERANK`
+- `NVIDIA_API_KEY_ULTRA`
+- `GROQ_API_KEY`
+- `MISTRAL_API_KEY`
+In practice, the simplest starting point is either:
+- one NVIDIA key through `NVIDIA_API_KEY`
+- or one Groq key through `GROQ_API_KEY`
+## Useful Endpoints
+- `POST /api/run`
+- `GET /api/stream`
+- `POST /api/upload`
+- `GET /api/corpus`
+- `GET /api/doc-status/{doc_id}`
+- `POST /api/compare`
+## Tests
+A focused local suite used during development:
+```bash
+pytest mp1/test_server.py mp1/test_route.py mp1/test_merge.py mp1/test_verify.py mp1/test_doc_index.py -q
+```
+## Notes
+- generated runtime artifacts, logs, temp folders, local caches, and secret files are intentionally excluded through `.gitignore`
+- `mp1/output/` is treated as generated output, not source code
+- corpus metadata such as `mp1/corpus/.doc_index.json` and `mp1/corpus/.extraction_cache.json` is runtime state
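
Note on `S3 EvidenceCheck`: the README above describes this stage as checking synthesized claims against retrieved chunk text using token overlap. The body of `evidence_check.py` is not visible in this part of the diff, so the following is only a minimal sketch of that idea. The function names, the word-level tokenizer, and the `0.6` threshold are all assumptions for illustration, not the repo's actual values, and the optional LLM confirmation call is left out.

```python
import re


def tokenize(text: str) -> set[str]:
    """Lowercase the text and return its set of alphanumeric word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def token_overlap(claim: str, chunk_text: str) -> float:
    """Fraction of the claim's tokens that also occur in the chunk text."""
    claim_tokens = tokenize(claim)
    if not claim_tokens:
        return 0.0
    return len(claim_tokens & tokenize(chunk_text)) / len(claim_tokens)


def is_supported(claim: str, chunks: list[str], threshold: float = 0.6) -> bool:
    """Treat a claim as supported if any chunk meets the assumed threshold."""
    return any(token_overlap(claim, chunk) >= threshold for chunk in chunks)
```

A pure overlap score is cheap but cannot tell a claim from its negation, which is presumably why the README pairs it with an optional LLM confirmation.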

app.py CHANGED

Binary files a/app.py and b/app.py differ

mp1/.env DELETED

@@ -1,15 +0,0 @@
-# NVIDIA NIM Multi-model Keys
-NVIDIA_API_KEY_NANO=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
-NVIDIA_API_KEY_SUPER=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
-NVIDIA_API_KEY_VL=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
-NVIDIA_API_KEY_EMBED=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
-NVIDIA_API_KEY_RERANK=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
-NVIDIA_API_KEY_ULTRA=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
-# Global fallback (defaults to Super if specific not found)
-NVIDIA_API_KEY=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
-# Keep Groq as fallback
-GROQ_API_KEY=gsk_xxxxxxxxxxxxxxxxxxxx
-MISTRAL_API_KEY=...
-GOOGLE_API_KEY=AIzaSyDp-mzHD9Nyk1T3xCPRyrc1RCiVLZzkNy8

mp1/.env.example CHANGED

@@ -15,3 +15,5 @@ NVIDIA_API_KEY_ULTRA=
 GROQ_API_KEY=
 MISTRAL_API_KEY=
+DATABASE_URL=postgresql://user:password@localhost:5432/pluto
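
Note on `DATABASE_URL`: the new example value follows the usual `scheme://user:password@host:port/database` shape. How `mp1/pluto/db.py` actually consumes it is not shown in this part of the diff. As a rough sketch using only the standard library, the parts can be pulled apart like this (`parse_database_url` is a hypothetical helper name, not something from the repo):

```python
from urllib.parse import urlsplit


def parse_database_url(url: str) -> dict:
    """Split a postgresql:// URL into its connection parts."""
    parts = urlsplit(url)
    return {
        "scheme": parts.scheme,
        "user": parts.username,
        "password": parts.password,
        "host": parts.hostname,
        "port": parts.port,
        # The path carries the database name with a leading slash.
        "database": parts.path.lstrip("/"),
    }


# The placeholder value from mp1/.env.example.
settings = parse_database_url("postgresql://user:password@localhost:5432/pluto")
```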

mp1/benchmark/compare.py CHANGED

@@ -30,7 +30,7 @@ def _normalize_detail_level(detail_level: str | None) -> str:
 class SimpleRunner:
     """
     Single-model baseline: one LLM call over top keyword-matched chunks.
-    No routing, no extraction schema, no verification.
+    No routing, no extraction schema, no evidence check.
     """
     def __init__(self, corpus_dir: str, doc_index=None):
@@ -145,7 +145,7 @@ class ComparisonRunner:
                 selected_doc_ids=selected_doc_ids,
                 detail_level=detail_level,
             ),
-            verified=True,
+            evidence_checked=True,
         )
         baseline_metrics = self._run_side(
             "Baseline",
@@ -154,7 +154,7 @@ class ComparisonRunner:
                 selected_doc_ids=selected_doc_ids,
                 detail_level=detail_level,
             ),
-            verified=False,
+            evidence_checked=False,
         )
         winner = "Unavailable"
@@ -174,7 +174,7 @@ class ComparisonRunner:
             "winner": winner,
         }
-    def _run_side(self, label: str, runner, verified: bool) -> dict:
+    def _run_side(self, label: str, runner, evidence_checked: bool) -> dict:
         start_time = time.time()
         try:
             result = runner()
@@ -183,10 +183,10 @@ class ComparisonRunner:
                 "confidence": round(result.confidence, 2),
                 "evidence_count": len(result.evidence),
                 "chunks_processed": result.trace_summary.chunks_processed,
-                "verified": verified,
+                "evidence_checked": evidence_checked,
                 "answer_preview": (result.final_answer.response or "")[:300],
                 "models_used": result.trace_summary.models_used,
-                "real_switching": result.trace_summary.real_switching if verified else False,
+                "real_switching": result.trace_summary.real_switching if evidence_checked else False,
                 "error": None,
             }
         except Exception as exc:
@@ -195,7 +195,7 @@ class ComparisonRunner:
                 "confidence": 0.0,
                 "evidence_count": 0,
                 "chunks_processed": 0,
-                "verified": verified,
+                "evidence_checked": evidence_checked,
                 "answer_preview": f"{label} failed: {exc}"[:300],
                 "models_used": [],
                 "real_switching": False,

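Note on `_run_side`: the hunks above show a pattern where each side of the comparison is timed, run inside a `try`, and flattened into a metrics dict that has the same keys whether the run succeeds or fails. The following is a simplified standalone sketch of that pattern, not the repo's method. The `elapsed_s` key and the `str(exc)` error value are assumptions (those lines fall outside the visible hunks), and `SimpleNamespace` stands in for the real result object.

```python
import time
from types import SimpleNamespace


def run_side(label: str, runner, evidence_checked: bool) -> dict:
    """Run one side of the comparison and flatten its result into metrics."""
    start_time = time.time()
    try:
        result = runner()
        return {
            "elapsed_s": round(time.time() - start_time, 2),  # hypothetical key
            "confidence": round(result.confidence, 2),
            "evidence_count": len(result.evidence),
            "evidence_checked": evidence_checked,
            "error": None,
        }
    except Exception as exc:
        # Same keys as the success path, so the caller can render both sides alike.
        return {
            "elapsed_s": round(time.time() - start_time, 2),
            "confidence": 0.0,
            "evidence_count": 0,
            "evidence_checked": evidence_checked,
            "error": str(exc),  # assumed; the real value is not shown in the diff
        }


def failing_runner():
    """Stand-in for a runner whose provider call fails."""
    raise RuntimeError("provider unavailable")


ok = run_side("Pluto", lambda: SimpleNamespace(confidence=0.876, evidence=["c1", "c2"]), True)
failed = run_side("Baseline", failing_runner, False)
```

Keeping the failure dict shape-compatible with the success dict means the benchmark panel does not need a separate rendering path for errors.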
mp1/corpus/.doc_index.json DELETED

The diff for this file is too large to render. See raw diff

mp1/corpus/.extraction_cache.json DELETED

The diff for this file is too large to render. See raw diff

mp1/frontend/app.js CHANGED

@@ -6,7 +6,7 @@
     detailLevel: 'pluto.detailLevel',
   };
-  const stages = ['route', 'extract', 'merge', 'verify'];
+  const stages = ['route', 'extract', 'merge', 'evidence_check'];
   const stageEls = {};
   const statusEls = {};
   const connectors = document.querySelectorAll('.stage-rail__connector');
@@ -37,6 +37,10 @@
   let uploadProcessingActive = false;
   let pipelineRunning = false;
   let activeEventSource = null;
+  let activeSessionId = null;
+  let previousQuery = '';
+  let previousQueryTimestamp = null;
+  let previousSessionId = null;
   let latestCorpusDocs = [];
   let pendingCorpusDocIds = [];
   let selectedDocIds = loadStoredDocIds();
@@ -155,28 +159,36 @@
     }
     pipelineRunning = true;
+    const sessionId = createSessionId();
+    const queryTimestamp = Date.now();
+    activeSessionId = sessionId;
     syncControls();
     runBtn.innerHTML = '<span class="spinner"></span> Running...';
     resetUI();
     try {
-      await listenSSE();
+      await listenSSE(sessionId);
       const response = await fetch('/api/run', {
         method: 'POST',
         headers: { 'Content-Type': 'application/json' },
-        body: JSON.stringify(buildQueryPayload(query)),
+        body: JSON.stringify(buildQueryPayload(query, sessionId, queryTimestamp)),
       });
       const data = await parseJsonResponse(response, 'Server returned an invalid response');
+      activeSessionId = data.session_id || sessionId;
       if (!response.ok || data.error) {
         throw new Error(data.error || `Server error: ${response.status}`);
       }
       renderResult(data);
+      previousQuery = query;
+      previousQueryTimestamp = queryTimestamp;
+      previousSessionId = data.session_id || sessionId;
     } catch (error) {
       answerBody.innerHTML = renderErrorCard('Pipeline Error', error.message);
       console.error(error);
     } finally {
       closeActiveStream();
       pipelineRunning = false;
+      activeSessionId = null;
       runBtn.innerHTML = '<span class="btn-icon">&#9654;</span> Run Pipeline';
       syncControls();
     }
@@ -214,19 +226,24 @@
     }
   }
-  function buildQueryPayload(query) {
+  function buildQueryPayload(query, sessionId = activeSessionId, queryTimestamp = Date.now()) {
     return {
       query,
+      session_id: sessionId,
+      query_timestamp: queryTimestamp,
+      prev_query: previousQuery,
+      prev_query_timestamp: previousQueryTimestamp,
+      prev_session_id: previousSessionId,
       selected_doc_ids: [...selectedDocIds],
       detail_level: detailLevel,
     };
   }
-  function listenSSE() {
+  function listenSSE(sessionId) {
     closeActiveStream();
     return new Promise((resolve, reject) => {
-      const eventSource = new EventSource('/api/stream');
+      const eventSource = new EventSource(`/api/stream?session_id=${encodeURIComponent(sessionId)}`);
       let opened = false;
       activeEventSource = eventSource;
@@ -334,8 +351,8 @@
         info = `done (${data.extractions} facts)`;
       } else if (stage === 'merge' && data.key_claims) {
         info = `done (${data.key_claims} claims)`;
-      } else if (stage === 'verify' && data.checked) {
-        info = `done (${data.checked} verified)`;
+      } else if (stage === 'evidence_check' && data.checked) {
+        info = `done (${data.checked} checked)`;
       }
       statusEls[stage].innerHTML = `<span class="status-dot status-dot--complete"></span>${esc(info)}`;
@@ -374,9 +391,9 @@
     const gaps = Array.isArray(data.missing_info) ? data.missing_info : [];
     const nextActions = Array.isArray(data.next_actions) ? data.next_actions : [];
     if (gaps.length) {
-      const gapTitle = nextActions.length ? 'Verification / Coverage Gaps Found' : 'Coverage Gaps Noted';
+      const gapTitle = nextActions.length ? 'Evidence Check / Coverage Gaps Found' : 'Coverage Gaps Noted';
       const gapIntro = nextActions.length
-        ? 'Some answer points could not be fully verified from the extracted evidence.'
+        ? 'Some answer points could not be fully supported from the extracted evidence.'
         : 'The detailed answer asked for coverage beyond what the document clearly supports in the selected scope.';
       const gapPrefix = nextActions.length ? 'Need support:' : 'Not clearly covered:';
       html += `
@@ -530,8 +547,8 @@
           <div class="${yesNoClass(stats.real_switching)}">${stats.real_switching ? 'Yes' : 'No'}</div>
         </div>
         <div class="bench-stat">
-          <span class="bench-stat__label">Verified Claims</span>
-          <div class="${yesNoClass(stats.verified)}">${stats.verified ? 'Enabled' : 'Disabled'}</div>
+          <span class="bench-stat__label">Evidence Check</span>
+          <div class="${yesNoClass(stats.evidence_checked)}">${stats.evidence_checked ? 'Enabled' : 'Disabled'}</div>
         </div>
         <div class="bench-stat">
           <span class="bench-stat__label">Evidence Count</span>
@@ -808,6 +825,13 @@
     });
   }
+  function createSessionId() {
+    if (window.crypto && typeof window.crypto.randomUUID === 'function') {
+      return window.crypto.randomUUID();
+    }
+    return `session-${Date.now()}-${Math.random().toString(16).slice(2)}`;
+  }
   function toggleDocSelection(docId) {
     if (!docId) {
       return;
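
Note on the session plumbing: the frontend now generates a session id, opens `/api/stream?session_id=...` before posting to `/api/run`, and sends the previous query's metadata alongside the new one. For anyone driving the API from a script rather than the dashboard, a rough Python equivalent of the client-side helpers might look like the sketch below. The field names come from `buildQueryPayload` in the diff; the function names, the `previous` dict, and the `"standard"` detail level in the usage lines are assumptions for illustration.

```python
import time
import uuid
from urllib.parse import quote


def create_session_id() -> str:
    """Return a random UUID string, analogous to crypto.randomUUID() in the browser."""
    return str(uuid.uuid4())


def stream_url(session_id: str) -> str:
    """Build the SSE URL; quote(..., safe='') percent-encodes reserved characters."""
    return f"/api/stream?session_id={quote(session_id, safe='')}"


def build_query_payload(query, session_id, selected_doc_ids, detail_level, previous=None):
    """Mirror the JSON body the dashboard posts to /api/run."""
    previous = previous or {}
    return {
        "query": query,
        "session_id": session_id,
        "query_timestamp": int(time.time() * 1000),  # milliseconds, like Date.now()
        "prev_query": previous.get("query", ""),
        "prev_query_timestamp": previous.get("timestamp"),
        "prev_session_id": previous.get("session_id"),
        "selected_doc_ids": list(selected_doc_ids),
        "detail_level": detail_level,
    }


session_id = create_session_id()
payload = build_query_payload("What is this paper about?", session_id, [], "standard")
```

On the first query of a session `prev_query` is the empty string and the other two `prev_*` fields are `None`, which matches the initial values of the JavaScript variables.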

mp1/frontend/index.html CHANGED

@@ -110,10 +110,10 @@
         <div class="stage-card__status" id="status-merge">idle</div>
       </div>
       <div class="stage-rail__connector"></div>
-      <div class="stage-card" data-stage="verify" id="stage-verify">
+      <div class="stage-card" data-stage="evidence_check" id="stage-evidence_check">
         <div class="stage-card__number">S3</div>
-        <div class="stage-card__label">VERIFY</div>
-        <div class="stage-card__status" id="status-verify">idle</div>
+        <div class="stage-card__label">EVIDENCE CHECK</div>
+        <div class="stage-card__status" id="status-evidence_check">idle</div>
       </div>
     </div>
   </section>
@@ -182,7 +182,7 @@
   </main>
   <footer class="footer">
-    <span>Pluto v2 Pipeline | Deterministic routing | Real model switching | Evidence verification</span>
+    <span>Pluto v2 Pipeline | Deterministic routing | Real model switching | Evidence checking</span>
   </footer>
   <script src="/static/app.js?v=5"></script>

mp1/main.py CHANGED

@@ -1,3 +1,4 @@
+# -*- coding: utf-8 -*-
 """
 main.py — CLI entry point for the Pluto pipeline.
@@ -90,7 +91,7 @@ def _start_server(port: int):
 def _stage_num(stage: str) -> int:
-    return {"route": 0, "extract": 1, "merge": 2, "verify": 3, "finish": 4}.get(stage, -1)
+    return {"route": 0, "extract": 1, "merge": 2, "evidence_check": 3, "finish": 4}.get(stage, -1)
 if __name__ == "__main__":

mp1/nvidia_models.json DELETED

@@ -1,1133 +0,0 @@
-{
-  "object": "list",
-  "data": [
-    {
-      "id": "01-ai/yi-large",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "01-ai"
-    },
-    {
-      "id": "abacusai/dracarys-llama-3.1-70b-instruct",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "abacusai"
-    },
-    {
-      "id": "adept/fuyu-8b",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "adept"
-    },
-    {
-      "id": "ai21labs/jamba-1.5-large-instruct",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "ai21labs"
-    },
-    {
-      "id": "ai21labs/jamba-1.5-mini-instruct",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "ai21labs"
-    },
-    {
-      "id": "aisingapore/sea-lion-7b-instruct",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "aisingapore"
-    },
-    {
-      "id": "baai/bge-m3",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "baai"
-    },
-    {
-      "id": "baichuan-inc/baichuan2-13b-chat",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "baichuan-inc"
-    },
-    {
-      "id": "bigcode/starcoder2-15b",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "bigcode"
-    },
-    {
-      "id": "bigcode/starcoder2-7b",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "bigcode"
-    },
-    {
-      "id": "bytedance/seed-oss-36b-instruct",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "bytedance"
-    },
-    {
-      "id": "databricks/dbrx-instruct",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "databricks"
-    },
-    {
-      "id": "deepseek-ai/deepseek-coder-6.7b-instruct",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "deepseek-ai"
-    },
-    {
-      "id": "deepseek-ai/deepseek-r1-distill-llama-8b",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "deepseek-ai"
-    },
-    {
-      "id": "deepseek-ai/deepseek-r1-distill-qwen-14b",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "deepseek-ai"
-    },
-    {
-      "id": "deepseek-ai/deepseek-r1-distill-qwen-32b",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "deepseek-ai"
-    },
-    {
-      "id": "deepseek-ai/deepseek-r1-distill-qwen-7b",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "deepseek-ai"
-    },
-    {
-      "id": "deepseek-ai/deepseek-v3.1",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "deepseek-ai"
-    },
-    {
-      "id": "deepseek-ai/deepseek-v3.1-terminus",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "deepseek-ai"
-    },
-    {
-      "id": "deepseek-ai/deepseek-v3.2",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "deepseek-ai"
-    },
-    {
-      "id": "google/codegemma-1.1-7b",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "google"
-    },
-    {
-      "id": "google/codegemma-7b",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "google"
-    },
-    {
-      "id": "google/deplot",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "google"
-    },
-    {
-      "id": "google/gemma-2-27b-it",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "google"
-    },
-    {
-      "id": "google/gemma-2-2b-it",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "google"
-    },
-    {
-      "id": "google/gemma-2-9b-it",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "google"
-    },
-    {
-      "id": "google/gemma-2b",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "google"
-    },
-    {
-      "id": "google/gemma-3-12b-it",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "google"
-    },
-    {
-      "id": "google/gemma-3-1b-it",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "google"
-    },
-    {
-      "id": "google/gemma-3-27b-it",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "google"
-    },
-    {
-      "id": "google/gemma-3-4b-it",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "google"
-    },
-    {
-      "id": "google/gemma-3n-e2b-it",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "google"
-    },
-    {
-      "id": "google/gemma-3n-e4b-it",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "google"
-    },
-    {
-      "id": "google/gemma-7b",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "google"
-    },
-    {
-      "id": "google/paligemma",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "google"
-    },
-    {
-      "id": "google/recurrentgemma-2b",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "google"
-    },
-    {
-      "id": "google/shieldgemma-9b",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "google"
-    },
-    {
-      "id": "gotocompany/gemma-2-9b-cpt-sahabatai-instruct",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "gotocompany"
-    },
-    {
-      "id": "ibm/granite-3.0-3b-a800m-instruct",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "ibm"
-    },
-    {
-      "id": "ibm/granite-3.0-8b-instruct",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "ibm"
-    },
-    {
-      "id": "ibm/granite-3.3-8b-instruct",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "ibm"
-    },
-    {
-      "id": "ibm/granite-34b-code-instruct",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "ibm"
-    },
-    {
-      "id": "ibm/granite-8b-code-instruct",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "ibm"
-    },
-    {
-      "id": "ibm/granite-guardian-3.0-8b",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "ibm"
-    },
-    {
-      "id": "igenius/colosseum_355b_instruct_16k",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "igenius"
-    },
-    {
-      "id": "igenius/italia_10b_instruct_16k",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "igenius"
-    },
-    {
-      "id": "institute-of-science-tokyo/llama-3.1-swallow-70b-instruct-v0.1",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "institute-of-science-tokyo"
-    },
-    {
-      "id": "institute-of-science-tokyo/llama-3.1-swallow-8b-instruct-v0.1",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "institute-of-science-tokyo"
-    },
-    {
-      "id": "marin/marin-8b-instruct",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "marin"
-    },
-    {
-      "id": "mediatek/breeze-7b-instruct",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "mediatek"
-    },
-    {
-      "id": "meta/codellama-70b",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "meta"
-    },
-    {
-      "id": "meta/llama-3.1-405b-instruct",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "meta"
-    },
-    {
-      "id": "meta/llama-3.1-70b-instruct",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "meta"
-    },
-    {
-      "id": "meta/llama-3.1-8b-instruct",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "meta"
-    },
-    {
-      "id": "meta/llama-3.2-11b-vision-instruct",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "meta"
-    },
-    {
-      "id": "meta/llama-3.2-1b-instruct",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "meta"
-    },
-    {
-      "id": "meta/llama-3.2-3b-instruct",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "meta"
-    },
-    {
-      "id": "meta/llama-3.2-90b-vision-instruct",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "meta"
-    },
-    {
-      "id": "meta/llama-3.3-70b-instruct",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "meta"
-    },
-    {
-      "id": "meta/llama-4-maverick-17b-128e-instruct",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "meta"
-    },
-    {
-      "id": "meta/llama-4-scout-17b-16e-instruct",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "meta"
-    },
-    {
-      "id": "meta/llama-guard-4-12b",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "meta"
-    },
-    {
-      "id": "meta/llama2-70b",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "meta"
-    },
-    {
-      "id": "meta/llama3-70b-instruct",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "meta"
-    },
-    {
-      "id": "meta/llama3-8b-instruct",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "meta"
-    },
-    {
-      "id": "microsoft/kosmos-2",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "microsoft"
-    },
-    {
-      "id": "microsoft/phi-3-medium-128k-instruct",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "microsoft"
-    },
-    {
-      "id": "microsoft/phi-3-medium-4k-instruct",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "microsoft"
-    },
-    {
-      "id": "microsoft/phi-3-mini-128k-instruct",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "microsoft"
-    },
-    {
-      "id": "microsoft/phi-3-mini-4k-instruct",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "microsoft"
-    },
-    {
-      "id": "microsoft/phi-3-small-128k-instruct",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "microsoft"
-    },
-    {
-      "id": "microsoft/phi-3-small-8k-instruct",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "microsoft"
-    },
-    {
-      "id": "microsoft/phi-3-vision-128k-instruct",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "microsoft"
-    },
-    {
-      "id": "microsoft/phi-3.5-mini-instruct",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "microsoft"
-    },
-    {
-      "id": "microsoft/phi-3.5-moe-instruct",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "microsoft"
-    },
-    {
-      "id": "microsoft/phi-3.5-vision-instruct",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "microsoft"
-    },
-    {
-      "id": "microsoft/phi-4-mini-flash-reasoning",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "microsoft"
-    },
-    {
-      "id": "microsoft/phi-4-mini-instruct",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "microsoft"
-    },
-    {
-      "id": "microsoft/phi-4-multimodal-instruct",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "microsoft"
-    },
-    {
-      "id": "minimaxai/minimax-m2.5",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "minimaxai"
-    },
-    {
-      "id": "mistralai/codestral-22b-instruct-v0.1",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "mistralai"
-    },
-    {
-      "id": "mistralai/devstral-2-123b-instruct-2512",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "mistralai"
-    },
-    {
-      "id": "mistralai/magistral-small-2506",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "mistralai"
-    },
-    {
-      "id": "mistralai/mamba-codestral-7b-v0.1",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "mistralai"
-    },
-    {
-      "id": "mistralai/mathstral-7b-v0.1",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "mistralai"
-    },
-    {
-      "id": "mistralai/ministral-14b-instruct-2512",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "mistralai"
-    },
-    {
-      "id": "mistralai/mistral-7b-instruct-v0.2",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "mistralai"
-    },
-    {
-      "id": "mistralai/mistral-7b-instruct-v0.3",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "mistralai"
-    },
-    {
-      "id": "mistralai/mistral-large",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "mistralai"
-    },
-    {
-      "id": "mistralai/mistral-large-2-instruct",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "mistralai"
-    },
-    {
-      "id": "mistralai/mistral-large-3-675b-instruct-2512",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "mistralai"
-    },
-    {
-      "id": "mistralai/mistral-medium-3-instruct",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "mistralai"
-    },
-    {
-      "id": "mistralai/mistral-nemotron",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "mistralai"
-    },
-    {
-      "id": "mistralai/mistral-small-24b-instruct",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "mistralai"
-    },
-    {
-      "id": "mistralai/mistral-small-3.1-24b-instruct-2503",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "mistralai"
-    },
-    {
-      "id": "mistralai/mistral-small-4-119b-2603",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "mistralai"
-    },
-    {
-      "id": "mistralai/mixtral-8x22b-instruct-v0.1",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "mistralai"
-    },
-    {
-      "id": "mistralai/mixtral-8x22b-v0.1",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "mistralai"
-    },
-    {
-      "id": "mistralai/mixtral-8x7b-instruct-v0.1",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "mistralai"
-    },
-    {
-      "id": "moonshotai/kimi-k2-instruct",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "moonshotai"
-    },
-    {
-      "id": "moonshotai/kimi-k2-instruct-0905",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "moonshotai"
-    },
-    {
-      "id": "moonshotai/kimi-k2-thinking",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "moonshotai"
-    },
-    {
-      "id": "moonshotai/kimi-k2.5",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "moonshotai"
-    },
-    {
-      "id": "nv-mistralai/mistral-nemo-12b-instruct",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "nv-mistralai"
-    },
-    {
-      "id": "nvidia/cosmos-reason2-8b",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "nvidia"
-    },
-    {
-      "id": "nvidia/embed-qa-4",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "nvidia"
-    },
-    {
-      "id": "nvidia/gliner-pii",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "nvidia"
-    },
-    {
-      "id": "nvidia/llama-3.1-nemoguard-8b-content-safety",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "nvidia"
-    },
-    {
-      "id": "nvidia/llama-3.1-nemoguard-8b-topic-control",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "nvidia"
-    },
-    {
-      "id": "nvidia/llama-3.1-nemotron-51b-instruct",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "nvidia"
-    },
-    {
-      "id": "nvidia/llama-3.1-nemotron-70b-instruct",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "nvidia"
-    },
-    {
-      "id": "nvidia/llama-3.1-nemotron-70b-reward",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "nvidia"
-    },
-    {
-      "id": "nvidia/llama-3.1-nemotron-nano-4b-v1.1",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "nvidia"
-    },
-    {
-      "id": "nvidia/llama-3.1-nemotron-nano-8b-v1",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "nvidia"
-    },
-    {
-      "id": "nvidia/llama-3.1-nemotron-nano-vl-8b-v1",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "nvidia"
-    },
-    {
-      "id": "nvidia/llama-3.1-nemotron-safety-guard-8b-v3",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "nvidia"
-    },
-    {
-      "id": "nvidia/llama-3.1-nemotron-ultra-253b-v1",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "nvidia"
-    },
-    {
-      "id": "nvidia/llama-3.2-nemoretriever-1b-vlm-embed-v1",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "nvidia"
-    },
-    {
-      "id": "nvidia/llama-3.2-nemoretriever-300m-embed-v1",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "nvidia"
-    },
-    {
-      "id": "nvidia/llama-3.2-nv-embedqa-1b-v1",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "nvidia"
-    },
-    {
-      "id": "nvidia/llama-3.2-nv-embedqa-1b-v2",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "nvidia"
-    },
-    {
-      "id": "nvidia/llama-3.3-nemotron-super-49b-v1",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "nvidia"
-    },
-    {
-      "id": "nvidia/llama-3.3-nemotron-super-49b-v1.5",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "nvidia"
-    },
-    {
-      "id": "nvidia/llama-nemotron-embed-1b-v2",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "nvidia"
-    },
-    {
-      "id": "nvidia/llama-nemotron-embed-vl-1b-v2",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "nvidia"
-    },
-    {
-      "id": "nvidia/llama3-chatqa-1.5-70b",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "nvidia"
-    },
-    {
-      "id": "nvidia/llama3-chatqa-1.5-8b",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "nvidia"
-    },
-    {
-      "id": "nvidia/mistral-nemo-minitron-8b-8k-instruct",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "nvidia"
-    },
-    {
-      "id": "nvidia/mistral-nemo-minitron-8b-base",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "nvidia"
-    },
-    {
-      "id": "nvidia/nemoretriever-parse",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "nvidia"
-    },
-    {
-      "id": "nvidia/nemotron-3-nano-30b-a3b",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "nvidia"
-    },
-    {
-      "id": "nvidia/nemotron-3-super-120b-a12b",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "nvidia"
-    },
-    {
-      "id": "nvidia/nemotron-4-340b-instruct",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "nvidia"
-    },
-    {
-      "id": "nvidia/nemotron-4-340b-reward",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "nvidia"
-    },
-    {
-      "id": "nvidia/nemotron-4-mini-hindi-4b-instruct",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "nvidia"
-    },
-    {
-      "id": "nvidia/nemotron-content-safety-reasoning-4b",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "nvidia"
-    },
-    {
-      "id": "nvidia/nemotron-mini-4b-instruct",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "nvidia"
-    },
-    {
-      "id": "nvidia/nemotron-nano-12b-v2-vl",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "nvidia"
-    },
-    {
-      "id": "nvidia/nemotron-nano-3-30b-a3b",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "nvidia"
-    },
-    {
-      "id": "nvidia/nemotron-parse",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "nvidia"
-    },
-    {
-      "id": "nvidia/neva-22b",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "nvidia"
-    },
-    {
-      "id": "nvidia/nv-embed-v1",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "nvidia"
-    },
-    {
-      "id": "nvidia/nv-embedcode-7b-v1",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "nvidia"
-    },
-    {
-      "id": "nvidia/nv-embedqa-e5-v5",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "nvidia"
-    },
-    {
-      "id": "nvidia/nv-embedqa-mistral-7b-v2",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "nvidia"
-    },
-    {
-      "id": "nvidia/nvclip",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "nvidia"
-    },
-    {
-      "id": "nvidia/nvidia-nemotron-nano-9b-v2",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "nvidia"
-    },
-    {
-      "id": "nvidia/riva-translate-4b-instruct",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "nvidia"
-    },
-    {
-      "id": "nvidia/riva-translate-4b-instruct-v1.1",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "nvidia"
-    },
-    {
-      "id": "nvidia/streampetr",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "nvidia"
-    },
-    {
-      "id": "nvidia/usdcode-llama-3.1-70b-instruct",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "nvidia"
-    },
-    {
-      "id": "nvidia/vila",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "nvidia"
-    },
-    {
-      "id": "openai/gpt-oss-120b",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "openai"
-    },
-    {
-      "id": "openai/gpt-oss-120b",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "openai"
-    },
-    {
-      "id": "openai/gpt-oss-20b",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "openai"
-    },
-    {
-      "id": "openai/gpt-oss-20b",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "openai"
-    },
-    {
-      "id": "opengpt-x/teuken-7b-instruct-commercial-v0.4",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "opengpt-x"
-    },
-    {
-      "id": "qwen/qwen2-7b-instruct",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "qwen"
-    },
-    {
-      "id": "qwen/qwen2.5-7b-instruct",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "qwen"
-    },
-    {
-      "id": "qwen/qwen2.5-coder-32b-instruct",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "qwen"
-    },
-    {
-      "id": "qwen/qwen2.5-coder-7b-instruct",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "qwen"
-    },
-    {
-      "id": "qwen/qwen3-coder-480b-a35b-instruct",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "qwen"
-    },
-    {
-      "id": "qwen/qwen3-next-80b-a3b-instruct",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "qwen"
-    },
-    {
-      "id": "qwen/qwen3-next-80b-a3b-thinking",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "qwen"
-    },
-    {
-      "id": "qwen/qwen3.5-122b-a10b",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "qwen"
-    },
-    {
-      "id": "qwen/qwen3.5-397b-a17b",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "qwen"
-    },
-    {
-      "id": "qwen/qwq-32b",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "qwen"
-    },
-    {
-      "id": "rakuten/rakutenai-7b-chat",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "rakuten"
-    },
-    {
-      "id": "rakuten/rakutenai-7b-instruct",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "rakuten"
-    },
-    {
-      "id": "sarvamai/sarvam-m",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "sarvamai"
-    },
-    {
-      "id": "snowflake/arctic-embed-l",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "snowflake"
-    },
-    {
-      "id": "speakleash/bielik-11b-v2.3-instruct",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "speakleash"
-    },
-    {
-      "id": "speakleash/bielik-11b-v2.6-instruct",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "speakleash"
-    },
-    {
-      "id": "stepfun-ai/step-3.5-flash",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "stepfun-ai"
-    },
-    {
-      "id": "stockmark/stockmark-2-100b-instruct",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "stockmark"
-    },
-    {
-      "id": "thudm/chatglm3-6b",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "thudm"
-    },
-    {
-      "id": "tiiuae/falcon3-7b-instruct",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "tiiuae"
-    },
-    {
-      "id": "tokyotech-llm/llama-3-swallow-70b-instruct-v0.1",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "tokyotech-llm"
-    },
-    {
-      "id": "upstage/solar-10.7b-instruct",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "upstage"
-    },
-    {
-      "id": "utter-project/eurollm-9b-instruct",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "utter-project"
-    },
-    {
-      "id": "writer/palmyra-creative-122b",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "writer"
-    },
-    {
-      "id": "writer/palmyra-fin-70b-32k",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "writer"
-    },
-    {
-      "id": "writer/palmyra-med-70b",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "writer"
-    },
-    {
-      "id": "writer/palmyra-med-70b-32k",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "writer"
-    },
-    {
-      "id": "yentinglin/llama-3-taiwan-70b-instruct",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "yentinglin"
-    },
-    {
-      "id": "z-ai/glm4.7",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "z-ai"
-    },
-    {
-      "id": "z-ai/glm5",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "z-ai"
-    },
-    {
-      "id": "zyphra/zamba2-7b-instruct",
-      "object": "model",
-      "created": 735790403,
-      "owned_by": "zyphra"
-    }
-  ]
-}

mp1/pluto/__init__.py CHANGED

@@ -1,3 +1,4 @@
 """Pluto — Real Mode-Switching Pipeline."""
 __version__ = "1.0.0"

+# -*- coding: utf-8 -*-
 """Pluto — Real Mode-Switching Pipeline."""
 __version__ = "1.0.0"

mp1/pluto/bus.py CHANGED

@@ -1,3 +1,4 @@
 """
 pluto/bus.py — Lightweight in-memory message bus for agent communication.

+# -*- coding: utf-8 -*-
 """
 pluto/bus.py — Lightweight in-memory message bus for agent communication.

mp1/pluto/chunker.py CHANGED

@@ -1,3 +1,4 @@
 """
 pluto/chunker.py — Chunk classifier (spec §4).

+# -*- coding: utf-8 -*-
 """
 pluto/chunker.py — Chunk classifier (spec §4).

mp1/pluto/db.py ADDED

	@@ -0,0 +1,66 @@

+# -*- coding: utf-8 -*-
+"""
+Shared lazy PostgreSQL helpers.
+Importing this module does not require PostgreSQL or psycopg2. A connection is
+attempted only when a caller explicitly asks for one.
+"""
+from __future__ import annotations
+import os
+def _get_connection():
+    """Return a PostgreSQL connection, creating schema on first use."""
+    database_url = os.getenv("DATABASE_URL", "").strip()
+    if not database_url:
+        raise EnvironmentError("DATABASE_URL is not set")
+    try:
+        import psycopg2
+    except Exception as exc:
+        raise EnvironmentError("psycopg2 is required for PostgreSQL session memory") from exc
+    conn = psycopg2.connect(database_url)
+    _ensure_schema(conn)
+    return conn
+def _ensure_schema(conn) -> None:
+    with conn.cursor() as cur:
+        cur.execute(
+            """
+            CREATE TABLE IF NOT EXISTS session_memory (
+                session_id TEXT PRIMARY KEY,
+                doc_id TEXT NOT NULL,
+                created_at TIMESTAMP DEFAULT NOW(),
+                compressed_json JSONB NOT NULL,
+                raw_path TEXT
+            );
+            """
+        )
+        cur.execute(
+            """
+            CREATE TABLE IF NOT EXISTS response_signals (
+                id SERIAL PRIMARY KEY,
+                session_id TEXT,
+                query_hash TEXT,
+                signal_type TEXT,
+                created_at TIMESTAMP DEFAULT NOW()
+            );
+            """
+        )
+        cur.execute(
+            """
+            CREATE TABLE IF NOT EXISTS session_graph (
+                source_session TEXT,
+                target_session TEXT,
+                confidence FLOAT,
+                reason TEXT,
+                created_at TIMESTAMP DEFAULT NOW(),
+                PRIMARY KEY (source_session, target_session)
+            );
+            """
+        )
+    conn.commit()
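
Reviewer note: the lazy-connection behaviour described in the `db.py` docstring can be illustrated with a small standalone sketch. This is not the committed code: `get_connection_lazily` and its `env` parameter are hypothetical names introduced here so the pattern can be exercised without a database, and catching `ImportError` (rather than the broader `Exception` in the diff) is a choice made for this sketch only.

```python
import os


def get_connection_lazily(env=os.environ):
    """Validate configuration first, import the driver only when needed.

    Hypothetical stand-in for pluto.db._get_connection; the schema setup
    step from the diff is intentionally left out.
    """
    database_url = env.get("DATABASE_URL", "").strip()
    if not database_url:
        # Fail before touching psycopg2 so the module stays importable
        # on machines without PostgreSQL.
        raise EnvironmentError("DATABASE_URL is not set")
    try:
        import psycopg2  # deferred import: only reached with a URL configured
    except ImportError as exc:
        raise EnvironmentError("psycopg2 is required for PostgreSQL access") from exc
    return psycopg2.connect(database_url)


if __name__ == "__main__":
    try:
        get_connection_lazily(env={})
    except EnvironmentError as err:
        print(f"connection skipped: {err}")
```

Because the URL check runs before the import, callers that never request a connection do not need `psycopg2` installed, which matches the intent stated in the module docstring.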

mp1/pluto/dispatcher.py CHANGED

@@ -1,3 +1,4 @@
 """
 pluto/dispatcher.py — Provider dispatch + NVIDIA helper utilities.
@@ -226,7 +227,7 @@ def _call_nvidia(cfg: ModeConfig, prompt: str) -> str:
     prefix = str(prompt)[:120]
     use_reasoning = any(
         kw in prefix
-        for kw in ["CRITIC:", "JUDGE:", "You are an evidence verification", "challenge each"]
     )
     payload = {

+# -*- coding: utf-8 -*-
 """
 pluto/dispatcher.py — Provider dispatch + NVIDIA helper utilities.
     prefix = str(prompt)[:120]
     use_reasoning = any(
         kw in prefix
+        for kw in ["CRITIC:", "JUDGE:", "You are an evidence checking", "challenge each"]
     )
     payload = {
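
Reviewer note: the hunk above changes one of the keywords that switch on reasoning mode. The check only looks at the first 120 characters of the prompt, so the prompt text must begin with the new wording for the match to fire. A minimal sketch of that check, assuming the keyword list shown on the added line; `wants_reasoning` is a hypothetical helper name, not a function in `dispatcher.py`.

```python
# Keywords copied from the added line in the diff.
REASONING_KEYWORDS = [
    "CRITIC:",
    "JUDGE:",
    "You are an evidence checking",
    "challenge each",
]


def wants_reasoning(prompt) -> bool:
    """Return True when a keyword appears in the first 120 characters."""
    prefix = str(prompt)[:120]
    return any(kw in prefix for kw in REASONING_KEYWORDS)


if __name__ == "__main__":
    print(wants_reasoning("JUDGE: rate the following answer"))
    # The old wording no longer matches after this commit.
    print(wants_reasoning("You are an evidence verification agent"))
    # A keyword that only appears after character 120 is ignored.
    print(wants_reasoning("x" * 130 + "CRITIC:"))
```

Any prompt template that still opens with the old "evidence verification" wording would silently stop triggering reasoning mode, so the renamed stage prompt and this list need to stay in sync.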

mp1/pluto/doc_index.py CHANGED

@@ -1,3 +1,4 @@
 """
 pluto/doc_index.py — In-memory document index with disk persistence.

+# -*- coding: utf-8 -*-
 """
 pluto/doc_index.py — In-memory document index with disk persistence.

mp1/pluto/doc_summary.py ADDED

	@@ -0,0 +1,173 @@

+# -*- coding: utf-8 -*-
+"""
+Document-level summary storage and context prefix helpers.
+This module is deliberately lazy: importing it does not require provider keys or
+database/network availability. LLM/provider errors are handled inside
+generate_doc_summary with a fallback summary.
+"""
+from __future__ import annotations
+from datetime import datetime, timezone
+import json
+import logging
+from pathlib import Path
+from typing import Any
+from pydantic import BaseModel, Field
+from pluto.utils import extract_json_from_response
+logger = logging.getLogger("pluto")
+SUMMARY_FILENAME = ".doc_summaries.json"
+class DocSummary(BaseModel):
+    doc_id: str
+    title: str = ""
+    domain: str = ""
+    key_claims: list[str] = Field(default_factory=list)
+    structure: list[str] = Field(default_factory=list)
+    open_questions: list[str] = Field(default_factory=list)
+    created_at: str
+def generate_doc_summary(doc_id: str, corpus_dir: str | Path) -> DocSummary:
+    """Generate and persist a document summary, falling back on failure."""
+    corpus_path = Path(corpus_dir)
+    doc_text = _read_document_text(doc_id, corpus_path)
+    created_at = _utc_now()
+    try:
+        raw = _call_summary_llm(doc_id=doc_id, doc_text=doc_text)
+        summary = _parse_summary(doc_id=doc_id, raw=raw, created_at=created_at)
+    except Exception as exc:
+        logger.warning("Failed to generate document summary for %s: %s", doc_id, exc)
+        summary = _fallback_summary(doc_id=doc_id, created_at=created_at)
+    summaries = load_doc_summaries(corpus_path)
+    summaries[doc_id] = summary
+    save_doc_summaries(corpus_path, summaries)
+    return summary
+def load_doc_summary(doc_id: str, corpus_dir: str | Path) -> DocSummary | None:
+    """Load one stored document summary if present."""
+    return load_doc_summaries(corpus_dir).get(doc_id)
+def load_doc_summaries(corpus_dir: str | Path) -> dict[str, DocSummary]:
+    """Load all document summaries from disk."""
+    path = _summary_path(corpus_dir)
+    if not path.exists():
+        return {}
+    try:
+        raw = path.read_text(encoding="utf-8")
+        data = json.loads(raw)
+        return {
+            str(doc_id): DocSummary(**summary_data)
+            for doc_id, summary_data in data.items()
+            if isinstance(summary_data, dict)
+        }
+    except Exception as exc:
+        logger.warning("Failed to load document summaries from %s: %s", path, exc)
+        return {}
+def save_doc_summaries(corpus_dir: str | Path, summaries: dict[str, DocSummary]) -> None:
+    """Persist all document summaries as JSON."""
+    path = _summary_path(corpus_dir)
+    path.parent.mkdir(parents=True, exist_ok=True)
+    data = {doc_id: summary.model_dump() for doc_id, summary in summaries.items()}
+    path.write_text(json.dumps(data, ensure_ascii=False, indent=1), encoding="utf-8")
+def apply_doc_summary_context(chunk_text: str, doc_id: str, corpus_dir: str | Path) -> str:
+    """Prepend stored document context to a chunk, if available."""
+    summary = load_doc_summary(doc_id, corpus_dir)
+    if not summary:
+        logger.warning("No document summary found for %s", doc_id)
+        return chunk_text
+    key_claims = "; ".join(summary.key_claims)
+    prefix = (
+        f"[Document context: {summary.title} | Domain: {summary.domain} | "
+        f"Key claims: {key_claims}]"
+    )
+    return f"{prefix}\n\n{chunk_text}"
+def _call_summary_llm(doc_id: str, doc_text: str) -> str:
+    """Call the configured quick model for summary JSON."""
+    from pluto.dispatcher import dispatch
+    from pluto.modes import get_mode
+    get_mode("MODE_QUICK")
+    prompt = f"""Summarize this document as JSON only.
+Schema:
+{{
+  "title": "short title",
+  "domain": "subject/domain",
+  "key_claims": ["claim1", "claim2"],
+  "structure": ["intro", "methodology", "results", "conclusion"],
+  "open_questions": ["question1"]
+}}
+Document id: {doc_id}
+Document text:
+---
+{doc_text[:14000]}
+---
+"""
+    return dispatch("MODE_QUICK", prompt)
+def _parse_summary(doc_id: str, raw: str, created_at: str) -> DocSummary:
+    data = json.loads(extract_json_from_response(raw))
+    return DocSummary(
+        doc_id=doc_id,
+        title=str(data.get("title", "")),
+        domain=str(data.get("domain", "")),
+        key_claims=_string_list(data.get("key_claims")),
+        structure=_string_list(data.get("structure")),
+        open_questions=_string_list(data.get("open_questions")),
+        created_at=created_at,
+    )
+def _fallback_summary(doc_id: str, created_at: str) -> DocSummary:
+    return DocSummary(
+        doc_id=doc_id,
+        title=doc_id,
+        domain="",
+        key_claims=[],
+        structure=[],
+        open_questions=[],
+        created_at=created_at,
+    )
+def _read_document_text(doc_id: str, corpus_dir: Path) -> str:
+    for ext in (".md", ".txt"):
+        path = corpus_dir / f"{doc_id}{ext}"
+        if path.exists():
+            return path.read_text(encoding="utf-8", errors="replace")
+    return ""
+def _summary_path(corpus_dir: str | Path) -> Path:
+    return Path(corpus_dir) / SUMMARY_FILENAME
+def _string_list(value: Any) -> list[str]:
+    if not isinstance(value, list):
+        return []
+    return [str(item) for item in value if str(item).strip()]
+def _utc_now() -> str:
+    return datetime.now(timezone.utc).isoformat()
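
Reviewer note: the context prefix that `apply_doc_summary_context` prepends to each chunk can be previewed with the sketch below. It reuses the format string from the diff, but `SummaryStub` and `build_context_prefix` are hypothetical stand-ins so the example runs without Pydantic or a corpus directory.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SummaryStub:
    """Hypothetical stand-in for the fields of DocSummary used by the prefix."""
    title: str = ""
    domain: str = ""
    key_claims: List[str] = field(default_factory=list)


def build_context_prefix(summary: SummaryStub, chunk_text: str) -> str:
    """Prepend a one-line document context header to a chunk."""
    key_claims = "; ".join(summary.key_claims)
    prefix = (
        f"[Document context: {summary.title} | Domain: {summary.domain} | "
        f"Key claims: {key_claims}]"
    )
    # A blank line separates the header from the chunk body.
    return f"{prefix}\n\n{chunk_text}"


if __name__ == "__main__":
    stub = SummaryStub(
        title="Q3 Report",
        domain="finance",
        key_claims=["Revenue grew", "Costs fell"],
    )
    print(build_context_prefix(stub, "Chunk body."))
```

With the fallback summary from the diff (title set to the document id, empty domain and claims), the header still renders but carries little information, which may be worth keeping in mind when reading extraction results for documents whose summary call failed.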

mp1/pluto/embedder.py CHANGED

@@ -1,3 +1,4 @@
 """
 pluto/embedder.py — Semantic chunking via NVIDIA NIM embedding endpoint.

+# -*- coding: utf-8 -*-
 """
 pluto/embedder.py — Semantic chunking via NVIDIA NIM embedding endpoint.

mp1/pluto/extraction_cache.py CHANGED

@@ -1,3 +1,4 @@
 """
 pluto/extraction_cache.py — Persistent cache for S1 EXTRACT results.

+# -*- coding: utf-8 -*-
 """
 pluto/extraction_cache.py — Persistent cache for S1 EXTRACT results.

mp1/pluto/ingest.py CHANGED

@@ -1,3 +1,4 @@
 """
 pluto/ingest.py — File ingestion: convert uploaded files to corpus Markdown.
@@ -95,15 +96,30 @@ def ingest_file(
 def _extract_pdf(path: Path) -> str:
-    """Extract text from PDF using PyPDF2."""
-    from PyPDF2 import PdfReader
-    reader = PdfReader(str(path))
     pages = []
-    for i, page in enumerate(reader.pages):
-        text = page.extract_text() or ""
-        if text.strip():
-            pages.append(f"## Page {i + 1}\n\n{text.strip()}")
     return "\n\n".join(pages)
@@ -193,4 +209,3 @@ def _classify_and_tag_chunks(chunks: list[str]) -> list[dict]:
         })
     return result

+# -*- coding: utf-8 -*-
 """
 pluto/ingest.py — File ingestion: convert uploaded files to corpus Markdown.
 def _extract_pdf(path: Path) -> str:
+    """Extract text and tables from PDF using pdfplumber."""
+    import logging
+    import pdfplumber
+    logger = logging.getLogger("pluto")
     pages = []
+    with pdfplumber.open(str(path)) as pdf:
+        for i, page in enumerate(pdf.pages):
+            page_parts = []
+            text = page.extract_text(x_tolerance=2, y_tolerance=2)
+            if text and text.strip():
+                page_parts.append(text.strip())
+            tables = page.extract_tables()
+            for table in tables:
+                if table:
+                    rows = [" | ".join(cell or "" for cell in row) for row in table]
+                    page_parts.append("\n".join(rows))
+            if page_parts:
+                pages.append(f"## Page {i + 1}\n\n" + "\n\n".join(page_parts))
+            else:
+                logger.warning("pdfplumber returned empty text for page %s in %s", i + 1, path.name)
     return "\n\n".join(pages)
         })
     return result
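
Reviewer note: the new `_extract_pdf` flattens each extracted table into pipe-separated rows. The sketch below isolates that step on sample data so the handling of empty cells is visible. It assumes the table shape that pdfplumber's `extract_tables()` returns (a list of rows, each a list of cell strings or `None`); `table_to_text` is a hypothetical helper name, since the diff inlines this logic.

```python
def table_to_text(table):
    """Join cells with ' | ' and rows with newlines, treating None as empty."""
    rows = [" | ".join(cell or "" for cell in row) for row in table]
    return "\n".join(rows)


if __name__ == "__main__":
    # Sample data shaped like one table from page.extract_tables().
    sample = [
        ["Metric", "Value"],
        ["Accuracy", None],
    ]
    print(table_to_text(sample))
```

Note that the output has no Markdown header separator row, so downstream stages see plain pipe-delimited text rather than a rendered Markdown table.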

mp1/pluto/models.py CHANGED

@@ -1,3 +1,4 @@
 """
 pluto/models.py — Pydantic schemas for all 4 pipeline stages + final output.
@@ -10,9 +11,7 @@ import hashlib
 from enum import Enum
 from typing import Optional
-from pydantic import BaseModel, Field, field_validator
-from pluto.utils import coerce_string, coerce_string_list, ensure_list
 # ── Enums ──────────────────────────────────────────────────────────────────────
@@ -65,11 +64,6 @@ class Evidence(BaseModel):
     where: str = ""
     quote: str = Field(default="", max_length=200)
-    @field_validator("doc_id", "chunk_id", "where", "quote", mode="before")
-    @classmethod
-    def _normalize_text_fields(cls, value):
-        return coerce_string(value, default="")
 # ── S0 ROUTE ───────────────────────────────────────────────────────────────────
@@ -77,11 +71,6 @@ class DocScope(BaseModel):
     doc_id: str
     reason: str
-    @field_validator("doc_id", "reason", mode="before")
-    @classmethod
-    def _normalize_doc_scope_fields(cls, value):
-        return coerce_string(value, default="")
 class ChunkPlan(BaseModel):
     doc_id: str
@@ -92,11 +81,6 @@ class ChunkPlan(BaseModel):
     priority: Priority = Priority.MEDIUM
     task: str = ""
-    @field_validator("doc_id", "chunk_id", "where", "task", mode="before")
-    @classmethod
-    def _normalize_chunk_plan_text_fields(cls, value):
-        return coerce_string(value, default="")
 class Budgets(BaseModel):
     max_chunks_to_read: int = 200
@@ -123,27 +107,12 @@ class Claim(BaseModel):
     dependencies: list[str] = Field(default_factory=list)
     evidence: Evidence | None = None
-    @field_validator("claim_id", "text", mode="before")
-    @classmethod
-    def _normalize_claim_text_fields(cls, value):
-        return coerce_string(value, default="")
-    @field_validator("numbers", "entities", "dependencies", mode="before")
-    @classmethod
-    def _normalize_claim_lists(cls, value):
-        return coerce_string_list(value)
 class MathItem(BaseModel):
     expression: str
     interpretation: str = ""
     evidence: Evidence | None = None
-    @field_validator("expression", "interpretation", mode="before")
-    @classmethod
-    def _normalize_math_fields(cls, value):
-        return coerce_string(value, default="")
 class TableItem(BaseModel):
     caption: str = ""
@@ -151,35 +120,12 @@ class TableItem(BaseModel):
     rows: list[list[str]] = Field(default_factory=list)
     evidence: Evidence | None = None
-    @field_validator("caption", mode="before")
-    @classmethod
-    def _normalize_table_caption(cls, value):
-        return coerce_string(value, default="")
-    @field_validator("headers", mode="before")
-    @classmethod
-    def _normalize_table_headers(cls, value):
-        return coerce_string_list(value)
-    @field_validator("rows", mode="before")
-    @classmethod
-    def _normalize_table_rows(cls, value):
-        rows = []
-        for row in ensure_list(value):
-            rows.append(coerce_string_list(row))
-        return [row for row in rows if row]
 class FigureItem(BaseModel):
     caption: str = ""
     description: str = ""
     evidence: Evidence | None = None
-    @field_validator("caption", "description", mode="before")
-    @classmethod
-    def _normalize_figure_fields(cls, value):
-        return coerce_string(value, default="")
 class CodeItem(BaseModel):
     language: str = ""
@@ -187,11 +133,6 @@ class CodeItem(BaseModel):
     description: str = ""
     evidence: Evidence | None = None
-    @field_validator("language", "snippet", "description", mode="before")
-    @classmethod
-    def _normalize_code_fields(cls, value):
-        return coerce_string(value, default="")
 class ExtractedContent(BaseModel):
     claims: list[Claim] = Field(default_factory=list)
@@ -202,11 +143,6 @@ class ExtractedContent(BaseModel):
     code: list[CodeItem] = Field(default_factory=list)
     chunk_summary: str = ""
-    @field_validator("chunk_summary", mode="before")
-    @classmethod
-    def _normalize_chunk_summary(cls, value):
-        return coerce_string(value, default="")
 class ExtractOutput(BaseModel):
     stage: str = "extract"
@@ -225,71 +161,41 @@ class SectionPoint(BaseModel):
     section: str
     points: list[str] = Field(default_factory=list)
-    @field_validator("section", mode="before")
-    @classmethod
-    def _normalize_section_name(cls, value):
-        return coerce_string(value, default="")
-    @field_validator("points", mode="before")
-    @classmethod
-    def _normalize_section_points(cls, value):
-        return coerce_string_list(value)
 class KeyClaim(BaseModel):
     claim: str
     support: ClaimStatus = ClaimStatus.SUPPORTED
     evidence_refs: list[Evidence] = Field(default_factory=list)
-    @field_validator("claim", mode="before")
-    @classmethod
-    def _normalize_key_claim(cls, value):
-        return coerce_string(value, default="")
 class Synthesis(BaseModel):
     answer_outline: list[SectionPoint] = Field(default_factory=list)
     key_claims: list[KeyClaim] = Field(default_factory=list)
     open_gaps: list[str] = Field(default_factory=list)
-    @field_validator("open_gaps", mode="before")
-    @classmethod
-    def _normalize_open_gap_list(cls, value):
-        return coerce_string_list(value)
 class MergeOutput(BaseModel):
     stage: str = "merge"
     synthesis: Synthesis = Field(default_factory=Synthesis)
-# ── S3 VERIFY ──────────────────────────────────────────────────────────────────
 class CheckedClaim(BaseModel):
     claim: str
     status: ClaimStatus
     evidence: list[Evidence] = Field(default_factory=list)
-    @field_validator("claim", mode="before")
-    @classmethod
-    def _normalize_checked_claim(cls, value):
-        return coerce_string(value, default="")
-class Verification(BaseModel):
     checked_claims: list[CheckedClaim] = Field(default_factory=list)
     unsupported_claims: list[str] = Field(default_factory=list)
     required_followups: list[str] = Field(default_factory=list)
-    @field_validator("unsupported_claims", "required_followups", mode="before")
-    @classmethod
-    def _normalize_verification_lists(cls, value):
-        return coerce_string_list(value)
-class VerifyOutput(BaseModel):
-    stage: str = "verify"
-    verification: Verification = Field(default_factory=Verification)
 # ── FINAL OUTPUT ───────────────────────────────────────────────────────────────
@@ -298,21 +204,11 @@ class Section(BaseModel):
     title: str
     content: str
-    @field_validator("title", "content", mode="before")
-    @classmethod
-    def _normalize_section_fields(cls, value):
-        return coerce_string(value, default="")
 class FinalAnswer(BaseModel):
     response: str
     sections: list[Section] = Field(default_factory=list)
-    @field_validator("response", mode="before")
-    @classmethod
-    def _normalize_response(cls, value):
-        return coerce_string(value, default="")
 class FinalEvidence(BaseModel):
     doc_id: str
@@ -321,11 +217,6 @@ class FinalEvidence(BaseModel):
     supports: str = ""
     quote: str = Field(default="", max_length=200)
-    @field_validator("doc_id", "chunk_id", "where", "supports", "quote", mode="before")
-    @classmethod
-    def _normalize_final_evidence_fields(cls, value):
-        return coerce_string(value, default="")
 class TraceSummary(BaseModel):
     real_switching: bool = False
@@ -336,16 +227,6 @@ class TraceSummary(BaseModel):
     search_queries: list[str] = Field(default_factory=list)
     budget_notes: str = ""
-    @field_validator("models_used", "docs_opened", "search_queries", mode="before")
-    @classmethod
-    def _normalize_trace_lists(cls, value):
-        return coerce_string_list(value)
-    @field_validator("budget_notes", mode="before")
-    @classmethod
-    def _normalize_budget_notes(cls, value):
-        return coerce_string(value, default="")
 class FinalOutput(BaseModel):
     final_answer: FinalAnswer = Field(default_factory=FinalAnswer)
@@ -356,11 +237,6 @@ class FinalOutput(BaseModel):
     next_actions: list[str] = Field(default_factory=list)
     bus_messages: list[dict] = Field(default_factory=list)
-    @field_validator("missing_info", "next_actions", mode="before")
-    @classmethod
-    def _normalize_final_output_lists(cls, value):
-        return coerce_string_list(value)
 # ── Helpers ────────────────────────────────────────────────────────────────────

+# -*- coding: utf-8 -*-
 """
 pluto/models.py — Pydantic schemas for all 4 pipeline stages + final output.
 from enum import Enum
 from typing import Optional
+from pydantic import BaseModel, Field
 # ── Enums ──────────────────────────────────────────────────────────────────────
     where: str = ""
     quote: str = Field(default="", max_length=200)
 # ── S0 ROUTE ───────────────────────────────────────────────────────────────────
     doc_id: str
     reason: str
 class ChunkPlan(BaseModel):
     doc_id: str
     priority: Priority = Priority.MEDIUM
     task: str = ""
 class Budgets(BaseModel):
     max_chunks_to_read: int = 200
     dependencies: list[str] = Field(default_factory=list)
     evidence: Evidence | None = None
 class MathItem(BaseModel):
     expression: str
     interpretation: str = ""
     evidence: Evidence | None = None
 class TableItem(BaseModel):
     caption: str = ""
     rows: list[list[str]] = Field(default_factory=list)
     evidence: Evidence | None = None
 class FigureItem(BaseModel):
     caption: str = ""
     description: str = ""
     evidence: Evidence | None = None
 class CodeItem(BaseModel):
     language: str = ""
     description: str = ""
     evidence: Evidence | None = None
 class ExtractedContent(BaseModel):
     claims: list[Claim] = Field(default_factory=list)
     code: list[CodeItem] = Field(default_factory=list)
     chunk_summary: str = ""
 class ExtractOutput(BaseModel):
     stage: str = "extract"
     section: str
     points: list[str] = Field(default_factory=list)
 class KeyClaim(BaseModel):
     claim: str
     support: ClaimStatus = ClaimStatus.SUPPORTED
     evidence_refs: list[Evidence] = Field(default_factory=list)
 class Synthesis(BaseModel):
     answer_outline: list[SectionPoint] = Field(default_factory=list)
     key_claims: list[KeyClaim] = Field(default_factory=list)
     open_gaps: list[str] = Field(default_factory=list)
 class MergeOutput(BaseModel):
     stage: str = "merge"
     synthesis: Synthesis = Field(default_factory=Synthesis)
+# ── S3 EvidenceCheck ──────────────────────────────────────────────────────────────────
 class CheckedClaim(BaseModel):
     claim: str
     status: ClaimStatus
     evidence: list[Evidence] = Field(default_factory=list)
+class EvidenceCheck(BaseModel):
     checked_claims: list[CheckedClaim] = Field(default_factory=list)
     unsupported_claims: list[str] = Field(default_factory=list)
     required_followups: list[str] = Field(default_factory=list)
+class EvidenceCheckOutput(BaseModel):
+    stage: str = "evidence_check"
+    evidence_check: EvidenceCheck = Field(default_factory=EvidenceCheck)
 # ── FINAL OUTPUT ───────────────────────────────────────────────────────────────
     title: str
     content: str
 class FinalAnswer(BaseModel):
     response: str
     sections: list[Section] = Field(default_factory=list)
 class FinalEvidence(BaseModel):
     doc_id: str
     supports: str = ""
     quote: str = Field(default="", max_length=200)
 class TraceSummary(BaseModel):
     real_switching: bool = False
     search_queries: list[str] = Field(default_factory=list)
     budget_notes: str = ""
 class FinalOutput(BaseModel):
     final_answer: FinalAnswer = Field(default_factory=FinalAnswer)
     next_actions: list[str] = Field(default_factory=list)
     bus_messages: list[dict] = Field(default_factory=list)
 # ── Helpers ────────────────────────────────────────────────────────────────────

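The rename in this file changes the attribute path that downstream code reads: `verify_out.verification.*` becomes `evidence_check_out.evidence_check.*`. The following is a minimal sketch of the new S3 payload shape. It assumes plain dataclasses in place of the Pydantic `BaseModel` classes so that it runs without third-party packages; the enum string values and the `summarize` helper are hypothetical and not part of the commit.

```python
from dataclasses import dataclass, field
from enum import Enum


class ClaimStatus(str, Enum):
    # The member names SUPPORTED and UNCERTAIN appear in the diff;
    # the string values here are assumed for illustration.
    SUPPORTED = "supported"
    UNCERTAIN = "uncertain"


@dataclass
class CheckedClaim:
    claim: str
    status: ClaimStatus


@dataclass
class EvidenceCheck:
    checked_claims: list[CheckedClaim] = field(default_factory=list)
    unsupported_claims: list[str] = field(default_factory=list)
    required_followups: list[str] = field(default_factory=list)


@dataclass
class EvidenceCheckOutput:
    stage: str = "evidence_check"
    evidence_check: EvidenceCheck = field(default_factory=EvidenceCheck)


def summarize(out: EvidenceCheckOutput) -> dict[str, int]:
    """Hypothetical helper: the three counts the pipeline reports after S3."""
    check = out.evidence_check
    return {
        "checked": len(check.checked_claims),
        "unsupported": len(check.unsupported_claims),
        "gaps": len(check.required_followups),
    }
```

Any caller that still reads `.verification` would need to switch to `.evidence_check`, which is what the `pipeline.py` changes in this commit do.
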
mp1/pluto/modes.py CHANGED

@@ -1,3 +1,4 @@
 """
 pluto/modes.py — Real mode switching engine.
@@ -156,15 +157,78 @@ def _build_registry() -> dict[str, ModeConfig]:
                 provider="groq",
             ),
         }
-    else:
-        raise EnvironmentError("Neither NVIDIA_API_KEY nor GROQ_API_KEY is set.")
 MODE_REGISTRY: dict[str, ModeConfig] = _build_registry()
 def is_real_switching() -> bool:
     """True if MODE_QUICK and MODE_REASONING use DIFFERENT model_ids."""
     quick = MODE_REGISTRY["MODE_QUICK"].model_id
     reasoning = MODE_REGISTRY["MODE_REASONING"].model_id
     return quick != reasoning
@@ -174,4 +238,12 @@ def get_mode(mode_name: str) -> ModeConfig:
     """Look up a mode config by name."""
     if mode_name not in MODE_REGISTRY:
         raise ValueError(f"Unknown mode: {mode_name}. Valid: {list(MODE_REGISTRY)}")
-    return MODE_REGISTRY[mode_name]

+# -*- coding: utf-8 -*-
 """
 pluto/modes.py — Real mode switching engine.
                 provider="groq",
             ),
         }
+    return _build_unconfigured_registry()
+def _build_unconfigured_registry() -> dict[str, ModeConfig]:
+    """Return placeholder modes so imports work without provider credentials."""
+    return {
+        "MODE_QUICK": ModeConfig(
+            mode_name="MODE_QUICK",
+            model_id="unconfigured/MODE_QUICK",
+            temperature=0.1,
+            max_tokens=1024,
+            compute_profile="unconfigured",
+            provider="unconfigured",
+        ),
+        "MODE_REASONING": ModeConfig(
+            mode_name="MODE_REASONING",
+            model_id="unconfigured/MODE_REASONING",
+            temperature=0.3,
+            max_tokens=4096,
+            compute_profile="unconfigured",
+            provider="unconfigured",
+        ),
+        "MODE_VISION": ModeConfig(
+            mode_name="MODE_VISION",
+            model_id="unconfigured/MODE_VISION",
+            temperature=0.1,
+            max_tokens=4096,
+            compute_profile="unconfigured",
+            provider="unconfigured",
+        ),
+        "MODE_ULTRA": ModeConfig(
+            mode_name="MODE_ULTRA",
+            model_id="unconfigured/MODE_ULTRA",
+            temperature=0.2,
+            max_tokens=4096,
+            compute_profile="unconfigured",
+            provider="unconfigured",
+        ),
+        "MODE_GEMINI": ModeConfig(
+            mode_name="MODE_GEMINI",
+            model_id="unconfigured/MODE_GEMINI",
+            temperature=0.0,
+            max_tokens=4096,
+            compute_profile="unconfigured",
+            provider="unconfigured",
+        ),
+    }
 MODE_REGISTRY: dict[str, ModeConfig] = _build_registry()
+def _missing_provider_error() -> EnvironmentError:
+    return EnvironmentError("Neither NVIDIA_API_KEY nor GROQ_API_KEY is set.")
+def _is_unconfigured() -> bool:
+    return any(mode.provider == "unconfigured" for mode in MODE_REGISTRY.values())
+def _refresh_mode_registry() -> None:
+    """Refresh mode config in place so imported MODE_REGISTRY references stay valid."""
+    MODE_REGISTRY.clear()
+    MODE_REGISTRY.update(_build_registry())
 def is_real_switching() -> bool:
     """True if MODE_QUICK and MODE_REASONING use DIFFERENT model_ids."""
+    if _is_unconfigured():
+        _refresh_mode_registry()
+    if _is_unconfigured():
+        return False
     quick = MODE_REGISTRY["MODE_QUICK"].model_id
     reasoning = MODE_REGISTRY["MODE_REASONING"].model_id
     return quick != reasoning
     """Look up a mode config by name."""
     if mode_name not in MODE_REGISTRY:
         raise ValueError(f"Unknown mode: {mode_name}. Valid: {list(MODE_REGISTRY)}")
+    mode = MODE_REGISTRY[mode_name]
+    if mode.provider == "unconfigured":
+        _refresh_mode_registry()
+        mode = MODE_REGISTRY.get(mode_name)
+    if mode is None:
+        raise ValueError(f"Unknown mode: {mode_name}. Valid: {list(MODE_REGISTRY)}")
+    if mode.provider == "unconfigured":
+        raise _missing_provider_error()
+    return mode

mp1/pluto/pipeline.py CHANGED

@@ -1,3 +1,4 @@
 """
 pluto/pipeline.py - Orchestrator for document understanding and query answering.
@@ -23,7 +24,7 @@ from pluto.modes import is_real_switching
 from pluto.stages.extract import run_extract
 from pluto.stages.merge import run_merge
 from pluto.stages.route import run_route
-from pluto.stages.verify import run_verify
 from pluto.tools import CorpusTools
 from pluto.tracer import Tracer
@@ -31,9 +32,16 @@ from pluto.tracer import Tracer
 class PipelineRunner:
     """Two-phase pipeline: understand documents, then answer queries."""
-    def __init__(self, corpus_dir: str, output_dir: str = "./output", doc_index=None) -> None:
         self.tracer = Tracer()
         self.doc_index = doc_index
         self.tools = CorpusTools(corpus_dir, output_dir, self.tracer, doc_index=doc_index)
         self.cache = ExtractionCache(corpus_dir)
         self._progress_callback: Any = None
@@ -72,9 +80,11 @@ class PipelineRunner:
         self._ensure_docs_understood(selected_doc_ids=selected_doc_ids)
         self._emit("route", {"status": "running", "query": query})
         route_out = run_route(
-            query,
             self.tools,
             self.tracer,
             bus=self.bus,
@@ -137,22 +147,22 @@ class PipelineRunner:
             },
         )
-        self._emit("verify", {"status": "running"})
-        verify_out = run_verify(merge_out, extractions, self.tracer, bus=self.bus)
         self._emit(
-            "verify",
             {
                 "status": "complete",
-                "checked": len(verify_out.verification.checked_claims),
-                "unsupported": len(verify_out.verification.unsupported_claims),
-                "gaps": len(verify_out.verification.required_followups),
             },
         )
         final = self._build_final(
             query,
             merge_out,
-            verify_out,
             extractions,
             overview=overview,
             bus=self.bus,
@@ -187,7 +197,7 @@ class PipelineRunner:
         self,
         query,
         merge_out,
-        verify_out,
         extractions,
         overview="",
         bus: MessageBus | None = None,
@@ -203,12 +213,12 @@ class PipelineRunner:
             sections.append(Section(title=section_point.section, content=content))
         section_parts = [f"**{section.title}**\n{section.content}" for section in sections if section.content]
-        verified_claims = [
             checked
-            for checked in verify_out.verification.checked_claims
             if checked.status == ClaimStatus.SUPPORTED
         ]
-        claim_parts = [checked.claim for checked in verified_claims]
         if section_parts:
             response = "\n\n".join(section_parts)
@@ -232,15 +242,15 @@ class PipelineRunner:
                     )
                 )
-        total = len(verify_out.verification.checked_claims)
         supported = sum(
             1
-            for checked in verify_out.verification.checked_claims
             if checked.status == ClaimStatus.SUPPORTED
         )
         uncertain = sum(
             1
-            for checked in verify_out.verification.checked_claims
             if checked.status == ClaimStatus.UNCERTAIN
         )
@@ -269,8 +279,8 @@ class PipelineRunner:
             evidence=evidence,
             trace_summary=trace,
             confidence=confidence,
-            missing_info=merge_out.synthesis.open_gaps + verify_out.verification.required_followups,
-            next_actions=verify_out.verification.required_followups,
             bus_messages=bus_messages,
         )
@@ -289,3 +299,24 @@ def _normalize_selected_doc_ids(selected_doc_ids: list[str] | None) -> list[str]
 def _normalize_detail_level(detail_level: str | None) -> str:
     return "detailed" if str(detail_level or "").strip().lower() == "detailed" else "standard"

+# -*- coding: utf-8 -*-
 """
 pluto/pipeline.py - Orchestrator for document understanding and query answering.
 from pluto.stages.extract import run_extract
 from pluto.stages.merge import run_merge
 from pluto.stages.route import run_route
+from pluto.stages.evidence_check import run_evidence_check
 from pluto.tools import CorpusTools
 from pluto.tracer import Tracer
 class PipelineRunner:
     """Two-phase pipeline: understand documents, then answer queries."""
+    def __init__(
+        self,
+        corpus_dir: str,
+        output_dir: str = "./output",
+        doc_index=None,
+        prior_session_context: list[dict] | None = None,
+    ) -> None:
         self.tracer = Tracer()
         self.doc_index = doc_index
+        self.prior_session_context = prior_session_context or []
         self.tools = CorpusTools(corpus_dir, output_dir, self.tracer, doc_index=doc_index)
         self.cache = ExtractionCache(corpus_dir)
         self._progress_callback: Any = None
         self._ensure_docs_understood(selected_doc_ids=selected_doc_ids)
+        route_query = _prepend_prior_session_context(query, self.prior_session_context)
         self._emit("route", {"status": "running", "query": query})
         route_out = run_route(
+            route_query,
             self.tools,
             self.tracer,
             bus=self.bus,
             },
         )
+        self._emit("evidence_check", {"status": "running"})
+        evidence_check_out = run_evidence_check(merge_out, extractions, self.tracer, bus=self.bus)
         self._emit(
+            "evidence_check",
             {
                 "status": "complete",
+                "checked": len(evidence_check_out.evidence_check.checked_claims),
+                "unsupported": len(evidence_check_out.evidence_check.unsupported_claims),
+                "gaps": len(evidence_check_out.evidence_check.required_followups),
             },
         )
         final = self._build_final(
             query,
             merge_out,
+            evidence_check_out,
             extractions,
             overview=overview,
             bus=self.bus,
         self,
         query,
         merge_out,
+        evidence_check_out,
         extractions,
         overview="",
         bus: MessageBus | None = None,
             sections.append(Section(title=section_point.section, content=content))
         section_parts = [f"**{section.title}**\n{section.content}" for section in sections if section.content]
+        supported_checked_claims = [
             checked
+            for checked in evidence_check_out.evidence_check.checked_claims
             if checked.status == ClaimStatus.SUPPORTED
         ]
+        claim_parts = [checked.claim for checked in supported_checked_claims]
         if section_parts:
             response = "\n\n".join(section_parts)
                     )
                 )
+        total = len(evidence_check_out.evidence_check.checked_claims)
         supported = sum(
             1
+            for checked in evidence_check_out.evidence_check.checked_claims
             if checked.status == ClaimStatus.SUPPORTED
         )
         uncertain = sum(
             1
+            for checked in evidence_check_out.evidence_check.checked_claims
             if checked.status == ClaimStatus.UNCERTAIN
         )
             evidence=evidence,
             trace_summary=trace,
             confidence=confidence,
+            missing_info=merge_out.synthesis.open_gaps + evidence_check_out.evidence_check.required_followups,
+            next_actions=evidence_check_out.evidence_check.required_followups,
             bus_messages=bus_messages,
         )
 def _normalize_detail_level(detail_level: str | None) -> str:
     return "detailed" if str(detail_level or "").strip().lower() == "detailed" else "standard"
+def _prepend_prior_session_context(query: str, prior_session_context: list[dict]) -> str:
+    key_findings: list[str] = []
+    open_questions: list[str] = []
+    for session in prior_session_context or []:
+        key_findings.extend(str(item) for item in session.get("key_findings", []) if str(item).strip())
+        open_questions.extend(str(item) for item in session.get("open_questions", []) if str(item).strip())
+    if not key_findings and not open_questions:
+        return query
+    findings_block = "\n".join(f"- {finding}" for finding in key_findings[:10])
+    questions_block = "\n".join(f"- {question}" for question in open_questions[:10])
+    return (
+        "[Prior session findings for this document:\n"
+        f"{findings_block}\n"
+        "Open questions from prior sessions:\n"
+        f"{questions_block}]\n\n"
+        f"{query}"
+    )

mp1/pluto/server.py CHANGED

@@ -1,3 +1,4 @@
 """
 pluto/server.py — FastAPI server bridging pipeline <-> web UI.
@@ -17,6 +18,7 @@ import json
 import os
 import shutil
 import tempfile
 from pathlib import Path
 from typing import Any
@@ -33,8 +35,10 @@ app = FastAPI(title="Pluto Pipeline", version="1.0.0")
 # ── State ─────────────────────────────────────────────────────────────────────
-_progress_queue: asyncio.Queue = asyncio.Queue()  # Always exists — reset per run
-_latest_result: dict | None = None
 FRONTEND_DIR = Path(__file__).parent.parent / "frontend"
 CORPUS_DIR = Path(__file__).parent.parent / "corpus"
@@ -85,6 +89,69 @@ def _json_safe(value: Any) -> Any:
     return jsonable_encoder(value)
 # ── Startup: re-index existing corpus files ─────────────────────────────────
 @app.on_event("startup")
@@ -142,16 +209,34 @@ async def index():
 @app.post("/api/run")
 async def run_pipeline(request: Request):
     """Run the full pipeline for a user query."""
-    global _latest_result, _progress_queue
     body = await request.json()
     query = body.get("query", "")
     corpus_dir = body.get("corpus_dir", str(CORPUS_DIR))
     selected_doc_ids = _normalize_selected_doc_ids(body.get("selected_doc_ids"))
     detail_level = _normalize_detail_level(body.get("detail_level"))
     if not query:
-        return JSONResponse({"error": "No query provided"}, status_code=400)
     processing_docs = _processing_docs_for_scope(_doc_index, selected_doc_ids)
     if processing_docs:
@@ -159,26 +244,28 @@ async def run_pipeline(request: Request):
             {
                 "error": "Please wait for document understanding to finish before running a query.",
                 "processing_docs": processing_docs,
             },
             status_code=409,
             headers={"Cache-Control": "no-store"},
         )
     # Reset queue for this run (drain any leftover events without replacing the object)
-    while not _progress_queue.empty():
         try:
-            _progress_queue.get_nowait()
         except asyncio.QueueEmpty:
             break
     def progress_callback(stage: str, data: dict):
-        _progress_queue.put_nowait(_json_safe({"stage": stage, **data}))
     # Run pipeline in a thread to avoid blocking
     loop = asyncio.get_event_loop()
     runner = PipelineRunner(
         corpus_dir=corpus_dir, output_dir=str(OUTPUT_DIR),
         doc_index=_doc_index,
     )
     runner.on_progress(progress_callback)
@@ -192,17 +279,20 @@ async def run_pipeline(request: Request):
                 detail_level=detail_level,
             ),
         )
-        _latest_result = result.model_dump()
         # Include cache stats in the response
         cache_stats = runner.cache.stats()
-        _latest_result["cache_hits"] = cache_stats["hits"]
-        _latest_result["cache_misses"] = cache_stats["misses"]
         # Signal completion
-        await _progress_queue.put({"stage": "done", "status": "complete"})
-        return JSONResponse(_latest_result)
     except Exception as e:
         import traceback
@@ -211,31 +301,39 @@ async def run_pipeline(request: Request):
         # Always signal error to SSE stream
         try:
-            await _progress_queue.put({"stage": "error", "status": "failed", "detail": err_msg})
         except Exception:
             pass
         # ALWAYS return valid JSON — never let FastAPI return HTML 500
         return JSONResponse(
-            {"error": f"Pipeline error: {err_msg}"},
             status_code=200  # Return 200 so browser can parse the JSON body
         )
 @app.get("/api/stream")
-async def stream_progress():
     """SSE stream of pipeline progress events."""
     async def event_generator():
         # Wait for events from the pipeline — keep connection open
-        while True:
-            try:
-                event = await asyncio.wait_for(_progress_queue.get(), timeout=120.0)
-                yield f"data: {json.dumps(_json_safe(event))}\n\n"
-                if event.get("stage") in ("done", "error"):
-                    break
-            except asyncio.TimeoutError:
-                yield f"data: {json.dumps({'stage': 'heartbeat'})}\n\n"
     return StreamingResponse(
         event_generator(),
@@ -249,11 +347,21 @@ async def stream_progress():
 @app.get("/api/result")
-async def get_result():
-    """Return the latest pipeline result."""
-    if _latest_result:
-        return JSONResponse(_latest_result)
-    return JSONResponse({"error": "No result yet"}, status_code=404)
 @app.post("/api/compare")
@@ -296,6 +404,30 @@ async def benchmark_compare(request: Request):
         )
 # ── File upload ───────────────────────────────────────────────────────────────
 ALLOWED_EXTENSIONS = {".pdf", ".docx", ".doc", ".txt", ".md", ".markdown"}
@@ -337,6 +469,8 @@ async def upload_files(files: list[UploadFile] = File(...)):
                         tracer = Tracer()
                         print(f"  [SERVER] Starting background Phase A for {did}...")
                         run_understand(did, _doc_index, tracer)
                         print(f"  [SERVER] Background Phase A COMPLETE for {did}")
                     except BaseException as e:
                         import traceback

+# -*- coding: utf-8 -*-
 """
 pluto/server.py — FastAPI server bridging pipeline <-> web UI.
 import os
 import shutil
 import tempfile
+from uuid import uuid4
 from pathlib import Path
 from typing import Any
 # ── State ─────────────────────────────────────────────────────────────────────
+session_queues: dict[str, asyncio.Queue] = {}
+session_results: dict[str, dict] = {}
+session_cleanup_tasks: dict[str, asyncio.Task] = {}
+SESSION_CLEANUP_DELAY_SECONDS = 300
 FRONTEND_DIR = Path(__file__).parent.parent / "frontend"
 CORPUS_DIR = Path(__file__).parent.parent / "corpus"
     return jsonable_encoder(value)
+def _normalize_session_id(raw_value: Any) -> str:
+    session_id = str(raw_value or "").strip()
+    return session_id or str(uuid4())
+def _get_session_queue(session_id: str) -> asyncio.Queue:
+    cleanup_task = session_cleanup_tasks.pop(session_id, None)
+    if cleanup_task:
+        cleanup_task.cancel()
+    queue = session_queues.get(session_id)
+    if queue is None:
+        queue = asyncio.Queue()
+        session_queues[session_id] = queue
+    return queue
+def _schedule_session_cleanup(session_id: str, queue: asyncio.Queue) -> None:
+    cleanup_task = session_cleanup_tasks.pop(session_id, None)
+    if cleanup_task:
+        cleanup_task.cancel()
+    async def cleanup_later() -> None:
+        try:
+            await asyncio.sleep(SESSION_CLEANUP_DELAY_SECONDS)
+            if session_queues.get(session_id) is queue:
+                session_queues.pop(session_id, None)
+                session_results.pop(session_id, None)
+        except asyncio.CancelledError:
+            pass
+        finally:
+            if session_cleanup_tasks.get(session_id) is task:
+                session_cleanup_tasks.pop(session_id, None)
+    task = asyncio.create_task(cleanup_later())
+    session_cleanup_tasks[session_id] = task
+def _session_doc_id(selected_doc_ids: list[str], result_data: dict | None = None) -> str:
+    if selected_doc_ids:
+        return selected_doc_ids[0]
+    trace = (result_data or {}).get("trace_summary", {})
+    docs_opened = trace.get("docs_opened", []) if isinstance(trace, dict) else []
+    if docs_opened:
+        return str(docs_opened[0])
+    return "corpus"
+def _schedule_session_compression(session_id: str) -> None:
+    result_data = session_results.get(session_id)
+    if not result_data:
+        return
+    doc_id = str(result_data.get("doc_id") or "corpus")
+    async def compress_later() -> None:
+        from pluto.session_memory import compress_session
+        await asyncio.to_thread(compress_session, session_id, doc_id, result_data, CORPUS_DIR)
+    asyncio.create_task(compress_later())
 # ── Startup: re-index existing corpus files ─────────────────────────────────
 @app.on_event("startup")
 @app.post("/api/run")
 async def run_pipeline(request: Request):
     """Run the full pipeline for a user query."""
     body = await request.json()
     query = body.get("query", "")
     corpus_dir = body.get("corpus_dir", str(CORPUS_DIR))
     selected_doc_ids = _normalize_selected_doc_ids(body.get("selected_doc_ids"))
     detail_level = _normalize_detail_level(body.get("detail_level"))
+    session_id = _normalize_session_id(body.get("session_id"))
+    query_timestamp = body.get("query_timestamp")
+    prev_query = body.get("prev_query", "")
+    prev_query_timestamp = body.get("prev_query_timestamp")
+    prev_session_id = str(body.get("prev_session_id") or "").strip()
+    progress_queue = _get_session_queue(session_id)
+    doc_id = _session_doc_id(selected_doc_ids)
+    prior_session_context = []
+    if selected_doc_ids:
+        from pluto.session_memory import list_session_context
+        prior_session_context = list_session_context(doc_id, CORPUS_DIR)
     if not query:
+        return JSONResponse({"error": "No query provided", "session_id": session_id}, status_code=400)
+    _capture_behavioral_signals(
+        query=query,
+        query_timestamp=query_timestamp,
+        prev_query=prev_query,
+        prev_query_timestamp=prev_query_timestamp,
+        prev_session_id=prev_session_id,
+        fallback_session_id=session_id,
+    )
     processing_docs = _processing_docs_for_scope(_doc_index, selected_doc_ids)
     if processing_docs:
             {
                 "error": "Please wait for document understanding to finish before running a query.",
                 "processing_docs": processing_docs,
+                "session_id": session_id,
             },
             status_code=409,
             headers={"Cache-Control": "no-store"},
         )
     # Reset queue for this run (drain any leftover events without replacing the object)
+    while not progress_queue.empty():
         try:
+            progress_queue.get_nowait()
         except asyncio.QueueEmpty:
             break
     def progress_callback(stage: str, data: dict):
+        progress_queue.put_nowait(_json_safe({"stage": stage, **data}))
     # Run pipeline in a thread to avoid blocking
     loop = asyncio.get_event_loop()
     runner = PipelineRunner(
         corpus_dir=corpus_dir, output_dir=str(OUTPUT_DIR),
         doc_index=_doc_index,
+        prior_session_context=prior_session_context,
     )
     runner.on_progress(progress_callback)
                 detail_level=detail_level,
             ),
         )
+        session_results[session_id] = result.model_dump()
         # Include cache stats in the response
         cache_stats = runner.cache.stats()
+        session_results[session_id]["cache_hits"] = cache_stats["hits"]
+        session_results[session_id]["cache_misses"] = cache_stats["misses"]
+        session_results[session_id]["session_id"] = session_id
+        session_results[session_id]["query"] = query
+        session_results[session_id]["doc_id"] = _session_doc_id(selected_doc_ids, session_results[session_id])
         # Signal completion
+        await progress_queue.put({"stage": "done", "status": "complete", "session_id": session_id})
+        return JSONResponse(session_results[session_id])
     except Exception as e:
         import traceback
         # Always signal error to SSE stream
         try:
+            await progress_queue.put(
+                {"stage": "error", "status": "failed", "detail": err_msg, "session_id": session_id}
+            )
         except Exception:
             pass
         # ALWAYS return valid JSON — never let FastAPI return HTML 500
         return JSONResponse(
+            {"error": f"Pipeline error: {err_msg}", "session_id": session_id},
             status_code=200  # Return 200 so browser can parse the JSON body
         )
 @app.get("/api/stream")
+async def stream_progress(session_id: str):
     """SSE stream of pipeline progress events."""
+    progress_queue = _get_session_queue(session_id)
     async def event_generator():
         # Wait for events from the pipeline — keep connection open
+        try:
+            while True:
+                try:
+                    event = await asyncio.wait_for(progress_queue.get(), timeout=120.0)
+                    yield f"data: {json.dumps(_json_safe(event))}\n\n"
+                    if event.get("stage") in ("done", "error"):
+                        if event.get("stage") == "done":
+                            _schedule_session_compression(session_id)
+                        break
+                except asyncio.TimeoutError:
+                    yield f"data: {json.dumps({'stage': 'heartbeat', 'session_id': session_id})}\n\n"
+        finally:
+            _schedule_session_cleanup(session_id, progress_queue)
     return StreamingResponse(
         event_generator(),
 @app.get("/api/result")
+async def get_result(session_id: str):
+    """Return the latest pipeline result for a session."""
+    result = session_results.get(session_id)
+    if result:
+        return JSONResponse(result)
+    return JSONResponse({"error": "No result yet", "session_id": session_id}, status_code=404)
+@app.get("/api/session-context/{doc_id}")
+async def get_session_context(doc_id: str):
+    """Return recent compressed session context for a document."""
+    from pluto.session_memory import list_session_context
+    sessions = list_session_context(doc_id, CORPUS_DIR, limit=10)
+    return JSONResponse({"doc_id": doc_id, "sessions": sessions}, headers={"Cache-Control": "no-store"})
 @app.post("/api/compare")
         )
+def _capture_behavioral_signals(
+    query: str,
+    query_timestamp: Any,
+    prev_query: str,
+    prev_query_timestamp: Any,
+    prev_session_id: str,
+    fallback_session_id: str,
+) -> None:
+    from pluto.signal_logger import check_prior_reference, check_rephrase, log_signal, query_hash
+    referenced_session_id = prev_session_id or fallback_session_id
+    if prev_query and prev_query_timestamp is not None and query_timestamp is not None:
+        try:
+            delta_seconds = (float(query_timestamp) - float(prev_query_timestamp)) / 1000.0
+        except (TypeError, ValueError):
+            delta_seconds = -1
+        if check_rephrase(query, prev_query, delta_seconds):
+            log_signal(referenced_session_id, query_hash(prev_query), "rephrase_fail")
+    if check_prior_reference(query):
+        log_signal(referenced_session_id, query_hash(query), "prior_reference")
 # ── File upload ───────────────────────────────────────────────────────────────
 ALLOWED_EXTENSIONS = {".pdf", ".docx", ".doc", ".txt", ".md", ".markdown"}
                         tracer = Tracer()
                         print(f"  [SERVER] Starting background Phase A for {did}...")
                         run_understand(did, _doc_index, tracer)
+                        from pluto.doc_summary import generate_doc_summary
+                        generate_doc_summary(did, CORPUS_DIR)
                         print(f"  [SERVER] Background Phase A COMPLETE for {did}")
                     except BaseException as e:
                         import traceback

mp1/pluto/session_memory.py ADDED

	@@ -0,0 +1,230 @@

+# -*- coding: utf-8 -*-
+"""
+Session memory compression and retrieval.
+PostgreSQL is initialized lazily. If it is not configured or unavailable, writes
+fall back to local JSON files under the corpus directory.
+"""
+from __future__ import annotations
+from datetime import datetime, timezone
+import json
+import logging
+from pathlib import Path
+from typing import Any
+from pydantic import BaseModel, Field
+from pluto.db import _get_connection
+from pluto.utils import extract_json_from_response
+logger = logging.getLogger("pluto")
+LOCAL_MEMORY_DIR = ".session_memory"
+RAW_ARCHIVE_DIR = ".session_archive"
+class CompressedSession(BaseModel):
+    session_id: str
+    doc_id: str
+    timestamp: str
+    queries_resolved: list[dict] = Field(default_factory=list)
+    key_findings: list[str] = Field(default_factory=list)
+    open_questions: list[str] = Field(default_factory=list)
+    links_to_prior_sessions: list[str] = Field(default_factory=list)
+def compress_session(
+    session_id: str,
+    doc_id: str,
+    session_result: dict,
+    corpus_dir: str | Path,
+) -> CompressedSession:
+    """Compress and store a session result without raising on storage failure."""
+    corpus_path = Path(corpus_dir)
+    raw_path = _write_raw_session(corpus_path, session_id, session_result)
+    try:
+        raw = _call_compression_llm(session_id=session_id, doc_id=doc_id, session_result=session_result)
+        compressed = _parse_compressed_session(session_id, doc_id, raw)
+    except Exception as exc:
+        logger.warning("Session compression LLM failed for %s: %s", session_id, exc)
+        compressed = _fallback_compressed_session(session_id, doc_id, session_result)
+    try:
+        _store_postgres(compressed, raw_path)
+    except Exception as exc:
+        logger.warning("PostgreSQL session memory unavailable; writing local fallback: %s", exc)
+        _store_local(corpus_path, compressed)
+    return compressed
+def list_session_context(
+    doc_id: str,
+    corpus_dir: str | Path,
+    limit: int = 10,
+) -> list[dict]:
+    """Return compressed sessions for one document, newest first."""
+    try:
+        return _list_postgres(doc_id, limit)
+    except Exception as exc:
+        logger.warning("PostgreSQL session memory unavailable; reading local fallback: %s", exc)
+        return _list_local(Path(corpus_dir), doc_id, limit)
+def _call_compression_llm(session_id: str, doc_id: str, session_result: dict) -> str:
+    from pluto.dispatcher import dispatch
+    from pluto.modes import get_mode
+    get_mode("MODE_QUICK")
+    prompt = f"""Compress this QA session as JSON only.
+Schema:
+{{
+  "queries_resolved": [
+    {{"query": "...", "answer_summary": "...", "chunks_used": 0, "confidence": 0.0}}
+  ],
+  "key_findings": ["finding"],
+  "open_questions": ["question"],
+  "links_to_prior_sessions": []
+}}
+Session id: {session_id}
+Document id: {doc_id}
+Session result:
+{json.dumps(session_result, ensure_ascii=False)[:14000]}
+"""
+    return dispatch("MODE_QUICK", prompt)
+def _parse_compressed_session(session_id: str, doc_id: str, raw: str) -> CompressedSession:
+    data = json.loads(extract_json_from_response(raw))
+    return CompressedSession(
+        session_id=session_id,
+        doc_id=doc_id,
+        timestamp=_utc_now(),
+        queries_resolved=data.get("queries_resolved", []) if isinstance(data.get("queries_resolved"), list) else [],
+        key_findings=_string_list(data.get("key_findings")),
+        open_questions=_string_list(data.get("open_questions")),
+        links_to_prior_sessions=_string_list(data.get("links_to_prior_sessions")),
+    )
+def _fallback_compressed_session(session_id: str, doc_id: str, session_result: dict) -> CompressedSession:
+    final_answer = session_result.get("final_answer", {}) if isinstance(session_result, dict) else {}
+    trace = session_result.get("trace_summary", {}) if isinstance(session_result, dict) else {}
+    query = session_result.get("query", "") if isinstance(session_result, dict) else ""
+    answer = final_answer.get("response", "") if isinstance(final_answer, dict) else ""
+    return CompressedSession(
+        session_id=session_id,
+        doc_id=doc_id,
+        timestamp=_utc_now(),
+        queries_resolved=[
+            {
+                "query": query,
+                "answer_summary": str(answer)[:500],
+                "chunks_used": trace.get("chunks_processed", 0) if isinstance(trace, dict) else 0,
+                "confidence": session_result.get("confidence", 0.0) if isinstance(session_result, dict) else 0.0,
+            }
+        ],
+        key_findings=[],
+        open_questions=session_result.get("missing_info", []) if isinstance(session_result, dict) else [],
+        links_to_prior_sessions=[],
+    )
+def _store_postgres(compressed: CompressedSession, raw_path: str) -> None:
+    conn = _get_connection()
+    try:
+        with conn.cursor() as cur:
+            cur.execute(
+                """
+                INSERT INTO session_memory (session_id, doc_id, compressed_json, raw_path)
+                VALUES (%s, %s, %s::jsonb, %s)
+                ON CONFLICT (session_id) DO UPDATE SET
+                    doc_id = EXCLUDED.doc_id,
+                    compressed_json = EXCLUDED.compressed_json,
+                    raw_path = EXCLUDED.raw_path
+                """,
+                (
+                    compressed.session_id,
+                    compressed.doc_id,
+                    json.dumps(compressed.model_dump(), ensure_ascii=False),
+                    raw_path,
+                ),
+            )
+        conn.commit()
+    finally:
+        conn.close()
+def _list_postgres(doc_id: str, limit: int) -> list[dict]:
+    conn = _get_connection()
+    try:
+        with conn.cursor() as cur:
+            cur.execute(
+                """
+                SELECT compressed_json
+                FROM session_memory
+                WHERE doc_id = %s
+                ORDER BY created_at DESC
+                LIMIT %s
+                """,
+                (doc_id, limit),
+            )
+            rows = cur.fetchall()
+    finally:
+        conn.close()
+    results = []
+    for row in rows:
+        value = row[0]
+        if isinstance(value, str):
+            value = json.loads(value)
+        results.append(value)
+    return results
+def _store_local(corpus_dir: Path, compressed: CompressedSession) -> None:
+    memory_dir = corpus_dir / LOCAL_MEMORY_DIR
+    memory_dir.mkdir(parents=True, exist_ok=True)
+    path = memory_dir / f"{compressed.session_id}.json"
+    path.write_text(json.dumps(compressed.model_dump(), ensure_ascii=False, indent=1), encoding="utf-8")
+def _list_local(corpus_dir: Path, doc_id: str, limit: int) -> list[dict]:
+    memory_dir = corpus_dir / LOCAL_MEMORY_DIR
+    if not memory_dir.exists():
+        return []
+    sessions = []
+    for path in memory_dir.glob("*.json"):
+        try:
+            data = json.loads(path.read_text(encoding="utf-8"))
+        except Exception:
+            continue
+        if data.get("doc_id") == doc_id:
+            sessions.append(data)
+    sessions.sort(key=lambda item: item.get("timestamp", ""), reverse=True)
+    return sessions[:limit]
+def _write_raw_session(corpus_dir: Path, session_id: str, session_result: dict) -> str:
+    archive_dir = corpus_dir / RAW_ARCHIVE_DIR
+    archive_dir.mkdir(parents=True, exist_ok=True)
+    path = archive_dir / f"{session_id}.json"
+    path.write_text(json.dumps(session_result, ensure_ascii=False, indent=1), encoding="utf-8")
+    return str(path)
+def _string_list(value: Any) -> list[str]:
+    if not isinstance(value, list):
+        return []
+    return [str(item) for item in value if str(item).strip()]
+def _utc_now() -> str:
+    return datetime.now(timezone.utc).isoformat()
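
Review note: the local JSON fallback in session_memory.py (`_store_local` / `_list_local`) can be exercised without PostgreSQL. The sketch below is a minimal, self-contained approximation of that round trip, assuming plain dicts instead of the `CompressedSession` model. The function names `store_local`, `list_local`, and `demo`, and the sample session ids and timestamps, are hypothetical and chosen only for illustration.

```python
import json
import tempfile
from pathlib import Path

LOCAL_MEMORY_DIR = ".session_memory"  # same directory name as in the diff


def store_local(corpus_dir: Path, session: dict) -> Path:
    """Write one session as <corpus>/.session_memory/<session_id>.json."""
    memory_dir = corpus_dir / LOCAL_MEMORY_DIR
    memory_dir.mkdir(parents=True, exist_ok=True)
    path = memory_dir / f"{session['session_id']}.json"
    path.write_text(json.dumps(session, ensure_ascii=False, indent=1), encoding="utf-8")
    return path


def list_local(corpus_dir: Path, doc_id: str, limit: int = 10) -> list[dict]:
    """Return sessions for one document, newest first, skipping unreadable files."""
    memory_dir = corpus_dir / LOCAL_MEMORY_DIR
    if not memory_dir.exists():
        return []
    sessions = []
    for path in memory_dir.glob("*.json"):
        try:
            data = json.loads(path.read_text(encoding="utf-8"))
        except (OSError, json.JSONDecodeError):
            continue  # a corrupt file should not break the listing
        if data.get("doc_id") == doc_id:
            sessions.append(data)
    # Timestamps are UTC ISO-8601 strings, so a string sort orders them by time.
    sessions.sort(key=lambda item: item.get("timestamp", ""), reverse=True)
    return sessions[:limit]


def demo() -> list[str]:
    """Store three hypothetical sessions and list the ids that belong to doc 'a'."""
    with tempfile.TemporaryDirectory() as tmp:
        corpus = Path(tmp)
        store_local(corpus, {"session_id": "s1", "doc_id": "a", "timestamp": "2025-05-01T10:00:00+00:00"})
        store_local(corpus, {"session_id": "s2", "doc_id": "a", "timestamp": "2025-05-01T11:00:00+00:00"})
        store_local(corpus, {"session_id": "s3", "doc_id": "b", "timestamp": "2025-05-01T12:00:00+00:00"})
        return [s["session_id"] for s in list_local(corpus, "a")]
```

Calling `demo()` lists only the sessions stored for document `a`, newest first. Because each file is named after its session id, storing the same session twice overwrites the earlier file, which roughly mirrors the `ON CONFLICT (session_id) DO UPDATE` behaviour of the PostgreSQL path.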

mp1/pluto/signal_logger.py ADDED

	@@ -0,0 +1,93 @@

+# -*- coding: utf-8 -*-
+"""
+Behavioral response signal capture.
+This module is lazy: importing it does not require PostgreSQL, embeddings, or
+provider credentials. Missing resources cause the specific signal operation to
+skip or log a warning rather than crashing request handling.
+"""
+from __future__ import annotations
+import hashlib
+import logging
+from pluto.db import _get_connection
+logger = logging.getLogger("pluto")
+REPHRASE_SECONDS = 90
+REPHRASE_SIMILARITY = 0.75
+PRIOR_REFERENCE_PHRASES = (
+    "as you said",
+    "earlier you mentioned",
+    "based on your answer",
+    "you mentioned",
+    "from your previous",
+)
+def query_hash(query: str) -> str:
+    return hashlib.sha256(str(query or "").encode("utf-8")).hexdigest()
+def log_signal(session_id: str, query_hash: str, signal_type: str) -> None:
+    """Write one response signal row if PostgreSQL is available."""
+    if not session_id or not query_hash or not signal_type:
+        return
+    try:
+        conn = _get_connection()
+        try:
+            with conn.cursor() as cur:
+                cur.execute(
+                    """
+                    INSERT INTO response_signals (session_id, query_hash, signal_type)
+                    VALUES (%s, %s, %s)
+                    """,
+                    (session_id, query_hash, signal_type),
+                )
+            conn.commit()
+        finally:
+            conn.close()
+    except Exception as exc:
+        logger.warning("Failed to log response signal %s for %s: %s", signal_type, session_id, exc)
+def check_rephrase(current_query: str, prev_query: str, time_delta_seconds: float) -> bool:
+    """Return True when a near-repeat query arrives soon after a prior response."""
+    if not current_query or not prev_query:
+        return False
+    if time_delta_seconds < 0 or time_delta_seconds > REPHRASE_SECONDS:
+        return False
+    try:
+        current_embedding = _embed_query(current_query)
+        prev_embedding = _embed_query(prev_query)
+        return _cosine_similarity(current_embedding, prev_embedding) > REPHRASE_SIMILARITY
+    except Exception:
+        return False
+def check_prior_reference(query: str) -> bool:
+    """Return True when the query explicitly refers to an earlier answer."""
+    lowered = str(query or "").lower()
+    return any(phrase in lowered for phrase in PRIOR_REFERENCE_PHRASES)
+def _embed_query(query: str) -> list[float]:
+    from pluto.embedder import embed_texts
+    embeddings = embed_texts([query])
+    return embeddings[0] if embeddings else []
+def _cosine_similarity(a: list[float], b: list[float]) -> float:
+    if not a or not b:
+        return 0.0
+    dot = sum(x * y for x, y in zip(a, b))
+    mag_a = sum(x * x for x in a) ** 0.5
+    mag_b = sum(y * y for y in b) ** 0.5
+    if mag_a == 0 or mag_b == 0:
+        return 0.0
+    return dot / (mag_a * mag_b)
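
Review note: the rephrase signal combines a time window with an embedding similarity threshold. The sketch below illustrates that decision with hand-written toy vectors in place of `embed_texts`, which needs a real embedding backend. The two constants are copied from the diff; `is_rephrase` and the toy vectors are hypothetical names and values used only for this example. Timestamps are assumed to be in milliseconds, as the `/ 1000.0` conversion in server.py suggests.

```python
REPHRASE_SECONDS = 90       # copied from the diff
REPHRASE_SIMILARITY = 0.75  # copied from the diff


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity that returns 0.0 for empty or zero-length vectors."""
    if not a or not b:
        return 0.0
    dot = sum(x * y for x, y in zip(a, b))
    mag_a = sum(x * x for x in a) ** 0.5
    mag_b = sum(y * y for y in b) ** 0.5
    if mag_a == 0 or mag_b == 0:
        return 0.0
    return dot / (mag_a * mag_b)


def is_rephrase(
    current_vec: list[float],
    prev_vec: list[float],
    query_ts_ms: float,
    prev_ts_ms: float,
) -> bool:
    """True when a similar query arrives within the rephrase window."""
    delta_seconds = (query_ts_ms - prev_ts_ms) / 1000.0
    if delta_seconds < 0 or delta_seconds > REPHRASE_SECONDS:
        return False
    return cosine_similarity(current_vec, prev_vec) > REPHRASE_SIMILARITY


# Toy embeddings: "near" points almost the same way as "prev", "far" is orthogonal.
prev = [1.0, 0.0]
near = [0.9, 0.1]
far = [0.0, 1.0]

quick_rephrase = is_rephrase(near, prev, query_ts_ms=30_000, prev_ts_ms=0)
late_rephrase = is_rephrase(near, prev, query_ts_ms=120_000, prev_ts_ms=0)
unrelated = is_rephrase(far, prev, query_ts_ms=30_000, prev_ts_ms=0)
```

Only `quick_rephrase` is true here: the late query falls outside the 90-second window and the orthogonal vector has zero similarity. A negative delta (clock skew or swapped timestamps) is rejected in the same way as a late one.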

mp1/pluto/stages/__init__.py CHANGED

	@@ -1 +1,2 @@
-"""Pipeline stages: route → extract → merge → verify."""

+# -*- coding: utf-8 -*-
+"""Pipeline stages: route -> extract -> merge -> evidence_check."""

mp1/pluto/stages/{verify.py → evidence_check.py} RENAMED

@@ -1,7 +1,8 @@
 """
-pluto/stages/verify.py — S3 VERIFY stage.
-Cross-checks merged claims against extracted evidence. The verifier now ranks
 candidate evidence first, uses direct support when the match is obvious, and
 falls back to an LLM only for ambiguous cases. This improves both speed and
 confidence stability.
@@ -20,11 +21,11 @@ from pluto.models import (
     Evidence,
     ExtractOutput,
     MergeOutput,
-    Verification,
-    VerifyOutput,
 )
 from pluto.tracer import Tracer
-from pluto.utils import coerce_string, ensure_list, extract_json_from_response, pair_string_lists
 DIRECT_SUPPORT_THRESHOLD = 0.72
 LLM_CHECK_THRESHOLD = 0.18
@@ -95,14 +96,14 @@ METHOD_HINTS = {
 }
-def run_verify(
     merge_output: MergeOutput,
     extractions: list[ExtractOutput],
     tracer: Tracer,
     bus: MessageBus | None = None,
-) -> VerifyOutput:
-    """S3 — Verify: cross-check merged claims against extraction evidence."""
-    tracer.log("stage_start", {"stage": "verify"})
     claims_to_check = [kc.claim.strip() for kc in merge_output.synthesis.key_claims if kc.claim.strip()]
     evidence_pool = _build_evidence_pool(extractions)
@@ -123,16 +124,16 @@ def run_verify(
             else:
                 shortlisted = candidates[:MAX_EVIDENCE_CANDIDATES]
                 if top["score"] >= LLM_CHECK_THRESHOLD:
-                    prompt = _VERIFY_PROMPT.format(
                         claims_json=json.dumps([{"claim": claim}], indent=1),
                         evidence_json=json.dumps([_prompt_evidence(item) for item in shortlisted], indent=1),
                     )
-                    verdict = _parse_verify_json(dispatch("MODE_QUICK", prompt, tracer=tracer))
                     status, evidence = _extract_single_verdict(verdict, shortlisted)
                     if status == ClaimStatus.UNCERTAIN and top["score"] >= UNCERTAIN_THRESHOLD:
-                        verdict = _parse_verify_json(dispatch("MODE_REASONING", prompt, tracer=tracer))
                         status, evidence = _extract_single_verdict(verdict, shortlisted)
                     if status is not None:
@@ -157,10 +158,10 @@ def run_verify(
     if _should_generate_followups(checked_results):
         gaps = _build_followups(unsupported)
         if bus and gaps:
-            bus.post("verifier", "gap_report", {"gaps": gaps})
-    result = VerifyOutput(
-        verification=Verification(
             checked_claims=checked_results,
             unsupported_claims=[item.claim for item in checked_results if item.status == ClaimStatus.UNSUPPORTED],
             required_followups=gaps,
@@ -170,7 +171,7 @@ def run_verify(
     tracer.log(
         "stage_complete",
         {
-            "stage": "verify",
             "checked": len(checked_results),
             "supported": sum(1 for item in checked_results if item.status == ClaimStatus.SUPPORTED),
             "uncertain": sum(1 for item in checked_results if item.status == ClaimStatus.UNCERTAIN),
@@ -179,7 +180,7 @@ def run_verify(
     return result
-_VERIFY_PROMPT = """You are an evidence verification engine. Check each claim below against the source evidence provided.
 For EACH claim, determine if it is:
 - "supported": the evidence directly or clearly supports the same factual meaning, even if phrased as a paraphrase
@@ -306,14 +307,24 @@ def _extract_single_verdict(v_data: dict, candidates: list[dict]) -> tuple[Claim
     except ValueError:
         return None, []
-    evidence = _parse_evidence_items(item)
-    if not evidence and candidates and status != ClaimStatus.UNSUPPORTED:
         evidence.append(_candidate_to_evidence(candidates[0]))
     return status, evidence
-def _parse_verify_json(raw: str) -> dict:
     try:
         return json.loads(extract_json_from_response(raw))
     except Exception:
@@ -326,12 +337,12 @@ def _parse_verify_json(raw: str) -> dict:
         return {}
-def _parse_verify(raw: str) -> VerifyOutput:
-    """Backward-compatible parser for verifier dumps used by local tests/tools."""
-    data = _parse_verify_json(raw)
     checked_claims = []
-    for item in ensure_list(data.get("checked_claims", [])):
         if not isinstance(item, dict):
             continue
         status_raw = str(item.get("status", "unsupported")).lower()
@@ -340,7 +351,17 @@ def _parse_verify(raw: str) -> VerifyOutput:
         except ValueError:
             status = ClaimStatus.UNSUPPORTED
-        evidence = _parse_evidence_items(item)
         checked_claims.append(
             CheckedClaim(
@@ -358,8 +379,8 @@ def _parse_verify(raw: str) -> VerifyOutput:
     if not isinstance(required_followups, list):
         required_followups = []
-    return VerifyOutput(
-        verification=Verification(
             checked_claims=checked_claims,
             unsupported_claims=unsupported_claims,
             required_followups=required_followups,
@@ -367,46 +388,6 @@ def _parse_verify(raw: str) -> VerifyOutput:
     )
-def _parse_evidence_items(raw_item: dict) -> list[Evidence]:
-    """Normalize verifier evidence from nested refs or scalar/list doc/chunk ids."""
-    evidence: list[Evidence] = []
-    raw_refs = raw_item.get("evidence") or raw_item.get("evidence_refs") or []
-    for ref in ensure_list(raw_refs):
-        if not isinstance(ref, dict):
-            continue
-        for doc_id, chunk_id in pair_string_lists(
-            ref.get("doc_id") or ref.get("evidence_doc_id") or ref.get("doc_ids"),
-            ref.get("chunk_id") or ref.get("evidence_chunk_id") or ref.get("chunk_ids"),
-        ):
-            evidence.append(
-                Evidence(
-                    doc_id=doc_id,
-                    chunk_id=chunk_id,
-                    where=coerce_string(ref.get("where", ""), default=""),
-                    quote=coerce_string(ref.get("quote", ""), default="")[:200],
-                )
-            )
-    if evidence:
-        return evidence
-    for doc_id, chunk_id in pair_string_lists(
-        raw_item.get("evidence_doc_id") or raw_item.get("evidence_doc_ids"),
-        raw_item.get("evidence_chunk_id") or raw_item.get("evidence_chunk_ids"),
-    ):
-        evidence.append(
-            Evidence(
-                doc_id=doc_id,
-                chunk_id=chunk_id,
-                where=coerce_string(raw_item.get("where", ""), default=""),
-                quote=coerce_string(raw_item.get("quote", ""), default="")[:200],
-            )
-        )
-    return evidence
 def _should_generate_followups(checked_results: list[CheckedClaim]) -> bool:
     unsupported_count = sum(1 for item in checked_results if item.status == ClaimStatus.UNSUPPORTED)
     if unsupported_count == 0:

+# -*- coding: utf-8 -*-
 """
+pluto/stages/evidence_check.py — S3 EvidenceCheck stage.
+Cross-checks merged claims against extracted evidence. The evidence_checker now ranks
 candidate evidence first, uses direct support when the match is obvious, and
 falls back to an LLM only for ambiguous cases. This improves both speed and
 confidence stability.
     Evidence,
     ExtractOutput,
     MergeOutput,
+    EvidenceCheck,
+    EvidenceCheckOutput,
 )
 from pluto.tracer import Tracer
+from pluto.utils import extract_json_from_response
 DIRECT_SUPPORT_THRESHOLD = 0.72
 LLM_CHECK_THRESHOLD = 0.18
 }
+def run_evidence_check(
     merge_output: MergeOutput,
     extractions: list[ExtractOutput],
     tracer: Tracer,
     bus: MessageBus | None = None,
+) -> EvidenceCheckOutput:
+    """S3 — EvidenceCheck: cross-check merged claims against extraction evidence."""
+    tracer.log("stage_start", {"stage": "evidence_check"})
     claims_to_check = [kc.claim.strip() for kc in merge_output.synthesis.key_claims if kc.claim.strip()]
     evidence_pool = _build_evidence_pool(extractions)
             else:
                 shortlisted = candidates[:MAX_EVIDENCE_CANDIDATES]
                 if top["score"] >= LLM_CHECK_THRESHOLD:
+                    prompt = _EVIDENCE_CHECK_PROMPT.format(
                         claims_json=json.dumps([{"claim": claim}], indent=1),
                         evidence_json=json.dumps([_prompt_evidence(item) for item in shortlisted], indent=1),
                     )
+                    verdict = _parse_evidence_check_json(dispatch("MODE_QUICK", prompt, tracer=tracer))
                     status, evidence = _extract_single_verdict(verdict, shortlisted)
                     if status == ClaimStatus.UNCERTAIN and top["score"] >= UNCERTAIN_THRESHOLD:
+                        verdict = _parse_evidence_check_json(dispatch("MODE_REASONING", prompt, tracer=tracer))
                         status, evidence = _extract_single_verdict(verdict, shortlisted)
                     if status is not None:
     if _should_generate_followups(checked_results):
         gaps = _build_followups(unsupported)
         if bus and gaps:
+            bus.post("evidence_checker", "gap_report", {"gaps": gaps})
+    result = EvidenceCheckOutput(
+        evidence_check=EvidenceCheck(
             checked_claims=checked_results,
             unsupported_claims=[item.claim for item in checked_results if item.status == ClaimStatus.UNSUPPORTED],
             required_followups=gaps,
     tracer.log(
         "stage_complete",
         {
+            "stage": "evidence_check",
             "checked": len(checked_results),
             "supported": sum(1 for item in checked_results if item.status == ClaimStatus.SUPPORTED),
             "uncertain": sum(1 for item in checked_results if item.status == ClaimStatus.UNCERTAIN),
     return result
+_EVIDENCE_CHECK_PROMPT = """You are an evidence checking engine. Check each claim below against the source evidence provided.
 For EACH claim, determine if it is:
 - "supported": the evidence directly or clearly supports the same factual meaning, even if phrased as a paraphrase
     except ValueError:
         return None, []
+    evidence = []
+    doc_id = item.get("evidence_doc_id")
+    chunk_id = item.get("evidence_chunk_id")
+    if doc_id:
+        evidence.append(
+            Evidence(
+                doc_id=doc_id,
+                chunk_id=chunk_id or "",
+                quote=item.get("quote", ""),
+            )
+        )
+    elif candidates and status != ClaimStatus.UNSUPPORTED:
         evidence.append(_candidate_to_evidence(candidates[0]))
     return status, evidence
+def _parse_evidence_check_json(raw: str) -> dict:
     try:
         return json.loads(extract_json_from_response(raw))
     except Exception:
         return {}
+def _parse_evidence_check(raw: str) -> EvidenceCheckOutput:
+    """Backward-compatible parser for evidence_checker dumps used by local tests/tools."""
+    data = _parse_evidence_check_json(raw)
     checked_claims = []
+    for item in data.get("checked_claims", []):
         if not isinstance(item, dict):
             continue
         status_raw = str(item.get("status", "unsupported")).lower()
         except ValueError:
             status = ClaimStatus.UNSUPPORTED
+        evidence = []
+        doc_id = item.get("evidence_doc_id")
+        if doc_id:
+            evidence.append(
+                Evidence(
+                    doc_id=doc_id,
+                    chunk_id=item.get("evidence_chunk_id", ""),
+                    where=item.get("where", ""),
+                    quote=item.get("quote", ""),
+                )
+            )
         checked_claims.append(
             CheckedClaim(
     if not isinstance(required_followups, list):
         required_followups = []
+    return EvidenceCheckOutput(
+        evidence_check=EvidenceCheck(
             checked_claims=checked_claims,
             unsupported_claims=unsupported_claims,
             required_followups=required_followups,
     )
 def _should_generate_followups(checked_results: list[CheckedClaim]) -> bool:
     unsupported_count = sum(1 for item in checked_results if item.status == ClaimStatus.UNSUPPORTED)
     if unsupported_count == 0:
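
Review note: the module docstring describes a tiered check: rank candidate evidence, accept obvious matches directly, and call an LLM only for ambiguous cases. The sketch below shows one way that triage could look. The two threshold values are copied from the diff, but the exact comparison used for direct support and the handling of scores below `LLM_CHECK_THRESHOLD` are not visible in these hunks, so both are assumptions. `triage` and its return labels are hypothetical names.

```python
DIRECT_SUPPORT_THRESHOLD = 0.72  # copied from the diff
LLM_CHECK_THRESHOLD = 0.18       # copied from the diff


def triage(top_score: float) -> str:
    """Pick a checking path from the best candidate's ranking score.

    Assumed behaviour, not confirmed by the visible hunks:
    - scores at or above DIRECT_SUPPORT_THRESHOLD count as direct support
    - scores at or above LLM_CHECK_THRESHOLD go to the LLM
    - anything lower skips the LLM call entirely
    """
    if top_score >= DIRECT_SUPPORT_THRESHOLD:
        return "direct_support"
    if top_score >= LLM_CHECK_THRESHOLD:
        return "llm_check"
    return "skip_llm"


# Hypothetical ranking scores for four claims.
decisions = [triage(score) for score in (0.90, 0.72, 0.50, 0.10)]
```

The point of the design is cost control: only the middle band of scores pays for a model call, and the diff further escalates from `MODE_QUICK` to `MODE_REASONING` only when the first verdict is uncertain and the score clears `UNCERTAIN_THRESHOLD`.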

mp1/pluto/stages/extract.py CHANGED

@@ -1,3 +1,4 @@
 """
 pluto/stages/extract.py — S1 EXTRACT stage.

+# -*- coding: utf-8 -*-
 """
 pluto/stages/extract.py — S1 EXTRACT stage.

mp1/pluto/stages/merge.py CHANGED

@@ -1,3 +1,4 @@
 """
 pluto/stages/merge.py — S2 MERGE stage.
@@ -27,7 +28,6 @@ from pluto.models import (
     Synthesis,
 )
 from pluto.tracer import Tracer
-from pluto.utils import coerce_string, coerce_string_list, ensure_list, pair_string_lists
 _BATCH_PROMPT = """You are synthesizing extracted facts from a document chunk batch. Produce a focused sub-summary for the user's question.
@@ -315,18 +315,20 @@ def _parse_merge(raw: str) -> MergeOutput:
             section=sec.get("section", ""),
             points=sec.get("points", []),
         )
-        for sec in ensure_list(data.get("answer_outline", []))
         if isinstance(sec, dict)
         if sec.get("section") or sec.get("points")
     ]
     key_claims: list[KeyClaim] = []
-    for kc in ensure_list(data.get("key_claims", [])):
         if not isinstance(kc, dict):
             continue
-        evidence_refs = _parse_evidence_refs(kc)
-        support_str = coerce_string(kc.get("support", "supported"), default="supported").lower()
         try:
             support = ClaimStatus(support_str)
         except ValueError:
@@ -368,8 +370,6 @@ def _stabilize_merge(result: MergeOutput, query: str = "", detail_level: str = "
         outline = _synthesize_outline_from_claims(key_claims, query=query, detail_level=detail_level)
     elif outline:
         outline = _top_up_outline(outline, key_claims, detail_level=detail_level)
-    if detail_level == "detailed" and key_claims:
-        outline = _enrich_detailed_outline(outline, key_claims, query=query)
     return MergeOutput(
         synthesis=Synthesis(
@@ -559,73 +559,6 @@ def _top_up_outline(
     return outline
-def _enrich_detailed_outline(
-    outline: list[SectionPoint],
-    key_claims: list[KeyClaim],
-    query: str = "",
-) -> list[SectionPoint]:
-    """Guarantee richer structure for detailed mode when evidence is available."""
-    synthesized = _synthesize_outline_from_claims(key_claims, query=query, detail_level="detailed")
-    if not synthesized:
-        return outline
-    if not outline:
-        return synthesized
-    return _merge_outline_variants(outline, synthesized, point_cap=7, section_cap=5)
-def _merge_outline_variants(
-    primary: list[SectionPoint],
-    secondary: list[SectionPoint],
-    point_cap: int,
-    section_cap: int,
-) -> list[SectionPoint]:
-    """Merge outline variants while preserving order and deduplicating points."""
-    merged: list[SectionPoint] = []
-    title_to_index: dict[str, int] = {}
-    def add_section(section: SectionPoint) -> None:
-        title = _clean_text(section.section)
-        if not title:
-            return
-        title_key = _fingerprint(title)
-        clean_points: list[str] = []
-        seen_local: set[str] = set()
-        for point in section.points:
-            text = _clean_text(point)
-            fingerprint = _fingerprint(text)
-            if not text or fingerprint in seen_local:
-                continue
-            seen_local.add(fingerprint)
-            clean_points.append(text)
-        if not clean_points:
-            return
-        if title_key in title_to_index:
-            existing = merged[title_to_index[title_key]]
-            seen_existing = {_fingerprint(point) for point in existing.points}
-            for point in clean_points:
-                fingerprint = _fingerprint(point)
-                if fingerprint in seen_existing or len(existing.points) >= point_cap:
-                    continue
-                existing.points.append(point)
-                seen_existing.add(fingerprint)
-            return
-        if len(merged) >= section_cap:
-            return
-        title_to_index[title_key] = len(merged)
-        merged.append(SectionPoint(section=title, points=clean_points[:point_cap]))
-    for section in primary:
-        add_section(section)
-    for section in secondary:
-        add_section(section)
-    return merged or primary or secondary
 def _normalize_detail_level(detail_level: str | None) -> str:
     return "detailed" if str(detail_level or "").strip().lower() == "detailed" else "standard"
@@ -706,43 +639,3 @@ def _normalize_open_gaps(raw_open_gaps) -> list[str]:
         if text:
             normalized.append(text)
     return normalized
-def _parse_evidence_refs(raw_item: dict) -> list[Evidence]:
-    """Normalize evidence refs from scalar, list, or nested-object shapes."""
-    evidence_refs: list[Evidence] = []
-    raw_refs = raw_item.get("evidence_refs") or raw_item.get("evidence") or []
-    for ref in ensure_list(raw_refs):
-        if not isinstance(ref, dict):
-            continue
-        for doc_id, chunk_id in pair_string_lists(
-            ref.get("doc_id") or ref.get("evidence_doc_id") or ref.get("doc_ids"),
-            ref.get("chunk_id") or ref.get("evidence_chunk_id") or ref.get("chunk_ids"),
-        ):
-            evidence_refs.append(
-                Evidence(
-                    doc_id=doc_id,
-                    chunk_id=chunk_id,
-                    where=coerce_string(ref.get("where", ""), default=""),
-                    quote=coerce_string(ref.get("quote", ""), default="")[:200],
-                )
-            )
-    if evidence_refs:
-        return _dedupe_evidence_refs(evidence_refs)
-    for doc_id, chunk_id in pair_string_lists(
-        raw_item.get("evidence_doc_ids") or raw_item.get("evidence_doc_id"),
-        raw_item.get("evidence_chunk_ids") or raw_item.get("evidence_chunk_id"),
-    ):
-        evidence_refs.append(Evidence(doc_id=doc_id, chunk_id=chunk_id))
-    # Last-resort fallback when the model emits one combined evidence object.
-    if not evidence_refs:
-        chunk_ids = coerce_string_list(raw_item.get("chunk_ids") or raw_item.get("chunk_id"))
-        doc_ids = coerce_string_list(raw_item.get("doc_ids") or raw_item.get("doc_id"))
-        for doc_id, chunk_id in pair_string_lists(doc_ids, chunk_ids):
-            evidence_refs.append(Evidence(doc_id=doc_id, chunk_id=chunk_id))
-    return _dedupe_evidence_refs(evidence_refs)

+# -*- coding: utf-8 -*-
 """
 pluto/stages/merge.py — S2 MERGE stage.
     Synthesis,
 )
 from pluto.tracer import Tracer
 _BATCH_PROMPT = """You are synthesizing extracted facts from a document chunk batch. Produce a focused sub-summary for the user's question.
             section=sec.get("section", ""),
             points=sec.get("points", []),
         )
+        for sec in data.get("answer_outline", [])
         if isinstance(sec, dict)
         if sec.get("section") or sec.get("points")
     ]
     key_claims: list[KeyClaim] = []
+    for kc in data.get("key_claims", []):
         if not isinstance(kc, dict):
             continue
+        evidence_refs = []
+        for doc_id, chunk_id in zip(kc.get("evidence_doc_ids") or [], kc.get("evidence_chunk_ids") or []):
+            evidence_refs.append(Evidence(doc_id=doc_id or "", chunk_id=chunk_id or ""))
+        support_str = str(kc.get("support", "supported")).lower()
         try:
             support = ClaimStatus(support_str)
         except ValueError:
         outline = _synthesize_outline_from_claims(key_claims, query=query, detail_level=detail_level)
     elif outline:
         outline = _top_up_outline(outline, key_claims, detail_level=detail_level)
     return MergeOutput(
         synthesis=Synthesis(
     return outline
 def _normalize_detail_level(detail_level: str | None) -> str:
     return "detailed" if str(detail_level or "").strip().lower() == "detailed" else "standard"
         if text:
             normalized.append(text)
     return normalized
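
Review note on the evidence pairing change in merge.py: the removed `_parse_evidence_refs` path paired IDs through `pair_string_lists`, which broadcast a single doc ID across several chunk IDs, while the added lines use a plain `zip`. The sketch below contrasts the two behaviours. `pair_with_zip` mirrors the added lines; `pair_with_broadcast` is a hypothetical, simplified stand-in for the removed helper (it skips the string coercion and deduplication the original performed), so treat it as an illustration rather than the project's code.

```python
from itertools import zip_longest


def pair_with_zip(doc_ids, chunk_ids):
    """Pair IDs the way the added lines do: plain zip, shortest input wins."""
    return [(d or "", c or "") for d, c in zip(doc_ids or [], chunk_ids or [])]


def pair_with_broadcast(doc_ids, chunk_ids):
    """Hypothetical simplification of the removed pair_string_lists helper."""
    # Treat a bare string as a one-element list instead of a character sequence.
    left = [doc_ids] if isinstance(doc_ids, str) else list(doc_ids or [])
    right = [chunk_ids] if isinstance(chunk_ids, str) else list(chunk_ids or [])
    if not left and not right:
        return []
    left = left or [""]
    right = right or [""]
    # A single ID on one side is repeated for every ID on the other side.
    if len(left) == 1 and len(right) > 1:
        return [(left[0], c) for c in right]
    if len(right) == 1 and len(left) > 1:
        return [(d, right[0]) for d in left]
    return list(zip_longest(left, right, fillvalue=""))


chunks = ["C18", "C46", "C81"]
print(pair_with_broadcast("paper_a", chunks))  # one doc ID repeated per chunk
print(pair_with_zip(["paper_a"], chunks))      # truncated to the shorter list
print(pair_with_zip("paper_a", chunks))        # a bare string is zipped per character
```

If the model ever returns `evidence_doc_ids` as a scalar string or a shorter list than `evidence_chunk_ids`, the `zip` version drops or mangles references silently, which is the case the deleted `test_parse_merge_normalizes_scalar_doc_and_multi_chunk_evidence` test used to cover.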

mp1/pluto/stages/route.py CHANGED

@@ -1,3 +1,4 @@
 """
 pluto/stages/route.py — S0 ROUTE stage (Phase B).

+# -*- coding: utf-8 -*-
 """
 pluto/stages/route.py — S0 ROUTE stage (Phase B).

mp1/pluto/stages/understand.py CHANGED

@@ -1,3 +1,4 @@
 """
 pluto/stages/understand.py — Phase A: Document Understanding.

+# -*- coding: utf-8 -*-
 """
 pluto/stages/understand.py — Phase A: Document Understanding.

mp1/pluto/tools.py CHANGED

@@ -1,3 +1,4 @@
 """
 pluto/tools.py — Corpus access tools (spec §3).
@@ -115,6 +116,8 @@ class CorpusTools:
             return ""
         if 0 <= idx < len(chunks):
             raw = chunks[idx]
             # Inject context header so extraction agents know where this chunk sits
             from pluto.embedder import inject_context_headers
             with_header = inject_context_headers([raw], doc_id, self.doc_index)

+# -*- coding: utf-8 -*-
 """
 pluto/tools.py — Corpus access tools (spec §3).
             return ""
         if 0 <= idx < len(chunks):
             raw = chunks[idx]
+            from pluto.doc_summary import apply_doc_summary_context
+            raw = apply_doc_summary_context(raw, doc_id, self.corpus_dir)
             # Inject context header so extraction agents know where this chunk sits
             from pluto.embedder import inject_context_headers
             with_header = inject_context_headers([raw], doc_id, self.doc_index)

mp1/pluto/tracer.py CHANGED

@@ -1,3 +1,4 @@
 """
 pluto/tracer.py — Logging & trace system for pipeline execution.

+# -*- coding: utf-8 -*-
 """
 pluto/tracer.py — Logging & trace system for pipeline execution.

mp1/pluto/utils.py CHANGED

@@ -1,28 +1,11 @@
 """
 pluto/utils.py — Shared utilities for response parsing.
 """
 from __future__ import annotations
-import json
 import re
-from itertools import zip_longest
-_PREFERRED_TEXT_KEYS = (
-    "chunk_id",
-    "doc_id",
-    "value",
-    "text",
-    "title",
-    "label",
-    "name",
-    "id",
-    "where",
-    "quote",
-    "claim",
-    "section",
-)
 def strip_think_block(text: str) -> str:
@@ -46,81 +29,3 @@ def extract_json_from_response(raw: str) -> str:
         return brace_match.group(0).strip()
     return cleaned.strip()
-def ensure_list(value):
-    """Return *value* as a list while preserving existing lists."""
-    if value is None:
-        return []
-    if isinstance(value, list):
-        return value
-    if isinstance(value, (tuple, set)):
-        return list(value)
-    return [value]
-def flatten_string_values(value) -> list[str]:
-    """Flatten nested scalars/collections into a list of non-empty strings."""
-    values: list[str] = []
-    def _walk(item) -> None:
-        if item is None:
-            return
-        if isinstance(item, dict):
-            for key in _PREFERRED_TEXT_KEYS:
-                if key in item and item[key] not in (None, ""):
-                    _walk(item[key])
-                    return
-            dumped = json.dumps(item, ensure_ascii=False, sort_keys=True).strip()
-            if dumped:
-                values.append(dumped)
-            return
-        if isinstance(item, (list, tuple, set)):
-            for part in item:
-                _walk(part)
-            return
-        text = str(item).strip()
-        if text:
-            values.append(text)
-    _walk(value)
-    return values
-def coerce_string(value, default: str = "") -> str:
-    """Normalize mixed scalar/list inputs into one printable string."""
-    parts = flatten_string_values(value)
-    return ", ".join(parts) if parts else default
-def coerce_string_list(value) -> list[str]:
-    """Normalize mixed scalar/list inputs into a deduplicated string list."""
-    seen: set[str] = set()
-    normalized: list[str] = []
-    for item in flatten_string_values(value):
-        if item in seen:
-            continue
-        seen.add(item)
-        normalized.append(item)
-    return normalized
-def pair_string_lists(left, right) -> list[tuple[str, str]]:
-    """Broadcast or zip mixed scalar/list inputs into string pairs."""
-    left_items = coerce_string_list(left)
-    right_items = coerce_string_list(right)
-    if not left_items and not right_items:
-        return []
-    if not left_items:
-        left_items = [""]
-    if not right_items:
-        right_items = [""]
-    if len(left_items) == 1 and len(right_items) > 1:
-        return [(left_items[0], item) for item in right_items]
-    if len(right_items) == 1 and len(left_items) > 1:
-        return [(item, right_items[0]) for item in left_items]
-    return list(zip_longest(left_items, right_items, fillvalue=""))

+# -*- coding: utf-8 -*-
 """
 pluto/utils.py — Shared utilities for response parsing.
 """
 from __future__ import annotations
 import re
 def strip_think_block(text: str) -> str:
         return brace_match.group(0).strip()
     return cleaned.strip()

mp1/requirements.txt CHANGED

@@ -6,8 +6,9 @@ uvicorn>=0.27.0
 python-dotenv>=1.0.0
 pytest>=8.0.0
 python-multipart>=0.0.5
-PyPDF2>=3.0.0
 python-docx>=1.1.0
 requests>=2.31.0
 openai>=1.0.0
 numpy>=1.24.0

 python-dotenv>=1.0.0
 pytest>=8.0.0
 python-multipart>=0.0.5
+pdfplumber>=0.10.0
 python-docx>=1.1.0
 requests>=2.31.0
 openai>=1.0.0
 numpy>=1.24.0
+psycopg2-binary>=2.9.0

mp1/scripts/generate_app_summary_pdf.py ADDED

	@@ -0,0 +1,367 @@

+from __future__ import annotations
+from dataclasses import dataclass
+from pathlib import Path
+from xml.sax.saxutils import escape
+import fitz
+from pypdf import PdfReader
+from reportlab.lib import colors
+from reportlab.lib.pagesizes import A4
+from reportlab.lib.styles import ParagraphStyle
+from reportlab.pdfgen import canvas
+from reportlab.platypus import Paragraph
+ROOT = Path(__file__).resolve().parents[1]
+OUTPUT_DIR = ROOT / "output" / "pdf"
+TMP_DIR = ROOT / "tmp" / "pdfs"
+PDF_PATH = OUTPUT_DIR / "pluto_app_summary_one_page.pdf"
+PNG_PATH = TMP_DIR / "pluto_app_summary_one_page-1.png"
+PAGE_WIDTH, PAGE_HEIGHT = A4
+MARGIN_X = 34
+MARGIN_TOP = 34
+MARGIN_BOTTOM = 28
+GUTTER = 16
+COLUMN_WIDTH = (PAGE_WIDTH - (2 * MARGIN_X) - GUTTER) / 2
+NAVY = colors.HexColor("#17324D")
+TEAL = colors.HexColor("#2A7F8C")
+INK = colors.HexColor("#1D2430")
+MUTED = colors.HexColor("#5C6773")
+CARD_BG = colors.HexColor("#F5F8FB")
+CARD_BORDER = colors.HexColor("#D6E1EA")
+ACCENT_BG = colors.HexColor("#E8F4F4")
+WHITE = colors.white
+@dataclass
+class SectionBlock:
+    title: str
+    items: list[tuple[str, str]]
+def build_blocks() -> tuple[list[SectionBlock], list[SectionBlock]]:
+    left = [
+        SectionBlock(
+            title="What it is",
+            items=[
+                (
+                    "body",
+                    "Pluto is an AI-powered document extraction and question-answering app. "
+                    "It lets a user upload documents into a corpus, run a multi-stage pipeline, "
+                    "and inspect the answer with evidence, trace, and confidence signals.",
+                ),
+            ],
+        ),
+        SectionBlock(
+            title="Who it's for",
+            items=[
+                ("body", "Primary user/persona: Not found in repo."),
+                (
+                    "body",
+                    "Closest repo evidence: a person asking research-style questions "
+                    "over uploaded documents and reviewing evidence-backed results.",
+                ),
+            ],
+        ),
+        SectionBlock(
+            title="What it does",
+            items=[
+                ("bullet", "Uploads PDF, DOCX/DOC, TXT, and Markdown files into a corpus."),
+                ("bullet", "Converts uploads to Markdown, chunks them, classifies them, and tracks readiness."),
+                ("bullet", "Runs a 4-stage pipeline: route, extract, merge, evidence_check."),
+                ("bullet", "Streams live progress and upload status to the dashboard."),
+                ("bullet", "Queries the full corpus or selected ready documents."),
+                ("bullet", "Shows final sections, evidence, trace, confidence, and a benchmark view."),
+            ],
+        ),
+    ]
+    right = [
+        SectionBlock(
+            title="How it works",
+            items=[
+                (
+                    "bullet",
+                    "Frontend: `frontend/index.html` + `app.js` call `/api/upload`, `/api/corpus`, "
+                    "`/api/run`, `/api/stream`, and `/api/compare`.",
+                ),
+                (
+                    "bullet",
+                    "Server: `pluto/server.py` serves the UI, handles uploads, streams SSE progress, "
+                    "and runs `PipelineRunner` in a worker thread.",
+                ),
+                (
+                    "bullet",
+                    "Ingest path: uploaded file -> Markdown in `corpus/` -> chunk split/classification "
+                    "-> `DocIndex` registration; background Phase A stores overview/status in "
+                    "`corpus/.doc_index.json`.",
+                ),
+                (
+                    "bullet",
+                    "Query path: selected docs -> S0 route -> S1 extract -> S2 merge -> S3 evidence_check "
+                    "-> JSON result + cache stats -> UI panels; final JSON also writes to "
+                    "`output/final_output.json`.",
+                ),
+                (
+                    "bullet",
+                    "Support layers: `ExtractionCache` reuses extractions; `CorpusTools` reads/searches chunks. "
+                    "NVIDIA embedding/rerank code paths exist, and chunking falls back when NVIDIA keys are absent.",
+                ),
+            ],
+        ),
+        SectionBlock(
+            title="How to run",
+            items=[
+                ("bullet", "From `mp1/`: `pip install -r requirements.txt`"),
+                (
+                    "bullet",
+                    "Create `.env` and set `GROQ_API_KEY` (explicitly named in `README.md`).",
+                ),
+                ("bullet", "Run `python main.py --serve`"),
+                ("bullet", "Open `http://localhost:8000`"),
+                (
+                    "bullet",
+                    "Upload docs, wait for Understanding to finish, then submit a query. "
+                    "Other required provider keys: Not found in repo.",
+                ),
+            ],
+        ),
+    ]
+    return left, right
+def make_styles(scale: float) -> dict[str, ParagraphStyle]:
+    return {
+        "title": ParagraphStyle(
+            "title",
+            fontName="Helvetica-Bold",
+            fontSize=21 * scale,
+            leading=25 * scale,
+            textColor=WHITE,
+            spaceAfter=0,
+        ),
+        "subtitle": ParagraphStyle(
+            "subtitle",
+            fontName="Helvetica",
+            fontSize=9.6 * scale,
+            leading=12 * scale,
+            textColor=colors.HexColor("#DCE7F3"),
+        ),
+        "eyebrow": ParagraphStyle(
+            "eyebrow",
+            fontName="Helvetica-Bold",
+            fontSize=7.4 * scale,
+            leading=9 * scale,
+            textColor=colors.HexColor("#B9D6DA"),
+        ),
+        "section_title": ParagraphStyle(
+            "section_title",
+            fontName="Helvetica-Bold",
+            fontSize=10.6 * scale,
+            leading=12.5 * scale,
+            textColor=NAVY,
+        ),
+        "body": ParagraphStyle(
+            "body",
+            fontName="Helvetica",
+            fontSize=8.6 * scale,
+            leading=11 * scale,
+            textColor=INK,
+        ),
+        "bullet": ParagraphStyle(
+            "bullet",
+            fontName="Helvetica",
+            fontSize=8.5 * scale,
+            leading=10.7 * scale,
+            textColor=INK,
+            leftIndent=10 * scale,
+            firstLineIndent=-7 * scale,
+        ),
+        "footer": ParagraphStyle(
+            "footer",
+            fontName="Helvetica",
+            fontSize=7.1 * scale,
+            leading=8.5 * scale,
+            textColor=MUTED,
+        ),
+    }
+def escape_inline(text: str) -> str:
+    escaped = escape(text)
+    return escaped.replace("`", "<font name='Courier'>").replace("</font><font name='Courier'>", "")
+def format_text(text: str) -> str:
+    parts = text.split("`")
+    if len(parts) == 1:
+        return escape(text)
+    result: list[str] = []
+    code = False
+    for part in parts:
+        if code:
+            result.append(f"<font name='Courier'>{escape(part)}</font>")
+        else:
+            result.append(escape(part))
+        code = not code
+    return "".join(result)
+def paragraph_for(kind: str, text: str, styles: dict[str, ParagraphStyle]) -> Paragraph:
+    style_name = "bullet" if kind == "bullet" else "body"
+    content = f"- {text}" if kind == "bullet" else text
+    return Paragraph(format_text(content), styles[style_name])
+def measure_section(block: SectionBlock, styles: dict[str, ParagraphStyle], width: float) -> tuple[float, list[Paragraph]]:
+    title = Paragraph(format_text(block.title), styles["section_title"])
+    rendered_items = [paragraph_for(kind, text, styles) for kind, text in block.items]
+    title_height = title.wrap(width - 20, 1000)[1]
+    items_height = 0.0
+    for para in rendered_items:
+        items_height += para.wrap(width - 20, 1000)[1]
+        items_height += 5
+    total = 14 + title_height + 8 + items_height + 10
+    return total, [title, *rendered_items]
+def choose_scale(left: list[SectionBlock], right: list[SectionBlock]) -> tuple[float, dict[str, ParagraphStyle], float]:
+    header_space = 114
+    footer_space = 18
+    available = PAGE_HEIGHT - MARGIN_TOP - MARGIN_BOTTOM - header_space - footer_space
+    for scale in (1.0, 0.97, 0.94, 0.91, 0.88, 0.85):
+        styles = make_styles(scale)
+        left_height = total_column_height(left, styles)
+        right_height = total_column_height(right, styles)
+        if max(left_height, right_height) <= available:
+            return scale, styles, available
+    raise RuntimeError("Content did not fit on a single page.")
+def total_column_height(blocks: list[SectionBlock], styles: dict[str, ParagraphStyle]) -> float:
+    total = 0.0
+    for index, block in enumerate(blocks):
+        section_height, _ = measure_section(block, styles, COLUMN_WIDTH)
+        total += section_height
+        if index < len(blocks) - 1:
+            total += 10
+    return total
+def draw_header(pdf: canvas.Canvas, styles: dict[str, ParagraphStyle]) -> float:
+    header_height = 94
+    header_y = PAGE_HEIGHT - MARGIN_TOP - header_height
+    pdf.setFillColor(NAVY)
+    pdf.roundRect(MARGIN_X, header_y, PAGE_WIDTH - (2 * MARGIN_X), header_height, 14, stroke=0, fill=1)
+    pdf.setFillColor(TEAL)
+    pdf.roundRect(PAGE_WIDTH - MARGIN_X - 110, header_y, 110, header_height, 14, stroke=0, fill=1)
+    eyebrow = Paragraph("ONE-PAGE APP SUMMARY", styles["eyebrow"])
+    title = Paragraph("Pluto", styles["title"])
+    subtitle = Paragraph(
+        "Repo-backed overview of the document extraction and question-answering dashboard.",
+        styles["subtitle"],
+    )
+    x = MARGIN_X + 18
+    y = PAGE_HEIGHT - MARGIN_TOP - 16
+    for para, width in ((eyebrow, 210), (title, 260), (subtitle, PAGE_WIDTH - (2 * MARGIN_X) - 150)):
+        _, height = para.wrap(width, 1000)
+        para.drawOn(pdf, x, y - height)
+        y -= height + 4
+    note = Paragraph("Evidence source: README + app server, pipeline, ingest, index, and UI files.", styles["subtitle"])
+    note_width = 92
+    _, note_height = note.wrap(note_width, 1000)
+    note.drawOn(pdf, PAGE_WIDTH - MARGIN_X - 98, header_y + header_height - 18 - note_height)
+    return header_y - 12
+def draw_column(
+    pdf: canvas.Canvas,
+    blocks: list[SectionBlock],
+    x: float,
+    top_y: float,
+    styles: dict[str, ParagraphStyle],
+) -> None:
+    y = top_y
+    for block in blocks:
+        section_height, items = measure_section(block, styles, COLUMN_WIDTH)
+        pdf.setFillColor(CARD_BG if block.title != "How to run" else ACCENT_BG)
+        pdf.setStrokeColor(CARD_BORDER)
+        pdf.roundRect(x, y - section_height, COLUMN_WIDTH, section_height, 12, stroke=1, fill=1)
+        cursor = y - 14
+        title = items[0]
+        _, title_height = title.wrap(COLUMN_WIDTH - 20, 1000)
+        title.drawOn(pdf, x + 10, cursor - title_height)
+        cursor -= title_height + 8
+        for para in items[1:]:
+            _, para_height = para.wrap(COLUMN_WIDTH - 20, 1000)
+            para.drawOn(pdf, x + 10, cursor - para_height)
+            cursor -= para_height + 5
+        y -= section_height + 10
+def build_pdf() -> None:
+    OUTPUT_DIR.mkdir(parents=True, exist_ok=True)
+    TMP_DIR.mkdir(parents=True, exist_ok=True)
+    left, right = build_blocks()
+    _, styles, _ = choose_scale(left, right)
+    pdf = canvas.Canvas(str(PDF_PATH), pagesize=A4)
+    pdf.setTitle("Pluto App Summary")
+    pdf.setAuthor("OpenAI Codex")
+    pdf.setSubject("One-page summary generated from repository evidence")
+    top_y = draw_header(pdf, styles)
+    draw_column(pdf, left, MARGIN_X, top_y, styles)
+    draw_column(pdf, right, MARGIN_X + COLUMN_WIDTH + GUTTER, top_y, styles)
+    footer = Paragraph(
+        "Not found in repo items are labeled explicitly. Output generated as a single-page PDF.",
+        styles["footer"],
+    )
+    _, footer_height = footer.wrap(PAGE_WIDTH - (2 * MARGIN_X), 1000)
+    footer.drawOn(pdf, MARGIN_X, MARGIN_BOTTOM - 4)
+    pdf.showPage()
+    pdf.save()
+def validate_outputs() -> None:
+    reader = PdfReader(str(PDF_PATH))
+    if len(reader.pages) != 1:
+        raise RuntimeError(f"Expected 1 page, found {len(reader.pages)}")
+    document = fitz.open(PDF_PATH)
+    page = document.load_page(0)
+    pix = page.get_pixmap(matrix=fitz.Matrix(2.0, 2.0), alpha=False)
+    pix.save(PNG_PATH)
+    document.close()
+def main() -> None:
+    build_pdf()
+    validate_outputs()
+    print(f"PDF_PATH={PDF_PATH}")
+    print(f"PNG_PATH={PNG_PATH}")
+if __name__ == "__main__":
+    main()
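
Review note on `choose_scale` in the script above: it tries a fixed list of font scales from largest to smallest and keeps the first one at which the taller column fits the available height, raising if none does. The sketch below isolates that loop without ReportLab. The `measure` callback and the linear height model in the usage lines are assumptions made for illustration; the real script measures wrapped `Paragraph` heights.

```python
from typing import Callable, Sequence, Tuple


def choose_scale(
    measure: Callable[[float], Tuple[float, float]],
    available: float,
    scales: Sequence[float] = (1.0, 0.97, 0.94, 0.91, 0.88, 0.85),
) -> float:
    """Return the first (largest) scale at which both columns fit."""
    for scale in scales:
        left_height, right_height = measure(scale)
        # The taller column decides whether the page overflows.
        if max(left_height, right_height) <= available:
            return scale
    raise RuntimeError("Content did not fit on a single page.")


# Hypothetical measurement: column heights shrink linearly with the scale.
def fake_measure(scale: float) -> Tuple[float, float]:
    return (700 * scale, 650 * scale)


print(choose_scale(fake_measure, available=640))
```

Because the scale list is ordered from largest to smallest, the first fit is also the most readable one, and the hard failure keeps the one-page guarantee explicit instead of silently overflowing.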

mp1/test_doc_summary.py ADDED

	@@ -0,0 +1,75 @@

+# -*- coding: utf-8 -*-
+from pathlib import Path
+from pluto.doc_summary import (
+    DocSummary,
+    apply_doc_summary_context,
+    generate_doc_summary,
+    save_doc_summaries,
+)
+def test_generate_doc_summary_returns_valid_summary_with_mocked_llm(monkeypatch, tmp_path):
+    corpus = tmp_path / "corpus"
+    corpus.mkdir()
+    (corpus / "paper.md").write_text("# Paper\n\nThis is about retrieval.", encoding="utf-8")
+    monkeypatch.setattr(
+        "pluto.doc_summary._call_summary_llm",
+        lambda **kwargs: """
+        {
+          "title": "Retrieval Paper",
+          "domain": "information retrieval",
+          "key_claims": ["Chunk context improves retrieval"],
+          "structure": ["intro", "methodology", "results"],
+          "open_questions": ["How robust is it?"]
+        }
+        """,
+    )
+    summary = generate_doc_summary("paper", corpus)
+    assert isinstance(summary, DocSummary)
+    assert summary.doc_id == "paper"
+    assert summary.title == "Retrieval Paper"
+    assert summary.domain == "information retrieval"
+    assert summary.key_claims == ["Chunk context improves retrieval"]
+def test_generate_doc_summary_falls_back_when_llm_fails(monkeypatch, tmp_path):
+    corpus = tmp_path / "corpus"
+    corpus.mkdir()
+    (corpus / "paper.md").write_text("# Paper\n\nBody.", encoding="utf-8")
+    def fail(**kwargs):
+        raise RuntimeError("model unavailable")
+    monkeypatch.setattr("pluto.doc_summary._call_summary_llm", fail)
+    summary = generate_doc_summary("paper", corpus)
+    assert summary.doc_id == "paper"
+    assert summary.title == "paper"
+    assert summary.key_claims == []
+    assert summary.open_questions == []
+def test_context_prefix_is_prepended_to_chunk_text(tmp_path):
+    corpus = tmp_path / "corpus"
+    corpus.mkdir()
+    summary = DocSummary(
+        doc_id="paper",
+        title="Retrieval Paper",
+        domain="AI",
+        key_claims=["Claim A", "Claim B"],
+        structure=[],
+        open_questions=[],
+        created_at="2026-01-01T00:00:00+00:00",
+    )
+    save_doc_summaries(corpus, {"paper": summary})
+    result = apply_doc_summary_context("Original chunk", "paper", corpus)
+    assert result.startswith("[Document context: Retrieval Paper | Domain: AI | Key claims: Claim A; Claim B]")
+    assert result.endswith("Original chunk")
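
Review note on the context prefix asserted in the last test above: the expected string suggests that `apply_doc_summary_context` joins the title, domain, and key claims into one bracketed header and places it before the chunk text. The body of `pluto/doc_summary.py` is not shown in this part of the diff, so the sketch below is a guess at the shape only. The function names and the blank-line separator are hypothetical.

```python
from typing import List


def build_context_prefix(title: str, domain: str, key_claims: List[str]) -> str:
    """Format the bracketed header the test expects (assumed layout)."""
    claims = "; ".join(key_claims)
    return f"[Document context: {title} | Domain: {domain} | Key claims: {claims}]"


def prepend_context(chunk_text: str, title: str, domain: str, key_claims: List[str]) -> str:
    """Put the header before the chunk; the blank-line separator is an assumption."""
    prefix = build_context_prefix(title, domain, key_claims)
    return f"{prefix}\n\n{chunk_text}"


result = prepend_context("Original chunk", "Retrieval Paper", "AI", ["Claim A", "Claim B"])
print(result)
```

Keeping the prefix builder separate from the prepend step makes it straightforward to skip the header when no summary exists for a document.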

mp1/test_merge.py CHANGED

@@ -9,7 +9,7 @@ from pluto.models import (
     Synthesis,
 )
 from pluto.stages import merge as merge_stage
-from pluto.stages.merge import _parse_merge, run_merge
 from pluto.tracer import Tracer
@@ -78,117 +78,3 @@ def test_merge_synthesizes_outline_when_model_returns_only_key_claims(monkeypatc
         for section in result.synthesis.answer_outline
         for point in section.points
     )
-def test_parse_merge_normalizes_scalar_doc_and_multi_chunk_evidence():
-    raw = """
-    {
-      "answer_outline": [
-        {
-          "section": "Overview",
-          "points": "The method uses evidence from multiple chunks."
-        }
-      ],
-      "key_claims": [
-        {
-          "claim": "The method is supported across several chunks.",
-          "support": "supported",
-          "evidence_doc_ids": "paper_a",
-          "evidence_chunk_ids": [["C18", "C46", "C81"]]
-        }
-      ],
-      "open_gaps": []
-    }
-    """
-    out = _parse_merge(raw)
-    assert out.synthesis.answer_outline[0].points == ["The method uses evidence from multiple chunks."]
-    refs = out.synthesis.key_claims[0].evidence_refs
-    assert len(refs) == 3
-    assert [ref.doc_id for ref in refs] == ["paper_a", "paper_a", "paper_a"]
-    assert [ref.chunk_id for ref in refs] == ["C18", "C46", "C81"]
-def test_merge_detailed_mode_produces_richer_answer_structure(monkeypatch):
-    raw_merge = """
-    {
-      "answer_outline": [
-        {
-          "section": "Overview",
-          "points": [
-            "The paper introduces a multi-agent defense coordinator.",
-            "The system reports strong defended-scenario performance."
-          ]
-        }
-      ],
-      "key_claims": [
-        {
-          "claim": "The paper introduces a multi-agent defense coordinator for prompt-injection mitigation.",
-          "support": "supported",
-          "evidence_doc_ids": ["multi_agent"],
-          "evidence_chunk_ids": ["C1"]
-        },
-        {
-          "claim": "The evaluation reports 0% ASR across defended scenarios.",
-          "support": "supported",
-          "evidence_doc_ids": ["multi_agent"],
-          "evidence_chunk_ids": ["C2"]
-        },
-        {
-          "claim": "The method routes adversarial prompts through a defense worker.",
-          "support": "supported",
-          "evidence_doc_ids": ["multi_agent"],
-          "evidence_chunk_ids": ["C3"]
-        },
-        {
-          "claim": "The architecture includes a recovery worker for post-attack repair.",
-          "support": "supported",
-          "evidence_doc_ids": ["multi_agent"],
-          "evidence_chunk_ids": ["C4"]
-        },
-        {
-          "claim": "The paper discusses limitations and future work for the coordinator pipeline.",
-          "support": "supported",
-          "evidence_doc_ids": ["multi_agent"],
-          "evidence_chunk_ids": ["C5"]
-        },
-        {
-          "claim": "The benchmark comparison highlights gains over baselines.",
-          "support": "supported",
-          "evidence_doc_ids": ["multi_agent"],
-          "evidence_chunk_ids": ["C6"]
-        }
-      ],
-      "open_gaps": []
-    }
-    """
-    monkeypatch.setattr(merge_stage, "dispatch", lambda *args, **kwargs: raw_merge)
-    extraction = ExtractOutput(
-        doc_id="multi_agent",
-        chunk_id="C1",
-        chunk_type=ChunkType.TEXT,
-        mode_used=ModeName.MODE_REASONING,
-        extracted=ExtractedContent(
-            claims=[
-                Claim(
-                    claim_id="cl1",
-                    text="The paper introduces a multi-agent defense coordinator for prompt-injection mitigation.",
-                    importance=Importance.HIGH,
-                    evidence=Evidence(doc_id="multi_agent", chunk_id="C1", where="overview", quote="multi-agent defense coordinator"),
-                )
-            ],
-            chunk_summary="Coordinator overview and results.",
-        ),
-    )
-    standard = run_merge("Summarize the paper.", [extraction], Tracer(), detail_level="standard")
-    detailed = run_merge("Summarize the paper.", [extraction], Tracer(), detail_level="detailed")
-    standard_points = sum(len(section.points) for section in standard.synthesis.answer_outline)
-    detailed_points = sum(len(section.points) for section in detailed.synthesis.answer_outline)
-    assert len(detailed.synthesis.answer_outline) >= len(standard.synthesis.answer_outline)
-    assert detailed_points > standard_points

     Synthesis,
 )
 from pluto.stages import merge as merge_stage
+from pluto.stages.merge import run_merge
 from pluto.tracer import Tracer
         for section in result.synthesis.answer_outline
         for point in section.points
     )

mp1/test_schema.py DELETED

@@ -1,41 +0,0 @@
-from pluto.models import Evidence, FinalEvidence, SectionPoint, Verification
-def test_schema_coerces_mixed_scalar_and_list_inputs():
-    evidence = Evidence(
-        doc_id=["paper_a"],
-        chunk_id=["C1", "C2"],
-        where={"text": "results"},
-        quote=["alpha", "beta"],
-    )
-    assert evidence.doc_id == "paper_a"
-    assert evidence.chunk_id == "C1, C2"
-    assert evidence.where == "results"
-    assert evidence.quote == "alpha, beta"
-    final_evidence = FinalEvidence(
-        doc_id="paper_a",
-        chunk_id=["C4", "C5"],
-        where=["method"],
-        supports=["Main claim"],
-        quote=["quoted", "support"],
-    )
-    assert final_evidence.chunk_id == "C4, C5"
-    assert final_evidence.where == "method"
-    assert final_evidence.supports == "Main claim"
-    assert final_evidence.quote == "quoted, support"
-def test_schema_coerces_outline_and_followup_lists():
-    section = SectionPoint(section=["Overview"], points="Single normalized point")
-    verification = Verification(
-        unsupported_claims="Missing metric support",
-        required_followups={"text": "Where is the metric reported?"},
-    )
-    assert section.section == "Overview"
-    assert section.points == ["Single normalized point"]
-    assert verification.unsupported_claims == ["Missing metric support"]
-    assert verification.required_followups == ["Where is the metric reported?"]

mp1/test_server.py CHANGED

@@ -87,6 +87,7 @@ def test_server_run_forwards_selected_docs_and_detail_level(monkeypatch):
     )
     assert response.status_code == 200
     assert recorded["progress_callback_registered"] is True
     assert recorded["query"] == "summarize this"
     assert recorded["selected_doc_ids"] == ["paper_a"]
@@ -161,8 +162,12 @@ def test_server_exposes_processed_docs_as_ready_even_if_status_is_stale(monkeypa
 def test_stream_progress_serializes_pydantic_payloads(monkeypatch):
-    monkeypatch.setattr(server, "_progress_queue", asyncio.Queue())
-    server._progress_queue.put_nowait({
         "stage": "done",
         "status": "complete",
         "payload": {
@@ -181,7 +186,7 @@ def test_stream_progress_serializes_pydantic_payloads(monkeypatch):
     })
     client = TestClient(server.app)
-    with client.stream("GET", "/api/stream") as response:
         body = b"".join(response.iter_raw()).decode("utf-8")
     assert response.status_code == 200
@@ -189,27 +194,46 @@ def test_stream_progress_serializes_pydantic_payloads(monkeypatch):
     payload = json.loads(body.removeprefix("data: ").strip())
     assert payload["payload"]["plan"][0]["doc_id"] == "paper"
     assert payload["payload"]["plan"][0]["chunk_type"] == "text"
-def test_server_cache_stats_route_returns_json(monkeypatch):
-    class FakeCache:
-        def stats(self):
-            return {"hits": 7, "misses": 3, "entries": 10}
-    monkeypatch.setattr(server, "_extraction_cache", FakeCache())
     client = TestClient(server.app)
-    response = client.get("/api/cache/stats")
-    assert response.status_code == 200
-    assert response.json() == {"hits": 7, "misses": 3, "entries": 10}
-def test_server_result_route_returns_404_when_empty(monkeypatch):
-    monkeypatch.setattr(server, "_latest_result", None)
-    client = TestClient(server.app)
-    response = client.get("/api/result")
-    assert response.status_code == 404
-    assert response.json()["error"] == "No result yet"

     )
     assert response.status_code == 200
+    assert response.json()["session_id"]
     assert recorded["progress_callback_registered"] is True
     assert recorded["query"] == "summarize this"
     assert recorded["selected_doc_ids"] == ["paper_a"]
 def test_stream_progress_serializes_pydantic_payloads(monkeypatch):
+    session_id = "test-session"
+    queue = asyncio.Queue()
+    monkeypatch.setattr(server, "session_queues", {session_id: queue})
+    monkeypatch.setattr(server, "session_results", {session_id: {"ok": True}})
+    monkeypatch.setattr(server, "session_cleanup_tasks", {})
+    queue.put_nowait({
         "stage": "done",
         "status": "complete",
         "payload": {
     })
     client = TestClient(server.app)
+    with client.stream("GET", f"/api/stream?session_id={session_id}") as response:
         body = b"".join(response.iter_raw()).decode("utf-8")
     assert response.status_code == 200
     payload = json.loads(body.removeprefix("data: ").strip())
     assert payload["payload"]["plan"][0]["doc_id"] == "paper"
     assert payload["payload"]["plan"][0]["chunk_type"] == "text"
+    assert session_id in server.session_queues
+    assert session_id in server.session_results
+def test_stream_progress_is_session_scoped(monkeypatch):
+    first = asyncio.Queue()
+    second = asyncio.Queue()
+    first.put_nowait({"stage": "done", "status": "complete", "session_id": "first"})
+    second.put_nowait({"stage": "done", "status": "complete", "session_id": "second"})
+    monkeypatch.setattr(server, "session_queues", {"first": first, "second": second})
+    monkeypatch.setattr(server, "session_results", {"first": {}, "second": {}})
+    monkeypatch.setattr(server, "session_cleanup_tasks", {})
     client = TestClient(server.app)
+    with client.stream("GET", "/api/stream?session_id=second") as response:
+        body = b"".join(response.iter_raw()).decode("utf-8")
+    payload = json.loads(body.removeprefix("data: ").strip())
+    assert payload["session_id"] == "second"
+    assert "first" in server.session_queues
+    assert "second" in server.session_queues
+def test_session_cleanup_is_delayed(monkeypatch):
+    async def run_check():
+        session_id = "cleanup-session"
+        queue = asyncio.Queue()
+        monkeypatch.setattr(server, "SESSION_CLEANUP_DELAY_SECONDS", 0.01)
+        monkeypatch.setattr(server, "session_queues", {session_id: queue})
+        monkeypatch.setattr(server, "session_results", {session_id: {"ok": True}})
+        monkeypatch.setattr(server, "session_cleanup_tasks", {})
+        server._schedule_session_cleanup(session_id, queue)
+        assert session_id in server.session_queues
+        assert session_id in server.session_results
+        await asyncio.sleep(0.05)
+        assert session_id not in server.session_queues
+        assert session_id not in server.session_results
+    asyncio.run(run_check())
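
The new `test_session_cleanup_is_delayed` test only shows the observable behavior: the session stays registered right after scheduling and is gone once the delay has elapsed. A minimal sketch of a helper that would satisfy it follows. The body is an assumption, not the code from `mp1/pluto/server.py`; only the names `session_queues`, `session_results`, `session_cleanup_tasks`, and `SESSION_CLEANUP_DELAY_SECONDS` come from the tests, and the public name `schedule_session_cleanup` is hypothetical.

```python
import asyncio

# Module-level state named after the attributes the tests monkeypatch.
SESSION_CLEANUP_DELAY_SECONDS = 0.01
session_queues: dict[str, asyncio.Queue] = {}
session_results: dict[str, dict] = {}
session_cleanup_tasks: dict[str, asyncio.Task] = {}


def schedule_session_cleanup(session_id: str, queue: asyncio.Queue) -> None:
    """Drop a session's queue and result after a grace period (hypothetical helper)."""

    async def _cleanup() -> None:
        await asyncio.sleep(SESSION_CLEANUP_DELAY_SECONDS)
        # Only remove state if the registered queue is still the one we were
        # asked to clean up, so a reused session_id is not clobbered.
        if session_queues.get(session_id) is queue:
            session_queues.pop(session_id, None)
            session_results.pop(session_id, None)
        session_cleanup_tasks.pop(session_id, None)

    # Must be called from inside a running event loop.
    loop = asyncio.get_running_loop()
    session_cleanup_tasks[session_id] = loop.create_task(_cleanup())
```

Keeping the task in `session_cleanup_tasks` holds a strong reference to it, which matters because the event loop itself only keeps weak references to tasks.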

mp1/test_session_memory.py ADDED

@@ -0,0 +1,112 @@

+# -*- coding: utf-8 -*-
+import asyncio
+import json
+from fastapi.testclient import TestClient
+from pluto.session_memory import (
+    CompressedSession,
+    compress_session,
+    list_session_context,
+)
+import pluto.server as server
+def test_compression_produces_valid_compressed_session(monkeypatch, tmp_path):
+    corpus = tmp_path / "corpus"
+    corpus.mkdir()
+    monkeypatch.setattr(
+        "pluto.session_memory._call_compression_llm",
+        lambda **kwargs: """
+        {
+          "queries_resolved": [{"query": "q", "answer_summary": "a", "chunks_used": 2, "confidence": 0.8}],
+          "key_findings": ["Finding A"],
+          "open_questions": ["Question A"],
+          "links_to_prior_sessions": []
+        }
+        """,
+    )
+    monkeypatch.setattr("pluto.session_memory._store_postgres", lambda compressed, raw_path: None)
+    compressed = compress_session("s1", "doc_a", {"query": "q", "confidence": 0.8}, corpus)
+    assert isinstance(compressed, CompressedSession)
+    assert compressed.session_id == "s1"
+    assert compressed.doc_id == "doc_a"
+    assert compressed.key_findings == ["Finding A"]
+    assert (corpus / ".session_archive" / "s1.json").exists()
+def test_postgres_unavailable_falls_back_to_local_file(monkeypatch, tmp_path):
+    corpus = tmp_path / "corpus"
+    corpus.mkdir()
+    monkeypatch.setattr("pluto.session_memory._call_compression_llm", lambda **kwargs: "{}")
+    def fail_store(compressed, raw_path):
+        raise EnvironmentError("no database")
+    monkeypatch.setattr("pluto.session_memory._store_postgres", fail_store)
+    compressed = compress_session("s2", "doc_b", {"query": "q"}, corpus)
+    path = corpus / ".session_memory" / "s2.json"
+    assert path.exists()
+    assert json.loads(path.read_text(encoding="utf-8"))["session_id"] == compressed.session_id
+def test_warm_start_endpoint_returns_sessions_in_order(monkeypatch):
+    sessions = [
+        {"session_id": "new", "doc_id": "paper", "timestamp": "2026-01-02T00:00:00+00:00"},
+        {"session_id": "old", "doc_id": "paper", "timestamp": "2026-01-01T00:00:00+00:00"},
+    ]
+    monkeypatch.setattr("pluto.session_memory.list_session_context", lambda doc_id, corpus_dir, limit=10: sessions)
+    client = TestClient(server.app)
+    response = client.get("/api/session-context/paper")
+    assert response.status_code == 200
+    payload = response.json()
+    assert [item["session_id"] for item in payload["sessions"]] == ["new", "old"]
+def test_list_session_context_local_fallback_orders_by_timestamp(monkeypatch, tmp_path):
+    corpus = tmp_path / "corpus"
+    memory = corpus / ".session_memory"
+    memory.mkdir(parents=True)
+    (memory / "old.json").write_text(
+        json.dumps({"session_id": "old", "doc_id": "paper", "timestamp": "2026-01-01T00:00:00+00:00"}),
+        encoding="utf-8",
+    )
+    (memory / "new.json").write_text(
+        json.dumps({"session_id": "new", "doc_id": "paper", "timestamp": "2026-01-02T00:00:00+00:00"}),
+        encoding="utf-8",
+    )
+    monkeypatch.setattr("pluto.session_memory._list_postgres", lambda doc_id, limit: (_ for _ in ()).throw(EnvironmentError("no db")))
+    sessions = list_session_context("paper", corpus)
+    assert [item["session_id"] for item in sessions] == ["new", "old"]
+def test_compression_is_scheduled_async_without_blocking_sse(monkeypatch):
+    calls = []
+    async def run_check():
+        session_id = "sse-session"
+        queue = asyncio.Queue()
+        await queue.put({"stage": "done", "status": "complete", "session_id": session_id})
+        monkeypatch.setattr(server, "session_queues", {session_id: queue})
+        monkeypatch.setattr(server, "session_results", {session_id: {"doc_id": "paper"}})
+        monkeypatch.setattr(server, "session_cleanup_tasks", {})
+        monkeypatch.setattr(server, "_schedule_session_compression", lambda sid: calls.append(sid))
+        client = TestClient(server.app)
+        with client.stream("GET", f"/api/stream?session_id={session_id}") as response:
+            body = b"".join(response.iter_raw()).decode("utf-8")
+        assert response.status_code == 200
+        assert '"stage": "done"' in body
+        assert calls == [session_id]
+    asyncio.run(run_check())
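
`test_list_session_context_local_fallback_orders_by_timestamp` implies a local fallback that reads `.session_memory/*.json`, filters by `doc_id`, and returns the newest session first. A minimal sketch of such a fallback follows. The function name `list_local_sessions` and its body are assumptions rather than the contents of `pluto/session_memory.py`, and sorting the timestamps as strings assumes they are ISO-8601 values with the same UTC offset, as in the test fixtures.

```python
import json
from pathlib import Path


def list_local_sessions(doc_id: str, corpus_dir: Path, limit: int = 10) -> list[dict]:
    """Return compressed sessions for doc_id from local files, newest first."""
    memory_dir = corpus_dir / ".session_memory"
    if not memory_dir.is_dir():
        return []

    sessions = []
    for path in memory_dir.glob("*.json"):
        try:
            record = json.loads(path.read_text(encoding="utf-8"))
        except (OSError, json.JSONDecodeError):
            continue  # skip unreadable or corrupt files instead of failing the request
        if isinstance(record, dict) and record.get("doc_id") == doc_id:
            sessions.append(record)

    # ISO-8601 strings with a shared offset sort chronologically as plain text.
    sessions.sort(key=lambda record: record.get("timestamp", ""), reverse=True)
    return sessions[:limit]
```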

mp1/test_signal_logger.py ADDED

@@ -0,0 +1,56 @@

+# -*- coding: utf-8 -*-
+from pluto.signal_logger import check_prior_reference, check_rephrase, log_signal
+def test_rephrase_detection_triggers_with_high_similarity_within_window(monkeypatch):
+    embeddings = {
+        "How does it work?": [1.0, 0.0],
+        "Explain how it works": [0.9, 0.1],
+    }
+    monkeypatch.setattr("pluto.signal_logger._embed_query", lambda query: embeddings[query])
+    assert check_rephrase("Explain how it works", "How does it work?", 30) is True
+def test_rephrase_detection_does_not_trigger_outside_window(monkeypatch):
+    monkeypatch.setattr("pluto.signal_logger._embed_query", lambda query: [1.0, 0.0])
+    assert check_rephrase("same", "same", 91) is False
+def test_prior_reference_detection_on_trigger_phrases():
+    assert check_prior_reference("Based on your answer, what is the limitation?") is True
+    assert check_prior_reference("Tell me the limitation") is False
+def test_signal_logging_with_mocked_postgres(monkeypatch):
+    calls = []
+    class FakeCursor:
+        def __enter__(self):
+            return self
+        def __exit__(self, exc_type, exc, tb):
+            return False
+        def execute(self, sql, params=None):
+            calls.append((sql, params))
+    class FakeConnection:
+        def cursor(self):
+            return FakeCursor()
+        def commit(self):
+            calls.append(("commit", None))
+        def close(self):
+            calls.append(("close", None))
+    monkeypatch.setattr("pluto.signal_logger._get_connection", lambda: FakeConnection())
+    log_signal("session-a", "hash-a", "prior_reference")
+    assert calls[0][1] == ("session-a", "hash-a", "prior_reference")
+    assert ("commit", None) in calls
+    assert ("close", None) in calls
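
The two rephrase tests imply a check that combines a time window with embedding similarity: similar queries 30 seconds apart count as a rephrase, while identical queries 91 seconds apart do not. A minimal sketch of that logic follows. The 90-second window is inferred from the 91-second test and the 0.85 threshold is a placeholder, since neither constant appears in this diff. The name `looks_like_rephrase` and the explicit `embed` parameter are hypothetical; the real `check_rephrase` takes three arguments and embeds through `_embed_query`.

```python
import math
from typing import Callable, Sequence

REPHRASE_WINDOW_SECONDS = 90           # assumption inferred from the 91-second test
REPHRASE_SIMILARITY_THRESHOLD = 0.85   # assumption; the real threshold is not shown


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def looks_like_rephrase(
    query: str,
    previous_query: str,
    seconds_since_previous: float,
    embed: Callable[[str], Sequence[float]],
) -> bool:
    """True when the new query closely matches the previous one within the window."""
    if seconds_since_previous > REPHRASE_WINDOW_SECONDS:
        return False  # too much time has passed, so skip the embedding call entirely
    similarity = cosine_similarity(embed(query), embed(previous_query))
    return similarity >= REPHRASE_SIMILARITY_THRESHOLD
```

Checking the window first avoids an embedding call for queries that could never qualify.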

mp1/test_verify.py CHANGED

@@ -12,12 +12,12 @@ from pluto.models import (
     Synthesis,
 )
 from pluto.bus import MessageBus
-from pluto.stages import verify as verify_stage
-from pluto.stages.verify import _parse_verify, run_verify
 from pluto.tracer import Tracer
-def test_parse_verify_dump():
     raw = """
     Here is the result:
     {
@@ -40,45 +40,20 @@ def test_parse_verify_dump():
     }
     """
-    out = _parse_verify(raw)
-    assert len(out.verification.checked_claims) == 2
-    assert out.verification.checked_claims[0].status.value == "supported"
-    assert out.verification.checked_claims[0].evidence[0].doc_id == "paper_a"
-    assert out.verification.unsupported_claims == ["The training set contains 2 million images."]
-    assert out.verification.required_followups == ["Upload the appendix for dataset details."]
-def test_parse_verify_handles_multi_chunk_evidence_ids():
-    raw = """
-    {
-      "checked_claims": [
-        {
-          "claim": "The results are supported across multiple chunks.",
-          "status": "supported",
-          "evidence_doc_id": "paper_a",
-          "evidence_chunk_id": ["C18", "C46", "C81"],
-          "quote": "results are supported"
-        }
-      ],
-      "unsupported_claims": [],
-      "required_followups": []
-    }
-    """
-    out = _parse_verify(raw)
-    evidence = out.verification.checked_claims[0].evidence
-    assert len(evidence) == 3
-    assert [item.doc_id for item in evidence] == ["paper_a", "paper_a", "paper_a"]
-    assert [item.chunk_id for item in evidence] == ["C18", "C46", "C81"]
-def test_verify_directly_supports_matching_claim_without_dispatch(monkeypatch):
     def fail_dispatch(*args, **kwargs):
         raise AssertionError("dispatch should not be called for an obvious direct evidence match")
-    monkeypatch.setattr(verify_stage, "dispatch", fail_dispatch)
     merge_output = MergeOutput(
         synthesis=Synthesis(
@@ -111,19 +86,19 @@ def test_verify_directly_supports_matching_claim_without_dispatch(monkeypatch):
         )
     ]
-    result = run_verify(merge_output, extractions, Tracer())
-    assert len(result.verification.checked_claims) == 1
-    assert result.verification.checked_claims[0].status == ClaimStatus.SUPPORTED
-    assert result.verification.checked_claims[0].evidence[0].doc_id == "paper_a"
-    assert result.verification.unsupported_claims == []
-def test_verify_suppresses_followups_for_single_unsupported_outlier(monkeypatch):
     def fail_dispatch(*args, **kwargs):
         raise AssertionError("dispatch should not be called for direct matches or suppressed followups")
-    monkeypatch.setattr(verify_stage, "dispatch", fail_dispatch)
     merge_output = MergeOutput(
         synthesis=Synthesis(
@@ -170,18 +145,18 @@ def test_verify_suppresses_followups_for_single_unsupported_outlier(monkeypatch)
     ]
     bus = MessageBus()
-    result = run_verify(merge_output, extractions, Tracer(), bus=bus)
-    assert result.verification.unsupported_claims == ["The appendix reports a 12% latency reduction on unseen workloads."]
-    assert result.verification.required_followups == []
     assert bus.read(msg_type="gap_report") == []
-def test_verify_generates_specific_followups_when_answer_is_unverified(monkeypatch):
     def fail_dispatch(*args, **kwargs):
         raise AssertionError("dispatch should not be called when no evidence candidates exist")
-    monkeypatch.setattr(verify_stage, "dispatch", fail_dispatch)
     merge_output = MergeOutput(
         synthesis=Synthesis(
@@ -193,16 +168,16 @@ def test_verify_generates_specific_followups_when_answer_is_unverified(monkeypat
     )
     bus = MessageBus()
-    result = run_verify(merge_output, [], Tracer(), bus=bus)
-    assert result.verification.unsupported_claims == [
         "The appendix reports a 12% latency reduction on unseen workloads.",
         "The architecture introduces a separate recovery worker for post-attack repair.",
     ]
-    assert result.verification.required_followups == [
         "Which result or metric in the document directly supports: The appendix reports a 12% latency reduction on unseen workloads?",
         "Where does the document explicitly describe: The architecture introduces a separate recovery worker for post-attack repair?",
     ]
     latest = bus.latest("gap_report")
     assert latest is not None
-    assert latest.payload["gaps"] == result.verification.required_followups

     Synthesis,
 )
 from pluto.bus import MessageBus
+from pluto.stages import evidence_check as evidence_check_stage
+from pluto.stages.evidence_check import _parse_evidence_check, run_evidence_check
 from pluto.tracer import Tracer
+def test_parse_evidence_check_dump():
     raw = """
     Here is the result:
     {
     }
     """
+    out = _parse_evidence_check(raw)
+    assert len(out.evidence_check.checked_claims) == 2
+    assert out.evidence_check.checked_claims[0].status.value == "supported"
+    assert out.evidence_check.checked_claims[0].evidence[0].doc_id == "paper_a"
+    assert out.evidence_check.unsupported_claims == ["The training set contains 2 million images."]
+    assert out.evidence_check.required_followups == ["Upload the appendix for dataset details."]
+def test_evidence_check_directly_supports_matching_claim_without_dispatch(monkeypatch):
     def fail_dispatch(*args, **kwargs):
         raise AssertionError("dispatch should not be called for an obvious direct evidence match")
+    monkeypatch.setattr(evidence_check_stage, "dispatch", fail_dispatch)
     merge_output = MergeOutput(
         synthesis=Synthesis(
         )
     ]
+    result = run_evidence_check(merge_output, extractions, Tracer())
+    assert len(result.evidence_check.checked_claims) == 1
+    assert result.evidence_check.checked_claims[0].status == ClaimStatus.SUPPORTED
+    assert result.evidence_check.checked_claims[0].evidence[0].doc_id == "paper_a"
+    assert result.evidence_check.unsupported_claims == []
+def test_evidence_check_suppresses_followups_for_single_unsupported_outlier(monkeypatch):
     def fail_dispatch(*args, **kwargs):
         raise AssertionError("dispatch should not be called for direct matches or suppressed followups")
+    monkeypatch.setattr(evidence_check_stage, "dispatch", fail_dispatch)
     merge_output = MergeOutput(
         synthesis=Synthesis(
     ]
     bus = MessageBus()
+    result = run_evidence_check(merge_output, extractions, Tracer(), bus=bus)
+    assert result.evidence_check.unsupported_claims == ["The appendix reports a 12% latency reduction on unseen workloads."]
+    assert result.evidence_check.required_followups == []
     assert bus.read(msg_type="gap_report") == []
+def test_evidence_check_generates_specific_followups_when_answer_is_unsupported(monkeypatch):
     def fail_dispatch(*args, **kwargs):
         raise AssertionError("dispatch should not be called when no evidence candidates exist")
+    monkeypatch.setattr(evidence_check_stage, "dispatch", fail_dispatch)
     merge_output = MergeOutput(
         synthesis=Synthesis(
     )
     bus = MessageBus()
+    result = run_evidence_check(merge_output, [], Tracer(), bus=bus)
+    assert result.evidence_check.unsupported_claims == [
         "The appendix reports a 12% latency reduction on unseen workloads.",
         "The architecture introduces a separate recovery worker for post-attack repair.",
     ]
+    assert result.evidence_check.required_followups == [
         "Which result or metric in the document directly supports: The appendix reports a 12% latency reduction on unseen workloads?",
         "Where does the document explicitly describe: The architecture introduces a separate recovery worker for post-attack repair?",
     ]
     latest = bus.latest("gap_report")
     assert latest is not None
+    assert latest.payload["gaps"] == result.evidence_check.required_followups

pytest.ini ADDED

@@ -0,0 +1,7 @@

+[pytest]
+testpaths = mp1
+python_files = test_*.py
+addopts = -p no:doctest -p no:cacheprovider
+norecursedirs = .* __pycache__ output mp1/output pytest-cache-files-* mp1/pytest-cache-files-*
+markers =
+    live_api: hits external provider APIs and requires network plus valid credentials

requirements.txt CHANGED

@@ -6,8 +6,9 @@ uvicorn>=0.27.0
 python-dotenv>=1.0.0
 pytest>=8.0.0
 python-multipart>=0.0.5
-PyPDF2>=3.0.0
 python-docx>=1.1.0
 requests>=2.31.0
 openai>=1.0.0
 numpy>=1.24.0

 python-dotenv>=1.0.0
 pytest>=8.0.0
 python-multipart>=0.0.5
+pdfplumber>=0.10.0
 python-docx>=1.1.0
 requests>=2.31.0
 openai>=1.0.0
 numpy>=1.24.0
+psycopg2-binary>=2.9.0