yakilee Claude Opus 4.6 committed
Commit e05c99c · 1 Parent(s): 6ba35c5

docs: add ARCHITECTURE directory with mermaid diagrams and process docs

Generated from codebase analysis: main README with system diagram,
module communities, and dependency graph, plus 5 process files covering
patient journey, search refinement loop, dual-model evaluation,
UI state management, and Parlant bridge.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

architecture/README.md ADDED
@@ -0,0 +1,168 @@
1
+ # TrialPath Architecture
2
+
3
+ An AI-powered clinical trial matching system for non-small cell lung cancer (NSCLC). Currently at the proof-of-concept (PoC) stage.
4
+
5
+ ## Stats
6
+
7
+ | Metric | Value |
8
+ |--------|-------|
9
+ | Python files | 63 |
10
+ | Lines of code | ~7,000 |
11
+ | Test functions | 259 |
12
+ | Data model types | 22 |
13
+ | Parlant tools | 7 |
14
+ | UI pages / components | 5 / 6 |
15
+
16
+ ## System Diagram
17
+
18
+ ```mermaid
19
+ graph TB
20
+ subgraph UI["Streamlit UI"]
21
+ upload[1_upload]
22
+ profile[2_profile_review]
23
+ matching[3_trial_matching]
24
+ gaps[4_gap_analysis]
25
+ summary[5_summary]
26
+ end
27
+
28
+ subgraph Frontend_Services["Frontend Services"]
29
+ state_mgr[StateManager]
30
+ parlant_bridge[ParlantBridge]
31
+ mock_data[MockData]
32
+ end
33
+
34
+ subgraph Agent["Parlant Agent"]
35
+ journey[Journey<br/>5 states]
36
+ tools[7 Tools]
37
+ guidelines[10 Guidelines]
38
+ end
39
+
40
+ subgraph Backend_Services["Backend Services"]
41
+ medgemma[MedGemma 4B<br/>HF Endpoint]
42
+ gemini[Gemini 3 Pro<br/>LLM Planner]
43
+ mcp[ClinicalTrials<br/>MCP Client]
44
+ end
45
+
46
+ subgraph Models["Data Contracts"]
47
+ patient[PatientProfile]
48
+ anchors[SearchAnchors]
49
+ trial[TrialCandidate]
50
+ ledger[EligibilityLedger]
51
+ searchlog[SearchLog]
52
+ end
53
+
54
+ subgraph External["External APIs"]
55
+ hf_api[HuggingFace API]
56
+ gemini_api[Google Gemini API]
57
+ ct_api[ClinicalTrials.gov v2]
58
+ end
59
+
60
+ UI --> Frontend_Services
61
+ Frontend_Services -->|async bridge| Agent
62
+ Agent --> Backend_Services
63
+ Backend_Services --> Models
64
+ medgemma --> hf_api
65
+ gemini --> gemini_api
66
+ mcp --> ct_api
67
+ ```
68
+
69
+ ## Module Communities
70
+
71
+ ### 1. Data Models (`trialpath/models/`)
72
+
73
+ Shared language for the entire system. 5 Pydantic v2 contracts, 22 exported types.
74
+
75
+ | Contract | Purpose |
76
+ |----------|---------|
77
+ | `PatientProfile` | MedGemma output: demographics, diagnosis, biomarkers, labs, treatments, unknowns + evidence spans |
78
+ | `SearchAnchors` | Gemini-generated query params with relaxation order |
79
+ | `TrialCandidate` | Normalized ClinicalTrials.gov results |
80
+ | `EligibilityLedger` | Per-trial criterion assessment with traffic-light status + gaps |
81
+ | `SearchLog` | Iterative query refinement tracking (max 5 rounds) |
82
+
83
+ ### 2. Backend Services (`trialpath/services/`)
84
+
85
+ 4 service integrations, all currently stubbed.
86
+
87
+ | Service | File | External Dependency |
88
+ |---------|------|-------------------|
89
+ | `MedGemmaExtractor` | `medgemma_extractor.py` | HuggingFace Inference Endpoint |
90
+ | `GeminiPlanner` | `gemini_planner.py` | Google Gemini API (`google-genai`) |
91
+ | `ClinicalTrialsMCPClient` | `mcp_client.py` | ClinicalTrials.gov REST API v2 |
92
+ | `ParlantClient` | `parlant_client.py` | Parlant Engine REST API |
93
+
94
+ ### 3. Parlant Agent (`trialpath/agent/`)
95
+
96
+ Orchestration layer using Parlant SDK. Defines the 5-state journey, 7 tools, and 10 guidelines.
97
+
98
+ - **Tools** are thin async wrappers around backend services (lazy singleton pattern)
99
+ - **Journey** defines state machine with conditional transitions and loops
100
+ - **Guidelines** provide phase-specific and global behavioral rules
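
The "thin async wrapper plus lazy singleton" idea from the Tools bullet can be illustrated with a minimal sketch. Everything here is hypothetical: `FakeExtractor`, `get_extractor`, and the return shape are stand-ins chosen for illustration, not the actual code in `trialpath/agent/tools.py`.

```python
import asyncio
from functools import lru_cache


class FakeExtractor:
    """Hypothetical stand-in for a backend service such as MedGemmaExtractor."""

    async def extract(self, document_urls):
        # A real service would call a remote endpoint here.
        return {"documents": len(document_urls)}


@lru_cache(maxsize=1)
def get_extractor() -> FakeExtractor:
    # Built on first use, then the same instance is reused for later calls.
    return FakeExtractor()


async def extract_patient_profile(document_urls):
    # Thin async wrapper: no logic beyond delegating to the shared service.
    return await get_extractor().extract(document_urls)


profile = asyncio.run(extract_patient_profile(["a.pdf", "b.pdf"]))
```

Deferring construction this way means importing the tools module does not require credentials or network access, which may be convenient while the services are still stubbed.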
101
+
102
+ ### 4. Streamlit Frontend (`app/`)
103
+
104
+ 5-page journey mirroring Parlant states. Currently running on mock data.
105
+
106
+ | Page | State | Prerequisite |
107
+ |------|-------|-------------|
108
+ | Upload | INGEST | none |
109
+ | Profile Review | PRESCREEN | `patient_profile` |
110
+ | Trial Matching | VALIDATE_TRIALS | `trial_candidates` |
111
+ | Gap Analysis | GAP_FOLLOWUP | `eligibility_ledger` |
112
+ | Summary | SUMMARY | `eligibility_ledger` |
113
+
114
+ ### 5. Integration Tests (`tests/`)
115
+
116
+ Cross-module testing: 18 integration + 14 service integration + 7 e2e tests.
117
+
118
+ ## Cross-Community Dependencies
119
+
120
+ ```mermaid
121
+ graph LR
122
+ Models["Data Models"] --> Services["Backend Services"]
123
+ Models --> Agent["Parlant Agent"]
124
+ Models --> Frontend["Streamlit Frontend"]
125
+ Models --> Tests["Integration Tests"]
126
+ Services --> Agent
127
+ Agent -->|Parlant REST| Frontend
128
+ Config["config.py"] --> Services
129
+ Config --> Agent
130
+ MockData["mock_data.py"] --> Frontend
131
+ MockData --> Tests
132
+ ```
133
+
134
+ ## Key Processes
135
+
136
+ | Process | File |
137
+ |---------|------|
138
+ | [Patient Journey (5-state flow)](patient-journey.md) | `trialpath/agent/journey.py` |
139
+ | [Search Refinement Loop](search-refinement-loop.md) | `trialpath/agent/tools.py` |
140
+ | [Dual-Model Eligibility Evaluation](dual-model-evaluation.md) | `trialpath/agent/tools.py` |
141
+ | [UI State Management](ui-state-management.md) | `app/services/state_manager.py` |
142
+ | [Parlant Bridge (sync/async)](parlant-bridge.md) | `app/services/parlant_bridge.py` |
143
+
144
+ ## Configuration
145
+
146
+ All settings are read from environment variables (`trialpath/config.py`):
147
+
148
+ | Variable | Default | Used By |
149
+ |----------|---------|---------|
150
+ | `MEDGEMMA_ENDPOINT_URL` | HF cloud URL | MedGemmaExtractor |
151
+ | `HF_TOKEN` | `""` | MedGemmaExtractor |
152
+ | `GEMINI_API_KEY` | `""` | GeminiPlanner |
153
+ | `GEMINI_MODEL` | `gemini-3-pro` | GeminiPlanner |
154
+ | `MCP_URL` | `localhost:3000` | ClinicalTrialsMCPClient |
155
+ | `PARLANT_URL` | `localhost:8800` | ParlantClient |
156
+ | `SESSION_COST_BUDGET` | `$0.50` | Cost guardrail |
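
A minimal sketch of how such a table could be loaded, assuming a plain dataclass rather than whatever `trialpath/config.py` actually uses. The `Settings` class and `load_settings` function are hypothetical names; the MedGemma default is left empty because the table only describes it as "HF cloud URL", and the budget is parsed as a bare number (the `$` in the table is treated as display formatting, which is an assumption).

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    medgemma_endpoint_url: str
    hf_token: str
    gemini_api_key: str
    gemini_model: str
    mcp_url: str
    parlant_url: str
    session_cost_budget: float


def load_settings(env=None) -> Settings:
    # Accept an explicit mapping so the loader can be exercised without
    # touching the real process environment.
    env = os.environ if env is None else env
    return Settings(
        medgemma_endpoint_url=env.get("MEDGEMMA_ENDPOINT_URL", ""),  # real default not shown in the table
        hf_token=env.get("HF_TOKEN", ""),
        gemini_api_key=env.get("GEMINI_API_KEY", ""),
        gemini_model=env.get("GEMINI_MODEL", "gemini-3-pro"),
        mcp_url=env.get("MCP_URL", "localhost:3000"),
        parlant_url=env.get("PARLANT_URL", "localhost:8800"),
        session_cost_budget=float(env.get("SESSION_COST_BUDGET", "0.50")),
    )


settings = load_settings({"PARLANT_URL": "localhost:9900"})
```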
157
+
158
+ ## Implementation Status
159
+
160
+ | Component | Status |
161
+ |-----------|--------|
162
+ | Data Models (22 types) | **Complete** |
163
+ | MedGemma Extractor | Prompts ready, HF integration **pending** |
164
+ | Gemini Planner | Prompts stubbed, API integration **pending** |
165
+ | ClinicalTrials MCP | Wrapper done, needs running MCP server |
166
+ | Parlant Agent | Journey/tools/guidelines defined, live integration **pending** |
167
+ | Streamlit UI | **Complete** with mock data |
168
+ | Tests (259 total) | **Complete** |
architecture/dual-model-evaluation.md ADDED
@@ -0,0 +1,71 @@
1
+ # Dual-Model Eligibility Evaluation
2
+
3
+ **Entry point:** `trialpath/agent/tools.py` > `evaluate_trial_eligibility()`
4
+
5
+ Two-model approach for criterion-level eligibility assessment: MedGemma handles medical criteria, Gemini handles structural criteria.
6
+
7
+ ## Flow
8
+
9
+ ```mermaid
10
+ flowchart TD
11
+ A[PatientProfile + TrialCandidate] --> B[GeminiPlanner.slice_criteria]
12
+ B --> C[Atomic criteria list]
13
+ C --> D{For each criterion}
14
+ D --> E{Category?}
15
+ E -->|medical| F[MedGemmaExtractor.evaluate_medical_criterion]
16
+ E -->|structural| G[GeminiPlanner.evaluate_structural_criterion]
17
+ F --> H[CriterionAssessment]
18
+ G --> H
19
+ H --> I[GeminiPlanner.aggregate_assessments]
20
+ I --> J[EligibilityLedger]
21
+ ```
22
+
23
+ ## Steps
24
+
25
+ ### 1. Slice Criteria
26
+
27
+ `GeminiPlanner.slice_criteria(trial)` breaks trial eligibility text into atomic, evaluable items. Each criterion is tagged with a `category`:
28
+
29
+ - **`medical`**: Biomarker presence, lab values, staging, ECOG status, treatment history
30
+ - **`structural`**: Age range, geography, consent, insurance, enrollment status
31
+
32
+ ### 2. Per-Criterion Evaluation
33
+
34
+ Each criterion is routed to the appropriate model:
35
+
36
+ | Category | Model | Why |
37
+ |----------|-------|-----|
38
+ | `medical` | MedGemma 4B | Specialized for clinical/biomedical reasoning, temporal lab interpretation |
39
+ | `structural` | Gemini 3 Pro | General reasoning for demographic/administrative checks |
40
+
41
+ Each evaluation returns:
42
+ - `decision`: `met` / `not_met` / `unknown`
43
+ - `confidence`: 0.0 - 1.0
44
+ - `patient_evidence`: pointer to source document
45
+ - `trial_evidence`: pointer to criterion text
46
+ - `reasoning`: explanation
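
The routing and result shape described above might look roughly like the following sketch. It uses a standard-library dataclass in place of the project's Pydantic model, and the evaluators are stubs; `route_criterion`, `_stub`, the `criterion` field, and the dictionary layout of a sliced criterion are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class CriterionAssessment:
    criterion: str
    decision: str          # "met" | "not_met" | "unknown"
    confidence: float      # 0.0 - 1.0
    patient_evidence: str  # pointer to source document
    trial_evidence: str    # pointer to criterion text
    reasoning: str


Evaluator = Callable[[str], CriterionAssessment]


def route_criterion(criterion: dict, evaluators: Dict[str, Evaluator]) -> CriterionAssessment:
    # Dispatch on the category tag assigned during slicing.
    category = criterion["category"]
    if category not in evaluators:
        raise ValueError(f"unknown criterion category: {category!r}")
    return evaluators[category](criterion["text"])


def _stub(model_name: str) -> Evaluator:
    # Stand-in for a real model call; always answers "unknown".
    def evaluate(text: str) -> CriterionAssessment:
        return CriterionAssessment(
            criterion=text,
            decision="unknown",
            confidence=0.0,
            patient_evidence="",
            trial_evidence=text,
            reasoning=f"stubbed {model_name} evaluation",
        )
    return evaluate


EVALUATORS = {"medical": _stub("MedGemma"), "structural": _stub("Gemini")}
result = route_criterion({"text": "ECOG 0-1", "category": "medical"}, EVALUATORS)
```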
47
+
48
+ ### 3. Aggregate into Ledger
49
+
50
+ `GeminiPlanner.aggregate_assessments()` combines all criterion results into an `EligibilityLedger`:
51
+
52
+ - **`overall_assessment`**: `eligible` / `ineligible` / `needs_review`
53
+ - **`criteria_assessments[]`**: Full list of `CriterionAssessment` objects
54
+ - **`gaps[]`**: `GapItem` objects for `unknown`/`not_met` criteria with recommended actions
55
+ - **Traffic-light status**: Visual summary (green/yellow/red per criterion)
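
The document does not spell out the aggregation rule, so the sketch below assumes one plausible policy: any `not_met` makes the trial `ineligible`, otherwise any `unknown` yields `needs_review`, otherwise `eligible`. The color mapping (`met` to green, `unknown` to yellow, `not_met` to red) and the function name `aggregate` are likewise assumptions; in the real system this step is delegated to Gemini.

```python
from typing import Dict, List

# Assumed mapping from decision to traffic-light color.
TRAFFIC_LIGHT = {"met": "green", "unknown": "yellow", "not_met": "red"}


def aggregate(assessments: List[Dict[str, str]]) -> Dict[str, object]:
    decisions = [a["decision"] for a in assessments]
    if "not_met" in decisions:
        overall = "ineligible"
    elif "unknown" in decisions:
        overall = "needs_review"
    else:
        overall = "eligible"
    return {
        "overall_assessment": overall,
        "criteria_assessments": assessments,
        # Every criterion that is not clearly met becomes a gap to follow up.
        "gaps": [a["criterion"] for a in assessments if a["decision"] != "met"],
        "traffic_lights": [TRAFFIC_LIGHT[a["decision"]] for a in assessments],
    }


ledger = aggregate([
    {"criterion": "Stage IV NSCLC", "decision": "met"},
    {"criterion": "ANC >= 1.5 within 14 days", "decision": "unknown"},
])
```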
56
+
57
+ ## Key Data Contracts
58
+
59
+ - **`EligibilityLedger`**: Per-trial overall + criterion-level assessment
60
+ - **`CriterionAssessment`**: Single criterion verdict with evidence pointers
61
+ - **`GapItem`**: Actionable next step for a gap (what to provide, why it matters)
62
+ - **`TemporalCheck`**: For lab values with date requirements (e.g., "ANC >= 1.5 within 14 days")
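
To make the `TemporalCheck` idea concrete, here is a small sketch of a date-windowed lab check such as "ANC >= 1.5 within 14 days". The function name, its parameters, and the choice to return `unknown` for missing or stale results are assumptions; the actual contract fields are not listed in this document.

```python
from datetime import date, timedelta
from typing import Optional


def lab_within_window(
    value: Optional[float],
    threshold: float,
    measured_on: Optional[date],
    reference_date: date,
    max_age_days: int,
) -> str:
    # Missing data cannot be judged either way.
    if value is None or measured_on is None:
        return "unknown"
    age = reference_date - measured_on
    # A result from the future or older than the window is treated as stale.
    if age < timedelta(0) or age > timedelta(days=max_age_days):
        return "unknown"
    return "met" if value >= threshold else "not_met"


today = date(2025, 1, 15)
fresh = lab_within_window(1.8, 1.5, date(2025, 1, 10), today, 14)
stale = lab_within_window(1.8, 1.5, date(2024, 12, 1), today, 14)
```

Treating a stale value as `unknown` rather than `not_met` keeps it in the gap list, where the patient can be asked for a newer lab report.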
63
+
64
+ ## Key Files
65
+
66
+ | File | Role |
67
+ |------|------|
68
+ | `trialpath/agent/tools.py:183-226` | `evaluate_trial_eligibility` tool |
69
+ | `trialpath/services/gemini_planner.py` | slice/evaluate/aggregate logic |
70
+ | `trialpath/services/medgemma_extractor.py` | medical criterion evaluation |
71
+ | `trialpath/models/eligibility_ledger.py` | EligibilityLedger + CriterionAssessment |
architecture/parlant-bridge.md ADDED
@@ -0,0 +1,79 @@
1
+ # Parlant Bridge (Sync/Async)
2
+
3
+ **Entry point:** `app/services/parlant_bridge.py`
4
+
5
+ Bridges synchronous Streamlit with the async Parlant agent engine via a dedicated thread pool.
6
+
7
+ ## Architecture
8
+
9
+ ```mermaid
10
+ sequenceDiagram
11
+ participant UI as Streamlit UI (sync)
12
+ participant Bridge as ParlantBridge
13
+ participant Pool as ThreadPoolExecutor
14
+ participant Client as ParlantClient (async)
15
+ participant Engine as Parlant Engine
16
+
17
+ UI->>Bridge: start_session()
18
+ Bridge->>Pool: _run_async()
19
+ Pool->>Client: create_session()
20
+ Client->>Engine: POST /sessions
21
+ Engine-->>Client: session_id
22
+ Client-->>Pool: session_id
23
+ Pool-->>Bridge: session_id
24
+ Bridge-->>UI: session_id
25
+
26
+ UI->>Bridge: send_and_poll(message)
27
+ Bridge->>Pool: _run_async()
28
+ Pool->>Client: send_message()
29
+ Client->>Engine: POST /sessions/{id}/messages
30
+ Pool->>Client: poll_events()
31
+ Client->>Engine: GET /sessions/{id}/events
32
+ Engine-->>Client: tool_events[]
33
+ Client-->>Pool: events
34
+ Pool-->>Bridge: events
35
+ Bridge->>Bridge: sync_journey_state()
36
+ Bridge-->>UI: updated state
37
+ ```
38
+
39
+ ## Sync/Async Bridge
40
+
41
+ Streamlit runs synchronously, while the Parlant client is fully async (`httpx.AsyncClient`). The bridge uses `concurrent.futures.ThreadPoolExecutor` to run async code from a sync context:
42
+
43
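+ ```python
+ import asyncio
+ from concurrent.futures import ThreadPoolExecutor
+
+ # Simplified pattern: run the coroutine on a fresh event loop in a worker thread
+ def _run_async(coro):
+     with ThreadPoolExecutor(max_workers=1) as pool:
+         return pool.submit(asyncio.run, coro).result()
+ ```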
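
A short usage sketch of the pattern above, with a fake coroutine standing in for a real client call. `run_async` mirrors the simplified helper, and `fake_create_session` is hypothetical. Running `asyncio.run` in a worker thread matters because `asyncio.run` raises `RuntimeError` when an event loop is already running in the calling thread.

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor


def run_async(coro):
    # The worker thread has no running loop, so asyncio.run is safe there.
    with ThreadPoolExecutor(max_workers=1) as pool:
        return pool.submit(asyncio.run, coro).result()


async def fake_create_session() -> str:
    # Stand-in for an awaitable HTTP call to the engine.
    await asyncio.sleep(0)
    return "session-123"


# Called from ordinary synchronous code, as a Streamlit page would.
session_id = run_async(fake_create_session())
```

The call blocks the caller until the coroutine finishes, which suits Streamlit's rerun-per-interaction model but would stall an event loop if invoked from async code.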
49
+
50
+ ## Event-to-State Mapping
51
+
52
+ `sync_journey_state()` parses Parlant tool events and updates `st.session_state`:
53
+
54
+ | Tool Event | Session State Key |
55
+ |------------|------------------|
56
+ | `extract_patient_profile` | `patient_profile_data` |
57
+ | `search_clinical_trials` | `trial_candidates_data` |
58
+ | `evaluate_trial_eligibility` | `eligibility_ledger_data` |
59
+ | `analyze_gaps` | `gap_analysis_data` |
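
One way to picture this mapping is a lookup table applied to each polled event. The event shape used here (`{"tool": ..., "result": ...}`) is an assumption, since the real Parlant event schema is not shown in this document, and a plain dict stands in for `st.session_state`.

```python
EVENT_TO_STATE_KEY = {
    "extract_patient_profile": "patient_profile_data",
    "search_clinical_trials": "trial_candidates_data",
    "evaluate_trial_eligibility": "eligibility_ledger_data",
    "analyze_gaps": "gap_analysis_data",
}


def sync_journey_state(events, session_state):
    """Copy tool results into session state; return the keys that changed."""
    updated = []
    for event in events:
        key = EVENT_TO_STATE_KEY.get(event.get("tool"))
        if key is None:
            continue  # events from unmapped tools are ignored
        session_state[key] = event.get("result")
        updated.append(key)
    return updated


state = {}
changed = sync_journey_state(
    [
        {"tool": "extract_patient_profile", "result": {"age": 61}},
        {"tool": "generate_search_anchors", "result": {}},
    ],
    state,
)
```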
60
+
61
+ ## Two Parlant Clients
62
+
63
+ The codebase has two separate Parlant clients:
64
+
65
+ | Client | Location | Purpose |
66
+ |--------|----------|---------|
67
+ | Backend | `trialpath/services/parlant_client.py` | Async REST wrapper for engine admin |
68
+ | Frontend | `app/services/parlant_client.py` | Session/event management for UI |
69
+
70
+ Both target the same Parlant engine at `PARLANT_URL` (default `localhost:8800`).
71
+
72
+ ## Key Files
73
+
74
+ | File | Role |
75
+ |------|------|
76
+ | `app/services/parlant_bridge.py` | Sync/async bridge + state sync |
77
+ | `app/services/parlant_client.py` | Frontend async REST client |
78
+ | `trialpath/services/parlant_client.py` | Backend async REST client |
79
+ | `trialpath/config.py` | `PARLANT_URL` configuration |
architecture/patient-journey.md ADDED
@@ -0,0 +1,102 @@
1
+ # Patient Journey (5-State Flow)
2
+
3
+ **Entry point:** `trialpath/agent/journey.py` > `create_clinical_trial_journey()`
4
+
5
+ The core orchestration process. A Parlant `Journey` with 5 states, conditional transitions, a self-loop on PRESCREEN, and one backward loop (GAP_FOLLOWUP to INGEST).
6
+
7
+ ## State Machine
8
+
9
+ ```mermaid
10
+ stateDiagram-v2
11
+ [*] --> INGEST
12
+ INGEST --> PRESCREEN : profile has minimum prescreen data
13
+ PRESCREEN --> VALIDATE_TRIALS : 1-50 results found
14
+ PRESCREEN --> PRESCREEN : refine (>50) or relax (0)
15
+ VALIDATE_TRIALS --> GAP_FOLLOWUP : all trials evaluated
16
+ GAP_FOLLOWUP --> SUMMARY : patient ready for summary
17
+ GAP_FOLLOWUP --> INGEST : patient uploads new documents
18
+ SUMMARY --> [*] : patient reviewed summary
19
+ ```
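
The diagram above can also be read as a plain transition table. The sketch below encodes only the edges shown in the diagram and ignores the transition conditions; `TRANSITIONS` and `is_valid_path` are illustrative names and not part of the Parlant SDK.

```python
TRANSITIONS = {
    "INGEST": {"PRESCREEN"},
    "PRESCREEN": {"PRESCREEN", "VALIDATE_TRIALS"},  # self-loop for refine/relax
    "VALIDATE_TRIALS": {"GAP_FOLLOWUP"},
    "GAP_FOLLOWUP": {"SUMMARY", "INGEST"},  # INGEST is the backward loop
    "SUMMARY": set(),  # terminal: the journey ends here
}


def is_valid_path(path):
    # Every consecutive pair of states must be an edge in the table.
    return all(nxt in TRANSITIONS[cur] for cur, nxt in zip(path, path[1:]))


happy_path = ["INGEST", "PRESCREEN", "VALIDATE_TRIALS", "GAP_FOLLOWUP", "SUMMARY"]
with_reupload = ["INGEST", "PRESCREEN", "PRESCREEN", "VALIDATE_TRIALS", "GAP_FOLLOWUP", "INGEST"]
```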
20
+
21
+ ## States
22
+
23
+ ### 1. INGEST (`ForkJourneyState`)
24
+
25
+ - **Action:** Extract patient profile from uploaded medical documents
26
+ - **Tools:** `extract_patient_profile`
27
+ - **Input:** PDF/image document URLs + patient metadata (age, sex)
28
+ - **Output:** `PatientProfile` (demographics, diagnosis, biomarkers, labs, treatments, unknowns)
29
+ - **Transition:** Advances to PRESCREEN when profile has minimum data
30
+
31
+ ### 2. PRESCREEN (`ForkJourneyState`)
32
+
33
+ - **Action:** Generate search anchors, query ClinicalTrials.gov, refine/relax iteratively
34
+ - **Tools:** `generate_search_anchors`, `search_clinical_trials`, `refine_search_query`, `relax_search_query`
35
+ - **Input:** `PatientProfile`
36
+ - **Output:** `SearchAnchors` + `TrialCandidate[]`
37
+ - **Loop:** Max 5 refinement rounds. Refine if >50 results, relax if 0 results.
38
+ - **Transition:** Advances to VALIDATE_TRIALS when 1-50 results found
39
+
40
+ ### 3. VALIDATE_TRIALS (`ToolJourneyState`)
41
+
42
+ - **Action:** Dual-model eligibility evaluation per trial
43
+ - **Tools:** `evaluate_trial_eligibility`
44
+ - **Input:** `PatientProfile` + `TrialCandidate`
45
+ - **Output:** `EligibilityLedger[]` (criterion-level verdicts + gaps)
46
+ - **Transition:** Advances to GAP_FOLLOWUP when all candidates evaluated
47
+
48
+ ### 4. GAP_FOLLOWUP (`ForkJourneyState`)
49
+
50
+ - **Action:** Analyze gaps, present actionable next steps
51
+ - **Tools:** `analyze_gaps`
52
+ - **Input:** `PatientProfile` + `EligibilityLedger[]`
53
+ - **Output:** `GapItem[]` (recommended actions)
54
+ - **Fork:** Patient can upload new documents (loop to INGEST) or proceed to SUMMARY
55
+
56
+ ### 5. SUMMARY (`ChatJourneyState`)
57
+
58
+ - **Action:** Present final summary, generate doctor packet
59
+ - **Tools:** none (chat-only)
60
+ - **Output:** Doctor Packet (JSON/Markdown export)
61
+ - **Transition:** END_JOURNEY when patient has reviewed
62
+
63
+ ## Data Flow
64
+
65
+ ```
66
+ Patient Document (PDF/image)
67
+ |
68
+ v
69
+ [INGEST] MedGemmaExtractor.extract()
70
+ --> PatientProfile
71
+ |
72
+ v
73
+ [PRESCREEN] GeminiPlanner.generate_search_anchors()
74
+ + ClinicalTrialsMCPClient.search()
75
+ + iterative refine/relax (max 5 rounds)
76
+ --> SearchAnchors --> TrialCandidate[]
77
+ |
78
+ v
79
+ [VALIDATE_TRIALS] GeminiPlanner.slice_criteria()
80
+ + dual-model evaluation
81
+ + GeminiPlanner.aggregate_assessments()
82
+ --> EligibilityLedger[]
83
+ |
84
+ v
85
+ [GAP_FOLLOWUP] GeminiPlanner.analyze_gaps()
86
+ --> GapItem[]
87
+ --> (optional: loop back to INGEST)
88
+ |
89
+ v
90
+ [SUMMARY] Final report generation
91
+ --> Doctor Packet
92
+ ```
93
+
94
+ ## Key Files
95
+
96
+ | File | Role |
97
+ |------|------|
98
+ | `trialpath/agent/journey.py` | State machine definition |
99
+ | `trialpath/agent/tools.py` | Tool implementations |
100
+ | `trialpath/agent/guidelines.py` | Phase-specific behavioral rules |
101
+ | `trialpath/agent/orchestrator.py` | Parlant PluginServer setup |
102
+ | `trialpath/agent/setup.py` | Agent + NLP services init |
architecture/search-refinement-loop.md ADDED
@@ -0,0 +1,56 @@
1
+ # Search Refinement Loop
2
+
3
+ **Entry point:** `trialpath/agent/tools.py` > PRESCREEN state tools
4
+
5
+ Iterative query refinement process that adjusts ClinicalTrials.gov queries until a manageable result set (1-50 trials) is found.
6
+
7
+ ## Flow
8
+
9
+ ```mermaid
10
+ flowchart TD
11
+ A[PatientProfile] --> B[generate_search_anchors]
12
+ B --> C[SearchAnchors v1]
13
+ C --> D[search_clinical_trials]
14
+ D --> E{Result count?}
15
+ E -->|>50| F[refine_search_query]
16
+ E -->|0| G[relax_search_query]
17
+ E -->|1-50| H[Proceed to VALIDATE_TRIALS]
18
+ F --> I{Round < 5?}
19
+ G --> I
20
+ I -->|Yes| D
21
+ I -->|No| H
22
+ ```
23
+
24
+ ## How It Works
25
+
26
+ 1. **Generate anchors:** Gemini converts `PatientProfile` into `SearchAnchors` (condition, biomarkers, stage, geography, phase filters, relaxation order)
27
+ 2. **Search:** MCP client queries ClinicalTrials.gov REST API v2
28
+ 3. **Evaluate count:**
29
+ - **>50 results:** Call `refine_search_query` -- Gemini tightens filters (add biomarker, narrow geography, specific phase)
30
+ - **0 results:** Call `relax_search_query` -- Gemini loosens filters following the relaxation order in SearchAnchors
31
+ - **1-50 results:** Proceed to trial validation
32
+ 4. **Loop guard:** Maximum 5 refinement rounds (tracked in `SearchLog`)
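
The steps above can be sketched as a bounded loop. This is one interpretation of the flowchart, in which up to five adjustments are made and the last result set is returned when the budget runs out. `run_search_loop` and its callable parameters are hypothetical; in the real system the refine and relax steps are Gemini calls and the search goes through the MCP client.

```python
def run_search_loop(anchors, search, refine, relax, max_rounds=5, max_results=50):
    log = []  # (round, result count, action), loosely mirroring SearchLog
    results = search(anchors)
    for round_no in range(1, max_rounds + 1):
        count = len(results)
        if 1 <= count <= max_results:
            log.append((round_no, count, "accept"))
            return results, log
        action = "refine" if count > max_results else "relax"
        log.append((round_no, count, action))
        anchors = refine(anchors) if action == "refine" else relax(anchors)
        results = search(anchors)
    # Round budget exhausted: proceed with whatever the last search returned.
    return results, log


# Toy model: anchors is a filter count, and each extra filter cuts results by 4x.
results, log = run_search_loop(
    0,
    search=lambda n: list(range(200 // (4 ** n))),
    refine=lambda n: n + 1,
    relax=lambda n: n - 1,
)
```

Note that, as drawn, the flow can still hand an empty or oversized result set to VALIDATE_TRIALS once the round limit is hit, so downstream code may need to handle that case.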
33
+
34
+ ## Tools Involved
35
+
36
+ | Tool | When | What |
37
+ |------|------|------|
38
+ | `generate_search_anchors` | Start | Profile -> SearchAnchors |
39
+ | `search_clinical_trials` | Each round | SearchAnchors -> TrialCandidate[] |
40
+ | `refine_search_query` | Too many results | Tighten SearchAnchors |
41
+ | `relax_search_query` | Zero results | Loosen SearchAnchors |
42
+
43
+ ## Key Data Contracts
44
+
45
+ - **`SearchAnchors`**: `condition`, `biomarkers[]`, `stage`, `geography`, `phase_filter`, `relaxation_order[]`
46
+ - **`SearchLog`**: Tracks each round with `SearchStep` (query params, result count, action taken)
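
As an illustration of how `relaxation_order` could drive loosening, the sketch below models `SearchAnchors` as a frozen dataclass and drops one filter per call. The field types and the `relax_once` helper are assumptions; the project uses Pydantic models and lets Gemini decide how to relax, so this deterministic version is only a stand-in.

```python
from dataclasses import dataclass, field, replace
from typing import List, Optional


@dataclass(frozen=True)
class SearchAnchors:
    condition: str
    biomarkers: List[str] = field(default_factory=list)
    stage: Optional[str] = None
    geography: Optional[str] = None
    phase_filter: Optional[str] = None
    relaxation_order: List[str] = field(default_factory=list)


def relax_once(anchors: SearchAnchors) -> SearchAnchors:
    # Clear the first filter in relaxation_order that is still set.
    for name in anchors.relaxation_order:
        value = getattr(anchors, name)
        if value:
            cleared = [] if isinstance(value, list) else None
            remaining = [n for n in anchors.relaxation_order if n != name]
            return replace(anchors, **{name: cleared, "relaxation_order": remaining})
    return anchors  # nothing left to relax


start = SearchAnchors(
    condition="NSCLC",
    biomarkers=["EGFR"],
    stage="IV",
    geography="US",
    phase_filter="PHASE2",
    relaxation_order=["geography", "phase_filter"],
)
relaxed = relax_once(start)
```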
47
+
48
+ ## Key Files
49
+
50
+ | File | Role |
51
+ |------|------|
52
+ | `trialpath/agent/tools.py:82-180` | Tool implementations |
53
+ | `trialpath/services/mcp_client.py` | ClinicalTrials.gov wrapper |
54
+ | `trialpath/services/gemini_planner.py` | Refine/relax logic |
55
+ | `trialpath/models/search_anchors.py` | SearchAnchors contract |
56
+ | `trialpath/models/search_log.py` | Refinement tracking |
architecture/ui-state-management.md ADDED
@@ -0,0 +1,72 @@
1
+ # UI State Management
2
+
3
+ **Entry point:** `app/services/state_manager.py`
4
+
5
+ Streamlit session-based state management that mirrors the 5 Parlant journey states with prerequisite guards.
6
+
7
+ ## State Machine
8
+
9
+ ```mermaid
10
+ stateDiagram-v2
11
+ [*] --> INGEST : init_session_state()
12
+ INGEST --> PRESCREEN : patient_profile set
13
+ PRESCREEN --> VALIDATE_TRIALS : trial_candidates set
14
+ VALIDATE_TRIALS --> GAP_FOLLOWUP : eligibility_ledger set
15
+ GAP_FOLLOWUP --> SUMMARY : eligibility_ledger set
16
+ GAP_FOLLOWUP --> INGEST : reset_to_ingest()
17
+ ```
18
+
19
+ ## Session State Variables
20
+
21
+ | Key | Type | Default | Set By |
22
+ |-----|------|---------|--------|
23
+ | `journey_state` | `str` | `"INGEST"` | `advance_journey()` |
24
+ | `parlant_session_id` | `str \| None` | `None` | Parlant bridge |
25
+ | `parlant_agent_id` | `str \| None` | `None` | Parlant bridge |
26
+ | `parlant_session_active` | `bool` | `False` | Parlant bridge |
27
+ | `patient_profile` | `dict \| None` | `None` | INGEST tools |
28
+ | `uploaded_files` | `list` | `[]` | Upload page |
29
+ | `search_anchors` | `dict \| None` | `None` | PRESCREEN tools |
30
+ | `trial_candidates` | `list` | `[]` | PRESCREEN tools |
31
+ | `eligibility_ledger` | `list` | `[]` | VALIDATE_TRIALS tools |
32
+ | `last_event_offset` | `int` | `0` | Parlant bridge polling |
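
A sketch of how these defaults might be applied, using a plain dict in place of `st.session_state` (which supports the same `in` check and item assignment). The function bodies are assumptions written to match the described behavior, not the contents of `state_manager.py`.

```python
import copy

DEFAULTS = {
    "journey_state": "INGEST",
    "parlant_session_id": None,
    "parlant_agent_id": None,
    "parlant_session_active": False,
    "patient_profile": None,
    "uploaded_files": [],
    "search_anchors": None,
    "trial_candidates": [],
    "eligibility_ledger": [],
    "last_event_offset": 0,
}


def init_session_state(session_state):
    # Only fill in missing keys, so a Streamlit rerun keeps existing values.
    for key, default in DEFAULTS.items():
        if key not in session_state:
            session_state[key] = copy.deepcopy(default)


def reset_session_state(session_state):
    # Full reset: every key goes back to a fresh copy of its default.
    for key, default in DEFAULTS.items():
        session_state[key] = copy.deepcopy(default)


state = {"journey_state": "PRESCREEN"}
init_session_state(state)
```

Copying the defaults avoids sharing one mutable list between sessions or between a reset and the `DEFAULTS` table itself.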
33
+
34
+ ## Key Functions
35
+
36
+ | Function | Purpose |
37
+ |----------|---------|
38
+ | `init_session_state()` | Initialize defaults without overwriting existing keys |
39
+ | `get_current_journey_state()` | Read current state |
40
+ | `advance_journey(target)` | Forward-only transition with validation |
41
+ | `can_advance_to(target)` | Prerequisite check |
42
+ | `reset_to_ingest()` | Special backward transition for gap re-ingestion |
43
+ | `reset_session_state()` | Full reset to defaults |
44
+
45
+ ## Prerequisite Guards
46
+
47
+ | Target State | Requires |
48
+ |-------------|----------|
49
+ | PRESCREEN | `patient_profile` is set |
50
+ | VALIDATE_TRIALS | `patient_profile` is set |
51
+ | GAP_FOLLOWUP | `patient_profile` + `trial_candidates` |
52
+ | SUMMARY | `patient_profile` + `trial_candidates` + `eligibility_ledger` |
53
+
54
+ `advance_journey()` enforces forward-only movement and raises `ValueError` on a backward transition. The only exception is `reset_to_ingest()`, which supports the gap re-ingestion loop.
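
The guard table and the forward-only rule can be combined in a short sketch. It takes the session state as an explicit dict argument for testability, whereas the real functions presumably read `st.session_state` directly. Raising `ValueError` on a missing prerequisite is an assumption, since the document only specifies that exception for backward moves.

```python
ORDER = ["INGEST", "PRESCREEN", "VALIDATE_TRIALS", "GAP_FOLLOWUP", "SUMMARY"]

PREREQUISITES = {
    "PRESCREEN": ["patient_profile"],
    "VALIDATE_TRIALS": ["patient_profile"],
    "GAP_FOLLOWUP": ["patient_profile", "trial_candidates"],
    "SUMMARY": ["patient_profile", "trial_candidates", "eligibility_ledger"],
}


def can_advance_to(state, target):
    # A prerequisite counts as set when it is present and non-empty.
    return all(state.get(key) for key in PREREQUISITES.get(target, []))


def advance_journey(state, target):
    current = state["journey_state"]
    if ORDER.index(target) < ORDER.index(current):
        raise ValueError(f"backward transition {current} -> {target} is not allowed")
    if not can_advance_to(state, target):
        raise ValueError(f"prerequisites for {target} are not met")  # assumed behavior
    state["journey_state"] = target


def reset_to_ingest(state):
    # The one sanctioned backward move, used by the gap re-ingestion loop.
    state["journey_state"] = "INGEST"


state = {"journey_state": "INGEST", "patient_profile": {"age": 61},
         "trial_candidates": [], "eligibility_ledger": []}
advance_journey(state, "PRESCREEN")
```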
55
+
56
+ ## Page Mapping
57
+
58
+ | Page File | Journey State | Components Used |
59
+ |-----------|--------------|-----------------|
60
+ | `app/pages/1_upload.py` | INGEST | `file_uploader`, `disclaimer_banner` |
61
+ | `app/pages/2_profile_review.py` | PRESCREEN | `profile_card` |
62
+ | `app/pages/3_trial_matching.py` | VALIDATE_TRIALS | `trial_card` |
63
+ | `app/pages/4_gap_analysis.py` | GAP_FOLLOWUP | `gap_card` |
64
+ | `app/pages/5_summary.py` | SUMMARY | `profile_card`, `trial_card`, `gap_card` |
65
+
66
+ ## Key Files
67
+
68
+ | File | Role |
69
+ |------|------|
70
+ | `app/services/state_manager.py` | State machine + prerequisites |
71
+ | `streamlit_app.py` | Multi-page navigation entry point |
72
+ | `app/components/progress_tracker.py` | Visual state indicator |