# Clinical Trial Design Assistant

**Implementation Plan v1.0**

---

## Executive Summary

The **Clinical Trial Design Assistant** is an AI-powered chatbot that helps clinical researchers design better clinical trials by analyzing successful examples from ClinicalTrials.gov. It provides evidence-based recommendations on trial parameters and regulatory strategies, and is specifically designed for rare diseases and orphan drug development.

**Primary Users**: Clinical researchers evaluating drug partnerships and designing superiority trials.

---

## 1. Problem & Solution Overview

| 🚨 **Current Challenges** | ✅ **Our Solution** |
|---------------------------|---------------------|
| **Information Overload**<br>500,000+ trials make research slow. | **Evidence-Based Recommendations**<br>Automated retrieval of proven trial designs. |
| **Rare Disease Complexity**<br>Limited TCL data for successful patterns. | **Orphan Drug Expertise**<br>Fallback logic for similar rare diseases. |
| **Regulatory Uncertainty**<br>Unclear reasons for past FDA/EMA outcomes. | **Regulatory Intelligence**<br>Extracts specific objections and success factors. |
| **Statistical Rigor**<br>Complex specialized expertise required. | **Statistical Guidance**<br>Automated sample size and power calculations. |

**How It Works (Simple Example)**:

```
Clinician types: "TCL"

Chatbot returns:
→ Recommended trial design parameters
→ Sample size: 100-120 patients
→ Primary endpoint: ORR
→ Comparator: Physician's choice
→ Evidence: Based on NCT01482962, NCT00901147
```

---

## 2. Architecture Overview

### High-Level Architecture

**System Flow**: User → AI Agent → Web Search → ClinicalTrials.gov/PubMed → AI Analysis → Recommendations
```mermaid
flowchart LR
    classDef large font-size:22px,stroke-width:2px;
    A[User Question]:::large --> B[Claude<br/>with Web Search]:::large
    B <--> C[Real-time<br/>Web Search]:::large
    C <--> D[ClinicalTrials.gov<br/>+ PubMed]:::large
    B --> E[Structured<br/>Recommendations]:::large
```

**High-Level Processing Flow**:

1. **User Query** → Natural language question about trial design.
2. **Session Memory** → Retrieve conversation history from `session_state`.
3. **Claude AI** → Interprets intent with full context and constructs a search strategy.
4. **Web Search** → Claude uses native web search to query ClinicalTrials.gov and medical literature.
5. **Analysis** → AI analyzes retrieved data against the trial design rules.
6. **Output** → Formatted report with cited evidence and design parameters.
7. **Memory Update** → Store response in session for follow-up questions.

---
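The memory steps above (2 and 7) can be sketched as plain functions. This is a minimal sketch; the function names `get_context` and `remember` are illustrative, not part of the plan's codebase, and in the app the history list would live in `st.session_state`:

```python
# Illustrative sketch of steps 2 and 7: session-memory handling.
# Names (get_context, remember) are hypothetical helpers.

MAX_CONTEXT_MESSAGES = 5  # step 2: load only the last 5 messages

def get_context(history: list[dict]) -> list[dict]:
    """Step 2: return the most recent turns to maintain context."""
    return history[-MAX_CONTEXT_MESSAGES:]

def remember(history: list[dict], role: str, content: str) -> None:
    """Step 7: store a turn so follow-up questions have context."""
    history.append({"role": role, "content": content})

history: list[dict] = []
remember(history, "user", "Design a trial for R/R TCL")
remember(history, "assistant", "Recommended: ORR primary endpoint ...")
remember(history, "user", "What about sample size?")
context = get_context(history)
```

Keeping the context window small bounds token usage per turn while still supporting follow-ups like "What about sample size?".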
### Detailed Architecture

```mermaid
flowchart TD
    A[User Query] -->|1. Submit question| B[Session Memory Check]
    B -->|2. Load context| C[Claude + System Prompt]
    C -->|3. Web search queries| D[ClinicalTrials.gov<br/>+ PubMed]
    D -->|4. Return search results| C
    C -->|5. Analyze with guardrails| E{Validation}
    E -->|Pass| F[Generate Recommendations]
    E -->|Fail: No results| G[Orphan Drug Fallback]
    E -->|Fail: Invalid data| H[Error Handler]
    G -->|Retry with related diseases| D
    H -->|Retry or show error| I[User Notification]
    F -->|6. Format output| J[Streamlit Display]
    J -->|7. Store in session| K[Session Memory]
    K -->|8. Follow-up question?| A
    K -->|Done| L[End]
    style C fill:#e1f5ff
    style E fill:#fff4e1
    style F fill:#e8f5e9
    style J fill:#f3e5f5
```

---

### Step-by-Step Processing Details

#### **Step 1: User Query Submission**

```
Input: "Design a trial for R/R TCL"
  ↓
Streamlit captures query + checks session state
```

---

#### **Step 2: Session Memory Check**

**What happens**: Check whether there is previous conversation history.

- **If yes**: Load the last 5 messages to maintain context
- **If no**: Start a fresh conversation

**Purpose**: Enables follow-up questions like "What about sample size?"

---

#### **Step 3: Claude + System Prompt**

```python
# The system prompt includes:
#   - Role definition
#   - Disease hierarchy (Appendix A)
#   - Trial design rules (Appendix B)
#   - Anti-hallucination rules
#   - Output format template

# Claude receives (note: the Anthropic Messages API takes the system
# prompt as a top-level `system` parameter, not as a message role):
system = SYSTEM_PROMPT
messages = conversation_history + [{"role": "user", "content": new_query}]
```

**Claude's Internal Process**:
1. Parse user intent (disease, trial phase, endpoint preference)
2. Construct search queries for ClinicalTrials.gov and PubMed
3. Invoke the native web search tool with those queries
4. Receive trial data (NCT IDs, protocols, outcomes)
5. Apply guardrails from the system prompt
6. Generate recommendations

---

#### **Step 4: Clinical Trials Data Retrieval**

**What happens**: Claude uses its native web search tool to search ClinicalTrials.gov and retrieve trial data (NCT IDs, eligibility criteria, endpoints, sample sizes, outcomes).

*See Appendix C for web search details.*

---
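Steps 3 and 4 can be sketched as a single request-builder. The tool type string `web_search_20250305` comes from the Technology Stack table; the model string, `max_tokens` value, and the `build_request` helper are illustrative assumptions, and the live call (commented out) would use the `anthropic` SDK:

```python
# Sketch of the Claude request for Steps 3-4. Hypothetical helper;
# only the tool type string is taken from this plan.

SYSTEM_PROMPT = "You are a Clinical Trial Design Assistant..."  # stand-in for the full template

web_search_tool = {
    "type": "web_search_20250305",  # Claude's native web search tool
    "name": "web_search",
    "max_uses": 5,                  # cap the number of searches per turn (assumed limit)
}

def build_request(user_query: str, history: list[dict]) -> dict:
    """Assemble a Messages API request; note the top-level `system` field."""
    return {
        "model": "claude-3-5-sonnet-latest",  # illustrative model string
        "max_tokens": 2048,
        "system": SYSTEM_PROMPT,
        "messages": history + [{"role": "user", "content": user_query}],
        "tools": [web_search_tool],
    }

# import anthropic
# client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment
# response = client.messages.create(**build_request("Design a trial for R/R TCL", []))
```

Passing the conversation history into `messages` is what lets the same request shape serve both fresh queries and follow-ups.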
#### **Step 5: Validation with Guardrails**

**Validation Checks**:

1. **Results found?**
   - If no trials found → trigger the orphan drug fallback
   - Search related diseases (AITL, ALCL-ALK-)
2. **NCT IDs valid?**
   - Check that each cited trial ID follows the format NCT + 8 digits
   - Flag any invalid IDs
3. **Disease match?**
   - Verify the response mentions TCL or lymphoma
   - Flag if a disease mismatch is detected
4. **Realistic values?**
   - ORR must be between 0-100%
   - Sample size must be reasonable (not >10,000)
   - Flag unrealistic numbers
5. **Apply trial design rules**
   - Check against Appendix B:
     - Prior lines: 1 vs 2+
     - Comparator: single-agent salvage chemo, single-agent novel, BBv, or investigator's choice
     - Endpoints: ORR/CRR/PFS valid for R/R TCL

**Guardrail Actions**:
- **Pass** → Proceed to recommendations
- **Fail (no results)** → Orphan drug fallback
- **Fail (invalid data)** → Error handler

---

#### **Step 6: Orphan Drug Fallback**

```
If no TCL trials found:
1. Search disease hierarchy (Appendix A)
2. Try related diseases:
   - AITL (most common nodal TCL)
   - ALCL-ALK- (CD30+)
   - TFH-TCL (PI3K responsive)
3. Notify user: "No exact TCL trials. Showing related: AITL"
```

---

#### **Step 7: Generate Recommendations**

**What happens**: Claude synthesizes the trial data and generates a structured recommendation, including inclusion/exclusion criteria, the primary endpoint with rationale, sample size with its power basis, comparator selection, and NCT citations as evidence.

*See the User Input/Output Flow section for an example output format.*

---

#### **Step 8: Error Handling**

```
Error Types:
1. API timeout → Retry 3x (1s, 2s, 4s backoff)
2. Rate limit → Wait + auto-retry
3. Connector down → Show cached example
4. Invalid response → Log + retry option
5. No results after fallback → "Unable to find trials"
```

---

#### **Step 9: Streamlit Display**

**What happens**:
- Display the response in the chat interface
- Add the response to session history for context
- The user sees formatted recommendations with NCT citations

---

#### **Step 10: Follow-up Loop**

```
User can ask:
- "What about safety endpoints?"
- "Compare to BBv trial"
- "Show me AITL-specific trials"

→ Loop back to Step 1 with full context
```

---

### Key Architecture Principles

| Principle | Implementation |
|-----------|----------------|
| **Simplicity** | Claude handles all logic; no custom parsers |
| **Guardrails** | Embedded in system prompt + validation step |
| **Conversational** | Session memory enables follow-up questions |
| **Evidence-based** | All recommendations cite NCT IDs |
| **Fallback logic** | Orphan drug search for rare diseases |
| **Error resilience** | Retry logic + graceful degradation |

---

### Technology Stack

| Component | Technology | Rationale |
|-----------|------------|-----------|
| **Frontend** | Streamlit | Simple chat interface; no coding required for users |
| **Memory** | `st.session_state` | Maintains conversation history across turns |
| **AI Engine** | Claude 3.5 Sonnet | Best-in-class medical text understanding |
| **Data Access** | Native Web Search | Claude's built-in web search tool (`web_search_20250305`) |
| **Sources** | ClinicalTrials.gov + PubMed | Real-time search of clinical trial registries and literature |
| **Deployment** | Docker on HuggingFace Spaces | Free hosting, web-accessible |

---

### Processing Pipeline

```mermaid
sequenceDiagram
    autonumber
    participant User
    participant App as Streamlit/Orchestrator
    participant AI as Claude AI
    participant Data as ClinicalTrials.gov
    User->>App: Enter disease "TCL"
    App->>AI: Process query
    AI->>Data: Web search for TCL trials
    Data-->>AI: Trial list (NCT IDs)
    AI->>Data: Get detailed protocols
    Data-->>AI: Protocol documents
    AI->>AI: Analyze & Compare
    AI->>AI: Generate recommendations
    AI-->>App: Results
    App-->>User: Display report
```
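The Step 8 timeout policy (retry 3x with 1s/2s/4s backoff) can be sketched as a small wrapper. `with_retries` is a hypothetical helper, not part of the plan's codebase; the injectable `sleep` parameter just makes the backoff schedule testable:

```python
# Illustrative retry-with-exponential-backoff for the API-timeout case.
import time

def with_retries(call, retries=3, base_delay=1.0, sleep=time.sleep):
    """Try `call` up to `retries` extra times, backing off 1s, 2s, 4s."""
    for attempt in range(retries + 1):
        try:
            return call()
        except TimeoutError:
            if attempt == retries:
                raise  # out of retries: surface to the error handler
            sleep(base_delay * (2 ** attempt))  # 1s, 2s, 4s
```

The rate-limit and connector-down cases from Step 8 would branch before this wrapper, since they need a user-facing message rather than a silent retry.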
---

### Project Structure

```
Clinical-Trial-Design-Assistant/
├── app.py             # Streamlit main entry + Claude integration + system prompt
├── requirements.txt   # Python dependencies
├── Dockerfile         # Container configuration
├── README.md          # Documentation
├── CHANGELOG.md       # Version history
└── .gitignore         # Git ignore file
```

**Note**: Claude handles all parsing, extraction, and analysis via web search. Single-file architecture for simplicity.

---

### System Prompt Template

```python
SYSTEM_PROMPT = """
You are a Clinical Trial Design Assistant specializing in R/R TCL.

ROLE:
- Help researchers design superiority trials
- Provide evidence-based recommendations from ClinicalTrials.gov
- Cite NCT IDs for all claims

RULES (NEVER VIOLATE):
1. NEVER invent trial data - only cite real NCT IDs
2. If no trials found, say "No matching trials found" and suggest related searches
3. Use orphan drug fallback for rare diseases with <5 results
4. Apply disease hierarchy (AITL > ALCL-ALK- > PTCL-NOS)
5. Validate against R/R TCL framework for all recommendations
6. If prior treatment lines not specified, ASK user before providing recommendations
7. If regulatory region (FDA/EMA) not specified, ASK user for target submission region

SCOPE:
- Focus: Nodal and extranodal TCL subtypes
- Out of scope: Cutaneous T-cell lymphomas (Mycosis Fungoides, Sezary Syndrome, Primary Cutaneous ALCL, etc.) - redirect to CTCL-specific resources
DISEASE HIERARCHY:
- AITL: TET2/KMT2D mutations; responds to HDAC inhibitors, EZH2 inhibitors, checkpoint inhibitors, and other targeted agents
- ALCL-ALK-: CD30+; responds to brentuximab vedotin
- PTCL-NOS: Heterogeneous; worst R/R outcomes
- TFH-TCL: Responds to duvelisib (PI3K/delta), HDAC inhibitors, EZH2 inhibitors, checkpoint inhibitors, and other targeted agents
- Analyze separately: ALCL-ALK+ (best prognosis)
- Analyze separately: NKTCL/EATL/MEITL (rare, distinct biology)

COMPARATOR CATEGORIES:
- Single-agent salvage chemo (GDP/DHAP/ICE)
- Single-agent novel (pralatrexate/romidepsin)
- BBv (brentuximab + bendamustine)
- Investigator's choice

OUTPUT FORMAT:
- Recommended inclusion/exclusion criteria
- Primary endpoint with justification
- Sample size with power calculation basis
- Comparator with rationale
- NCT citations for evidence
"""
```

---

### Error Handling

| Error | Handling |
|-------|----------|
| **API timeout** | Retry 3x with exponential backoff (1s, 2s, 4s) |
| **Rate limit** | Show "Please wait" message; auto-retry after delay |
| **Connector unavailable** | Display cached example response + "Try again later" |
| **No results** | Trigger orphan drug fallback → search related diseases |
| **Invalid response** | Log error, show "Unable to process" + retry option |

---

### Response Validation

Before displaying results, validate:

- [ ] All cited NCT IDs exist (format check: NCT + 8 digits)
- [ ] Disease in the response matches the user query
- [ ] ORR/endpoint values are within realistic ranges
- [ ] No hallucinated drug names (cross-check against known TCL treatments)

---

### User Input/Output Flow

**Input**:

```
User enters: "TCL" (disease name)
Optional: Drug of interest, trial type filter
```

**Output**:

**1. Trial Discovery Report**
- List of relevant trials (NCT IDs, sponsors, status)
- Trial outcomes summary

**2. Feature Analysis**
- Common inclusion/exclusion patterns in successful trials
- Endpoint selection trends
- Sample size ranges by trial phase
- Comparator arm choices

**3. Recommendations**

```
┌─────────────────────────────────────────────────────────────┐
│ RECOMMENDED TRIAL DESIGN PARAMETERS FOR TCL                 │
├─────────────────────────────────────────────────────────────┤
│ Inclusion Criteria:                                         │
│   ✓ Relapsed/refractory TCL (≥1 prior therapy)              │
│   ✓ ECOG PS 0-2                                             │
│   ✓ Measurable disease per Lugano criteria                  │
│                                                             │
│ Exclusion Criteria:                                         │
│   ✗ Prior allogeneic transplant                             │
│   ✗ Active CNS involvement                                  │
│                                                             │
│ Primary Endpoint: ORR (recommended based on precedent)      │
│ Sample Size: 100-120 (based on similar trials)              │
│ Comparator: Physician's choice (pralatrexate, belinostat,   │
│             romidepsin, or gemcitabine-based)               │
│                                                             │
│ Evidence: Based on NCT01482962, NCT00901147, NCT01280526    │
└─────────────────────────────────────────────────────────────┘
```

---
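The response-validation checklist above can be sketched with a format check for NCT IDs (NCT + 8 digits) and realistic-range checks. Function names here are illustrative; a format match does not prove the trial exists, so existence would still be confirmed against ClinicalTrials.gov:

```python
# Illustrative response-validation helpers (hypothetical names).
import re

NCT_PATTERN = re.compile(r"^NCT\d{8}$")

def valid_nct_id(nct_id: str) -> bool:
    """Format check only: NCT followed by exactly 8 digits."""
    return bool(NCT_PATTERN.match(nct_id))

def realistic(orr_percent: float, sample_size: int) -> bool:
    """ORR must be 0-100%; sample size must be plausible (not > 10,000)."""
    return 0 <= orr_percent <= 100 and 0 < sample_size <= 10_000

def validate_response(text: str) -> list[str]:
    """Return a list of flagged issues (empty list = pass)."""
    issues = []
    cited = re.findall(r"NCT\d+", text)
    issues += [f"invalid NCT ID: {c}" for c in cited if not valid_nct_id(c)]
    if not re.search(r"\b(TCL|lymphoma)\b", text, re.IGNORECASE):
        issues.append("disease mismatch: response does not mention TCL/lymphoma")
    return issues
```

A non-empty issue list routes the response to the error handler instead of the Streamlit display.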
## 3. Implementation Phases

### Phase 1: Setup & Configuration
- [x] Clinical Trials connector verified and available
- [ ] Obtain Anthropic API key
- [ ] Set up project structure (Streamlit + Python 3.11)
- [ ] Configure Docker environment

### Phase 2: Core Development
- [ ] Implement conversation memory with `st.session_state`
- [ ] Implement anti-hallucination system prompt
- [ ] Build orphan drug fallback logic
- [ ] Develop trial data extraction pipeline
- [ ] Add error handling for API failures/timeouts

### Phase 3: Testing & Refinement
- [ ] Test with TCL/Pralatrexate/Belinostat cases
- [ ] Validate anti-hallucination rules
- [ ] Verify statistical recommendations
- [ ] Clinical researcher user testing

### Phase 4: Deployment
- [ ] Deploy to HuggingFace Spaces
- [ ] User documentation
- [ ] Handoff to clinical team

---

## 4. Post-MVP Enhancements

### 4.1 Competitive Landscape Tool

| Aspect | Details |
|--------|---------|
| **Problem** | Need to compare a candidate drug against competitors on efficacy/safety |
| **Solution** | Drug comparison feature with structured competitive analysis |
| **Requested By** | Vittoria |

**Implementation Tasks**:
- [ ] Add drug comparison feature
- [ ] Extract efficacy metrics (ORR, CRR, PFS, OS)
- [ ] Extract safety profiles (AEs, SAEs, discontinuation rates)
- [ ] Generate comparative analysis report with visualizations

---

### 4.2 Hybrid Retrieval Architecture

| Aspect | Details |
|--------|---------|
| **Problem** | Web search returns only 5-10 results; PTCL has 1,000+ NCT IDs |
| **Solution** | Combine the ClinicalTrials.gov API + web search for complete coverage |

**Architecture**:

```
User Query
    │
    ├──▶ ClinicalTrials.gov API (complete, filtered, reproducible)
    │
    └──▶ Claude Web Search (news, publications, context)
    │
    ▼
Claude Synthesizes from BOTH
```

**Implementation Tasks**:
- [ ] Integrate ClinicalTrials.gov API v2
- [ ] Add query filters (phase, status, condition)
- [ ] Implement result pagination
- [ ] Merge API results with web search context
- [ ] Add audit logging for retrieved NCT IDs

**Benefits**:

| MVP | Hybrid |
|-----|--------|
| ~10 trials | All matching trials |
| Web-ranked | Clinically filtered |
| Non-reproducible | Fully auditable |

---

## 5. Next Steps

### Immediate Actions
1. Obtain Anthropic API key
2. Set up project structure
3. Begin Phase 1 development

---

## Appendix A: Disease Hierarchy

```
TCL (Parent)
├── Nodal: AITL, ALCL-ALK+/-, PTCL-NOS, TFH-TCL
├── Extranodal: NKTCL, EATL, MEITL
└── Cutaneous (CTCL): Mycosis Fungoides, Sézary Syndrome, Primary Cutaneous ALCL,
    Lymphomatoid Papulosis, Subcutaneous Panniculitis-like TCL,
    Primary Cutaneous Gamma-Delta TCL
```

**Subtype-Specific Notes**:
- AITL: TET2/KMT2D mutations; responds to HDAC inhibitors, EZH2 inhibitors, checkpoint inhibitors, and other targeted agents
- ALCL-ALK-: CD30+; responds to brentuximab vedotin
- ALCL-ALK+: Best prognosis (analyze separately if needed)
- PTCL-NOS: Heterogeneous; worst R/R outcomes
- TFH-TCL: Responds to duvelisib (PI3K/delta), HDAC inhibitors, EZH2 inhibitors, checkpoint inhibitors, and other targeted agents
- NKTCL/EATL/MEITL: Rare; distinct biology

**Exclusions**: Solid tumors mimicking TCL

---

## Appendix B: R/R TCL Trial Design Rules

### Patient Population
- Prior lines: 1 vs 2+ (changes prognosis significantly)
- Refractory definition: PD within 1 month of treatment end
- Relapsed vs primary refractory: Separate analysis (OS 1.97 vs 0.89 years)
- Transplant eligibility: Age, comorbidities, performance status, organ function

### Transplant Strategy
- Transplant-eligible: Goal = salvage → bridge to allo-SCT (curative intent)
- Transplant-ineligible: Salvage is potentially palliative
- Conversion endpoint: % converting from ineligible → eligible

### Comparator Categories
- Single-agent salvage chemo (GDP/DHAP/ICE)
- Single-agent novel (pralatrexate/romidepsin)
- BBv (brentuximab + bendamustine)
- Investigator's choice

### Endpoints
- **Primary**: ORR (rapid readout), CRR (stronger signal), PFS (durability), TFS (transplant-ineligible), OS
- **Secondary**: DoR, TTR, transplant conversion, post-transplant outcomes, TTNT
- ORR threshold: 50% or higher vs 30% historical = superiority
- ICR preferred for regulatory submissions; investigator-assessed acceptable for exploratory endpoints

### Biomarkers
- TET2, DNMT3A, RHOA: Predict response to HDAC inhibitors (common in AITL/TFH-TCL)
- TP53: Poor prognosis
- KMT2D: AITL-enriched; epigenetic modifier sensitivity
- CD30: Expression level correlates with targeted therapy response
- Early PET-CT (2-4 cycles): Metabolic response; identify progressors

### Safety Considerations
- Acceptable toxicity for a frail population
- Dose modifications for elderly/comorbid patients
- Cumulative organ toxicity monitoring (cardiac/renal/hepatic)
- Grade 3/4 cytopenia management
- Prophylactic antimicrobials (antibiotics/antivirals/antifungals)

### Regulatory Pathways
- Accelerated approval: ORR primary + clinical benefit demonstration
- Traditional approval: PFS/OS primary
- Breakthrough therapy designation for orphan drugs
- Confirmatory trial typically required post-accelerated approval

### Sample Size
- Historical controls: Use established benchmarks for each comparator category
- Power for: 50% or higher vs 30% (a difference of 20 percentage points or more)
- Dropout/screen failures: Expect 5-10% in the R/R population
- Consider subtype-stratified power (AITL vs ALCL-ALK- separately)

---

## Appendix C: Web Search Details

**Tool Used**: `web_search_20250305` (Claude's native web search)

**How It Works**:
- Claude autonomously constructs search queries based on user intent
- Searches ClinicalTrials.gov, PubMed, and other medical sources
- Returns structured results with URLs and content
- Content is encrypted (only Claude can read it internally)

**Data Extracted**:
- Trial metadata (NCT ID, sponsor, phase, status)
- Eligibility criteria
- Endpoints (primary/secondary)
- Sample sizes
- Published outcomes (ORR, PFS, OS)

---
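As a worked check of the Appendix B power assumption (50% vs 30% ORR), the standard two-proportion sample-size formula can be evaluated directly. This is a minimal sketch assuming a two-arm superiority design, two-sided alpha of 0.05, and 80% power; the `n_per_arm` helper is illustrative and a statistician should confirm any real calculation:

```python
# Per-arm sample size for comparing two proportions (illustrative sketch).
import math
from statistics import NormalDist

def n_per_arm(p1: float, p2: float, alpha: float = 0.05, power: float = 0.80) -> int:
    """Two-arm superiority comparison of proportions (pooled-variance formula)."""
    z_a = NormalDist().inv_cdf(1 - alpha / 2)   # e.g. 1.96 for two-sided 0.05
    z_b = NormalDist().inv_cdf(power)           # e.g. 0.84 for 80% power
    p_bar = (p1 + p2) / 2
    num = (z_a * math.sqrt(2 * p_bar * (1 - p_bar))
           + z_b * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
    return math.ceil(num / (p2 - p1) ** 2)

print(n_per_arm(0.30, 0.50))  # → 93 per arm, before inflating for dropout
```

Under these assumptions the design needs about 93 patients per arm before the 5-10% dropout inflation noted above, which is useful context when sanity-checking sample sizes cited from precedent trials.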
**Document Version**: 1.0
**Status**: 🔄 In Progress