Clinical Trial Design Assistant
Implementation Plan v1.0
Executive Summary
The Clinical Trial Design Assistant is an AI-powered chatbot that helps clinical researchers design superior clinical trials by analyzing successful examples from ClinicalTrials.gov. It provides evidence-based recommendations on trial parameters and regulatory strategies, specifically designed for rare diseases and orphan drug development.
Primary Users: Clinical researchers evaluating drug partnerships and designing superiority trials.
1. Problem & Solution Overview
| Current Challenges | Our Solution |
|---|---|
| Information Overload: 500,000+ trials make research slow. | Evidence-Based Recommendations: Automated retrieval of proven trial designs. |
| Rare Disease Complexity: Limited TCL data for successful patterns. | Orphan Drug Expertise: Fallback logic for similar rare diseases. |
| Regulatory Uncertainty: Unclear reasons for past FDA/EMA outcomes. | Regulatory Intelligence: Extracts specific objections and success factors. |
| Statistical Rigor: Requires complex, specialized expertise. | Statistical Guidance: Automated sample size and power calculations. |
How It Works (Simple Example):
Clinician types: "TCL"
Chatbot returns:
- Recommended trial design parameters
- Sample size: 100-120 patients
- Primary endpoint: ORR
- Comparator: Physician's choice
- Evidence: Based on NCT01482962, NCT00901147
2. Architecture Overview
High-Level Architecture
System Flow: User → AI Agent → Web Search → ClinicalTrials.gov/PubMed → AI Analysis → Recommendations
flowchart LR
classDef large font-size:22px,stroke-width:2px;
A[User Question]:::large --> B[Claude<br/>with Web Search]:::large
B <--> C[Real-time<br/>Web Search]:::large
C <--> D[ClinicalTrials.gov<br/>+ PubMed]:::large
B --> E[Structured<br/>Recommendations]:::large
High-Level Processing Flow:
- User Query → Natural language question about trial design.
- Session Memory → Retrieve conversation history from session_state.
- Claude AI → Interprets intent with full context and constructs search strategy.
- Web Search → Claude uses native web search to query ClinicalTrials.gov and medical literature.
- Analysis → AI analyzes retrieved data using training rules.
- Output → Formatted report with cited evidence and design parameters.
- Memory Update → Store response in session for follow-up questions.
Detailed Architecture
flowchart TD
A[User Query] -->|1. Submit question| B[Session Memory Check]
B -->|2. Load context| C[Claude + System Prompt]
C -->|3. Web search queries| D[ClinicalTrials.gov<br/>+ PubMed]
D -->|4. Return search results| C
C -->|5. Analyze with guardrails| E{Validation}
E -->|Pass| F[Generate Recommendations]
E -->|Fail: No results| G[Orphan Drug Fallback]
E -->|Fail: Invalid data| H[Error Handler]
G -->|Retry with related diseases| D
H -->|Retry or show error| I[User Notification]
F -->|6. Format output| J[Streamlit Display]
J -->|7. Store in session| K[Session Memory]
K -->|8. Follow-up question?| A
K -->|Done| L[End]
style C fill:#e1f5ff
style E fill:#fff4e1
style F fill:#e8f5e9
style J fill:#f3e5f5
Step-by-Step Processing Details
Step 1: User Query Submission
Input: "Design a trial for R/R TCL"
    ↓
Streamlit captures query + checks session state
Step 2: Session Memory Check
What happens: Check if there's previous conversation history
- If yes: Load the last 5 messages to maintain context
- If no: Start a fresh conversation
Purpose: Enables follow-up questions like "What about sample size?"
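The memory check above can be sketched as a small helper. A plain dict stands in for Streamlit's st.session_state so the sketch runs outside a Streamlit session; in app.py the same pattern would operate on st.session_state directly.

```python
def load_context(session_state: dict, n: int = 5) -> list:
    """Return the last n messages, starting a fresh history if none exists."""
    if "messages" not in session_state:
        session_state["messages"] = []  # no history: fresh conversation
    return session_state["messages"][-n:]  # last 5 turns keep context compact

# First turn: no history yet
state = {}
assert load_context(state) == []

# Follow-up turn: only the 5 most recent messages accompany the new query
state["messages"] = [{"role": "user", "content": f"q{i}"} for i in range(8)]
context = load_context(state)  # messages q3..q7
```

Capping context at the last five messages keeps prompts short while preserving enough history for follow-ups like "What about sample size?".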
Step 3: Claude + System Prompt
# System prompt includes:
- Role definition
- Disease hierarchy (Appendix A)
- Trial design rules (Appendix B)
- Anti-hallucination rules
- Output format template
# Claude receives (Anthropic Messages API: the system prompt is a
# top-level parameter, not a message role):
response = client.messages.create(
    model="claude-3-5-sonnet-latest",
    max_tokens=4096,
    system=SYSTEM_PROMPT,
    messages=conversation_history + [{"role": "user", "content": new_query}],
)
Claude's Internal Process:
- Parse user intent (disease, trial phase, endpoint preference)
- Construct search query for connector
- Call search_trials() tool with filters
- Receive trial data (NCT IDs, protocols, outcomes)
- Apply guardrails from system prompt
- Generate recommendations
Step 4: Clinical Trials Connector
What happens: Claude uses its native web search tool to search ClinicalTrials.gov and retrieve trial data (NCT IDs, eligibility criteria, endpoints, sample sizes, outcomes).
See Appendix C for web search details.
Step 5: Validation with Guardrails
Validation Checks:
Results found?
- If no trials found β Trigger orphan drug fallback
- Search related diseases (AITL, ALCL-ALK-)
NCT IDs valid?
- Check each cited trial ID follows format: NCT + 8 digits
- Flag any invalid IDs
Disease match?
- Verify response mentions TCL or lymphoma
- Flag if disease mismatch detected
Realistic values?
- ORR must be between 0-100%
- Sample size must be reasonable (not >10,000)
- Flag unrealistic numbers
Apply trial design rules
- Check against Appendix B:
- Prior lines: 1 vs 2+
- Comparator: Single-agent salvage chemo, Single-agent novel, BBv, or Investigator's choice
- Endpoints: ORR/CRR/PFS valid for R/R TCL
Guardrail Actions:
- Pass β Proceed to recommendations
- Fail (no results) β Orphan drug fallback
- Fail (invalid data) β Error handler
Step 6: Orphan Drug Fallback
If no TCL trials found:
1. Search disease hierarchy (Appendix A)
2. Try related diseases:
- AITL (most common nodal TCL)
- ALCL-ALK- (CD30+)
- TFH-TCL (PI3K responsive)
3. Notify user: "No exact TCL trials. Showing related: AITL"
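The fallback above can be sketched as an ordered walk over the Appendix A hierarchy. This is a hypothetical sketch: RELATED_DISEASES and search_fn stand in for the real hierarchy lookup and retrieval call.

```python
# Hypothetical sketch of the Step 6 orphan-drug fallback. The related-disease
# order follows Appendix A; search_fn stands in for the actual retrieval call.
RELATED_DISEASES = {
    "TCL": ["AITL", "ALCL-ALK-", "TFH-TCL"],
}

def search_with_fallback(disease: str, search_fn, min_results: int = 1):
    """Search the requested disease; walk related subtypes if nothing is found."""
    results = search_fn(disease)
    if len(results) >= min_results:
        return disease, results, None  # direct hit, no user notice needed
    for related in RELATED_DISEASES.get(disease, []):
        results = search_fn(related)
        if results:
            notice = f"No exact {disease} trials. Showing related: {related}"
            return related, results, notice
    return disease, [], f"No trials found for {disease} or related subtypes"

# Stubbed usage: no direct TCL hits, but AITL returns one trial
stub = {"TCL": [], "AITL": ["NCT01482962"]}
matched, trials, notice = search_with_fallback("TCL", lambda d: stub.get(d, []))
```

Returning the notice string alongside the results lets the UI tell the user explicitly that related-disease evidence is being shown.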
Step 7: Generate Recommendations
What happens: Claude synthesizes the trial data and generates a structured recommendation including inclusion/exclusion criteria, primary endpoint with rationale, sample size with power basis, comparator selection, and NCT citations for evidence.
See User Input/Output Flow section for example output format.
Step 8: Error Handling
Error Types:
1. API timeout β Retry 3x (1s, 2s, 4s backoff)
2. Rate limit β Wait + auto-retry
3. Connector down β Show cached example
4. Invalid response β Log + retry option
5. No results after fallback β "Unable to find trials"
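The timeout policy (retry 3x with 1s, 2s, 4s backoff) can be sketched as a small wrapper; the function and parameter names are illustrative.

```python
import time

def with_retry(fn, retries: int = 3, base_delay: float = 1.0):
    """Call fn, retrying up to `retries` times on timeout with exponential
    backoff (1s, 2s, 4s at the default base delay)."""
    for attempt in range(retries + 1):  # initial call + up to 3 retries
        try:
            return fn()
        except TimeoutError:
            if attempt == retries:
                raise  # retries exhausted: let the error handler take over
            time.sleep(base_delay * (2 ** attempt))

# Usage: a call that times out twice, then succeeds on the third attempt
calls = {"n": 0}
def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise TimeoutError
    return "ok"

result = with_retry(flaky, base_delay=0.01)  # tiny delay for the demo
```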
Step 9: Streamlit Display
What happens:
- Display the response in the chat interface
- Add the response to session history for context
- User sees formatted recommendations with NCT citations
Step 10: Follow-up Loop
User can ask:
- "What about safety endpoints?"
- "Compare to BBv trial"
- "Show me AITL-specific trials"
β Loop back to Step 1 with full context
Key Architecture Principles
| Principle | Implementation |
|---|---|
| Simplicity | Claude handles all logic; no custom parsers |
| Guardrails | Embedded in system prompt + validation step |
| Conversational | Session memory enables follow-up questions |
| Evidence-based | All recommendations cite NCT IDs |
| Fallback logic | Orphan drug search for rare diseases |
| Error resilience | Retry logic + graceful degradation |
Technology Stack
| Component | Technology | Rationale |
|---|---|---|
| Frontend | Streamlit | Simple chat interface, no coding required for users |
| Memory | st.session_state | Maintains conversation history across turns |
| AI Engine | Claude 3.5 Sonnet | Best-in-class medical text understanding |
| Data Access | Native Web Search | Claude's built-in web search tool (web_search_20250305) |
| Sources | ClinicalTrials.gov + PubMed | Real-time search of clinical trial registries and literature |
| Deployment | Docker on HuggingFace Spaces | Free hosting, web-accessible |
Processing Pipeline
sequenceDiagram
autonumber
participant User
participant App as Streamlit/Orchestrator
participant AI as Claude AI
participant Data as ClinicalTrials.gov
User->>App: Enter disease "TCL"
App->>AI: Process query
AI->>Data: search_trials(TCL)
Data-->>AI: Trial list (NCT IDs)
AI->>Data: Get detailed protocols
Data-->>AI: Protocol documents
AI->>AI: Analyze & Compare
AI->>AI: Generate recommendations
AI-->>App: Results
App-->>User: Display report
Project Structure
Clinical-Trial-Design-Assistant/
├── app.py             # Streamlit main entry + Claude integration + system prompt
├── requirements.txt   # Python dependencies
├── Dockerfile         # Container configuration
├── README.md          # Documentation
├── CHANGELOG.md       # Version history
└── .gitignore         # Git ignore file
Note: Claude handles all parsing, extraction, and analysis via web search. Single-file architecture for simplicity.
System Prompt Template
SYSTEM_PROMPT = """
You are a Clinical Trial Design Assistant specializing in R/R TCL.
ROLE:
- Help researchers design superiority trials
- Provide evidence-based recommendations from ClinicalTrials.gov
- Cite NCT IDs for all claims
RULES (NEVER VIOLATE):
1. NEVER invent trial data - only cite real NCT IDs
2. If no trials found, say "No matching trials found" and suggest related searches
3. Use orphan drug fallback for rare diseases with <5 results
4. Apply disease hierarchy (AITL > ALCL-ALK- > PTCL-NOS)
5. Validate against R/R TCL framework for all recommendations
6. If prior treatment lines not specified, ASK user before providing recommendations
7. If regulatory region (FDA/EMA) not specified, ASK user for target submission region
SCOPE:
- Focus: Nodal and extranodal TCL subtypes
- Out of scope: Cutaneous T-cell lymphomas (Mycosis Fungoides, Sézary Syndrome, Primary Cutaneous ALCL, etc.) - redirect to CTCL-specific resources
DISEASE HIERARCHY:
- AITL: TET2/KMT2D mutations; responds to HDAC inhibitors, EZH2 inhibitors, checkpoint inhibitors, and other targeted agents
- ALCL-ALK-: CD30+; responds to brentuximab vedotin
- PTCL-NOS: Heterogeneous; worst R/R outcomes
- TFH-TCL: Responds to duvelisib (PI3K/delta), HDAC inhibitors, EZH2 inhibitors, checkpoint inhibitors, and other targeted agents
- Analyze separately: ALCL-ALK+ (best prognosis)
- Analyze separately: NKTCL/EATL/MEITL (rare, distinct biology)
COMPARATOR CATEGORIES:
- Single-agent salvage chemo (GDP/DHAP/ICE)
- Single-agent novel (pralatrexate/romidepsin)
- BBv (brentuximab + bendamustine)
- Investigator's choice
OUTPUT FORMAT:
- Recommended inclusion/exclusion criteria
- Primary endpoint with justification
- Sample size with power calculation basis
- Comparator with rationale
- NCT citations for evidence
"""
Error Handling
| Error | Handling |
|---|---|
| API timeout | Retry 3x with exponential backoff (1s, 2s, 4s) |
| Rate limit | Show "Please wait" message, auto-retry after delay |
| Connector unavailable | Display cached example response + "Try again later" |
| No results | Trigger orphan drug fallback β search related diseases |
| Invalid response | Log error, show "Unable to process" + retry option |
Response Validation
Before displaying results, validate:
- All NCT IDs cited exist (format check: NCT + 8 digits)
- Disease in response matches user query
- ORR/endpoint values are within realistic ranges
- No hallucinated drug names (cross-check against known TCL treatments)
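The first three checks are mechanical and can be sketched directly; the function name and thresholds below are illustrative (the >10,000 cap mirrors the guardrail in Step 5).

```python
import re

NCT_ID = re.compile(r"NCT\d{8}")  # format check: NCT + 8 digits

def validate_response(nct_ids, orr_percent, sample_size):
    """Return a list of validation failures; an empty list means pass."""
    issues = [f"Invalid NCT ID: {n}" for n in nct_ids if not NCT_ID.fullmatch(n)]
    if not 0 <= orr_percent <= 100:
        issues.append(f"Unrealistic ORR: {orr_percent}%")
    if not 0 < sample_size <= 10_000:
        issues.append(f"Unrealistic sample size: {sample_size}")
    return issues
```

Note this only checks that cited IDs are well-formed; confirming they exist requires a registry lookup (see the hybrid retrieval enhancement in Section 4.2).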
User Input/Output Flow
Input:
User enters: "TCL" (disease name)
Optional: Drug of interest, trial type filter
Output:
1. Trial Discovery Report
- List of relevant trials (NCT IDs, sponsors, status)
- Trial outcomes summary
2. Feature Analysis
- Common inclusion/exclusion patterns in successful trials
- Endpoint selection trends
- Sample size ranges by trial phase
- Comparator arm choices
3. Recommendations
┌──────────────────────────────────────────────────────────────┐
│ RECOMMENDED TRIAL DESIGN PARAMETERS FOR TCL                  │
├──────────────────────────────────────────────────────────────┤
│ Inclusion Criteria:                                          │
│ • Relapsed/refractory TCL (≥1 prior therapy)                 │
│ • ECOG PS 0-2                                                │
│ • Measurable disease per Lugano criteria                     │
│                                                              │
│ Exclusion Criteria:                                          │
│ • Prior allogeneic transplant                                │
│ • Active CNS involvement                                     │
│                                                              │
│ Primary Endpoint: ORR (recommended based on precedent)       │
│ Sample Size: 100-120 (based on similar trials)               │
│ Comparator: Physician's choice (pralatrexate, belinostat,    │
│             romidepsin, or gemcitabine-based)                │
│                                                              │
│ Evidence: Based on NCT01482962, NCT00901147, NCT01280526     │
└──────────────────────────────────────────────────────────────┘
3. Implementation Phases
Phase 1: Setup & Configuration
- Clinical Trials connector verified and available
- Obtain Anthropic API key
- Set up project structure (Streamlit + Python 3.11)
- Configure Docker environment
Phase 2: Core Development
- Implement conversation memory with st.session_state
- Implement anti-hallucination system prompt
- Build orphan drug fallback logic
- Develop trial data extraction pipeline
- Add error handling for API failures/timeouts
Phase 3: Testing & Refinement
- Test with TCL/Pralatrexate/Belinostat cases
- Validate anti-hallucination rules
- Verify statistical recommendations
- Clinical researcher user testing
Phase 4: Deployment
- Deploy to HuggingFace Spaces
- User documentation
- Handoff to clinical team
4. Post-MVP Enhancements
4.1 Competitive Landscape Tool
| Aspect | Details |
|---|---|
| Problem | Need to compare a candidate drug against competitors in efficacy/safety |
| Solution | Drug comparison feature with structured competitive analysis |
| Requested By | Vittoria |
Implementation Tasks:
- Add drug comparison feature
- Extract efficacy metrics (ORR, CRR, PFS, OS)
- Extract safety profiles (AEs, SAEs, discontinuation rates)
- Generate comparative analysis report with visualizations
4.2 Hybrid Retrieval Architecture
| Aspect | Details |
|---|---|
| Problem | Web search returns only 5-10 results; PTCL has 1,000+ NCT IDs |
| Solution | Combine ClinicalTrials.gov API + Web Search for complete coverage |
Architecture:
User Query
    │
    ├──▶ ClinicalTrials.gov API (complete, filtered, reproducible)
    │
    ├──▶ Claude Web Search (news, publications, context)
    │
    ▼
Claude Synthesizes from BOTH
Implementation Tasks:
- Integrate ClinicalTrials.gov API v2
- Add query filters (phase, status, condition)
- Implement result pagination
- Merge API results with web search context
- Add audit logging for retrieved NCT IDs
Benefits:
| MVP | Hybrid |
|---|---|
| ~10 trials | All matching trials |
| Web-ranked | Clinically-filtered |
| Non-reproducible | Fully auditable |
5. Next Steps
Immediate Actions
- Obtain Anthropic API key
- Set up project structure
- Begin Phase 1 development
Appendix A: Disease Hierarchy
TCL (Parent)
├── Nodal: AITL, ALCL-ALK+/-, PTCL-NOS, TFH-TCL
├── Extranodal: NKTCL, EATL, MEITL
└── Cutaneous (CTCL): Mycosis Fungoides, Sézary Syndrome, Primary Cutaneous ALCL, Lymphomatoid Papulosis, Subcutaneous Panniculitis-like TCL, Primary Cutaneous Gamma-Delta TCL
Subtype-Specific Notes:
- AITL: TET2/KMT2D mutations; responds to HDAC inhibitors, EZH2 inhibitors, checkpoint inhibitors, and other targeted agents
- ALCL-ALK-: CD30+; responds to brentuximab vedotin
- ALCL-ALK+: Best prognosis (analyze separately if needed)
- PTCL-NOS: Heterogeneous; worst R/R outcomes
- TFH-TCL: Responds to duvelisib (PI3K/delta), HDAC inhibitors, EZH2 inhibitors, checkpoint inhibitors, and other targeted agents
- NKTCL/EATL/MEITL: Rare; distinct biology
Exclusions: Solid tumors mimicking TCL
Appendix B: R/R TCL Trial Design Rules
Patient Population
- Prior lines: 1 vs 2+ (changes prognosis significantly)
- Refractory definition: PD within 1 month of treatment end
- Relapsed vs primary refractory: Separate analysis (OS 1.97 vs 0.89 years)
- Transplant eligibility: Age, comorbidities, performance status, organ function
Transplant Strategy
- Transplant-eligible: Goal = salvage → bridge to allo-SCT (curative intent)
- Transplant-ineligible: Salvage is potentially palliative
- Conversion endpoint: % converting from ineligible → eligible
Comparator Categories
- Single-agent salvage chemo (GDP/DHAP/ICE)
- Single-agent novel (pralatrexate/romidepsin)
- BBv (brentuximab + bendamustine)
- Investigator's choice
Endpoints
- Primary: ORR (rapid), CRR (stronger signal), PFS (durability), TFS (transplant-ineligible), OS
- Secondary: DoR, TTR, transplant conversion, post-transplant outcomes, TTNT
- ORR threshold: 50% or higher vs 30% historical = superiority
- ICR preferred for regulatory; investigator-assessed OK for exploratory
Biomarkers
- TET2, DNMT3A, RHOA: These biomarkers predict response to HDAC inhibitors (common in AITL/TFH-TCL)
- TP53: Poor prognosis
- KMT2D: AITL-enriched; epigenetic modifier sensitivity
- CD30: Expression level correlates with targeted therapy response
- Early PET-CT (2-4 cycles): Metabolic response, identify progressors
Safety Considerations
- Acceptable toxicity for frail population
- Dose modifications for elderly/comorbid
- Cumulative organ toxicity monitoring (cardiac/renal/hepatic)
- Grade 3/4 cytopenia management
- Prophylactic antimicrobials (antibiotics/antivirals/antifungals)
Regulatory Pathways
- Accelerated approval: ORR primary + clinical benefit demonstration
- Traditional approval: PFS/OS primary
- Breakthrough therapy designation for orphan drugs
- Confirmatory trial typically required post-accelerated approval
Sample Size
- Historical controls: Use established benchmarks for each comparator category
- Power for: 50% or higher vs 30% (20% or greater difference)
- Dropout: Expect 5-10% screen failures in R/R population
- Consider subtype-stratified power (AITL vs ALCL-ALK- separately)
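For the single-arm-versus-historical-control design implied above (ORR 50% vs a 30% benchmark), the evaluable-patient count can be approximated with a standard one-sample normal-approximation formula. This is an illustrative calculation, not the trial's actual statistical analysis plan.

```python
from math import ceil, sqrt
from statistics import NormalDist

def one_sample_orr_n(p0=0.30, p1=0.50, alpha=0.025, power=0.80):
    """Evaluable patients for a single-arm ORR test against a historical
    control rate, using the normal approximation (one-sided alpha)."""
    z_a = NormalDist().inv_cdf(1 - alpha)  # ~1.96 for one-sided 2.5%
    z_b = NormalDist().inv_cdf(power)      # ~0.84 for 80% power
    num = z_a * sqrt(p0 * (1 - p0)) + z_b * sqrt(p1 * (1 - p1))
    return ceil((num / (p1 - p0)) ** 2)

n = one_sample_orr_n()  # ~44 evaluable patients for 50% vs 30% historical
```

Under these assumptions roughly 44 evaluable patients suffice, which is why registrational single-arm TCL trials enrolling 100-120 had headroom for attrition, secondary endpoints, and subtype-stratified analyses.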
Appendix C: Web Search Details
Tool Used: web_search_20250305 (Claude's native web search)
How It Works:
- Claude autonomously constructs search queries based on user intent
- Searches ClinicalTrials.gov, PubMed, and other medical sources
- Returns structured results with URLs and content
- Content is encrypted (only Claude can read it internally)
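As a sketch, enabling the tool on a Messages API request looks like the following. The type string matches the Technology Stack table; max_uses and allowed_domains are optional request fields and the values shown are illustrative assumptions.

```python
# Sketch: enabling Claude's native web search tool on a Messages API request.
web_search_tool = {
    "type": "web_search_20250305",
    "name": "web_search",
    "max_uses": 5,  # cap the number of searches per request
    "allowed_domains": ["clinicaltrials.gov", "pubmed.ncbi.nlm.nih.gov"],
}
# Passed as tools=[web_search_tool] in client.messages.create(...)
```

Restricting allowed_domains to the registries and literature sources above keeps retrieval focused on citable clinical evidence.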
Data Extracted:
- Trial metadata (NCT ID, sponsor, phase, status)
- Eligibility criteria
- Endpoints (primary/secondary)
- Sample sizes
- Published outcomes (ORR, PFS, OS)
Document Version: 1.0
Status: In Progress