Spaces:
Running
Running
A newer version of the Gradio SDK is available:
6.1.0
ClinicalTrials.gov Tool: Current State & Future Improvements
Status: Currently Implemented Priority: High (Core Data Source for Drug Repurposing)
Current Implementation
What We Have (src/tools/clinicaltrials.py)
- V2 API search via
clinicaltrials.gov/api/v2/studies - Filters:
INTERVENTIONALstudy type,RECRUITINGstatus - Returns: NCT ID, title, conditions, interventions, phase, status
- Query preprocessing via shared
query_utils.py
Current Strengths
- Good Filtering: Already filtering for interventional + recruiting
- V2 API: Using the modern API (v1 deprecated)
- Phase Info: Extracting trial phases for drug development context
Current Limitations
- No Outcome Data: Missing primary/secondary outcomes
- No Eligibility Criteria: Missing inclusion/exclusion details
- No Sponsor Info: Missing who's running the trial
- No Result Data: For completed trials, no efficacy data
- Limited Drug Mapping: No integration with drug databases
API Capabilities We're Not Using
Fields We Could Request
# Current fields
fields = ["NCTId", "BriefTitle", "Condition", "InterventionName", "Phase", "OverallStatus"]
# Additional valuable fields
additional_fields = [
"PrimaryOutcomeMeasure", # What are they measuring?
"SecondaryOutcomeMeasure", # Secondary endpoints
"EligibilityCriteria", # Who can participate?
"LeadSponsorName", # Who's funding?
"ResultsFirstPostDate", # Has results?
"StudyFirstPostDate", # When started?
"CompletionDate", # When finished?
"EnrollmentCount", # Sample size
"InterventionDescription", # Drug details
"ArmGroupLabel", # Treatment arms
"InterventionOtherName", # Drug aliases
]
Filter Enhancements
# Current
aggFilters = "studyType:INTERVENTIONAL,status:RECRUITING"
# Could add
"status:RECRUITING,ACTIVE_NOT_RECRUITING,COMPLETED" # Include completed for results
"phase:PHASE2,PHASE3" # Only later-stage trials
"resultsFirstPostDateRange:2020-01-01_" # Trials with posted results
Recommended Improvements
Phase 1: Richer Metadata
EXTENDED_FIELDS = [
"NCTId",
"BriefTitle",
"OfficialTitle",
"Condition",
"InterventionName",
"InterventionDescription",
"InterventionOtherName", # Drug synonyms!
"Phase",
"OverallStatus",
"PrimaryOutcomeMeasure",
"EnrollmentCount",
"LeadSponsorName",
"StudyFirstPostDate",
]
Phase 2: Results Retrieval
For completed trials, we can get actual efficacy data:
async def get_trial_results(nct_id: str) -> dict | None:
"""Fetch results for completed trials."""
url = f"https://clinicaltrials.gov/api/v2/studies/{nct_id}"
params = {
"fields": "ResultsSection",
}
# Returns outcome measures and statistics
Phase 3: Drug Name Normalization
Map intervention names to standard identifiers:
# Problem: "Metformin", "Metformin HCl", "Glucophage" are the same drug
# Solution: Use RxNorm or DrugBank for normalization
async def normalize_drug_name(intervention: str) -> str:
"""Normalize drug name via RxNorm API."""
url = f"https://rxnav.nlm.nih.gov/REST/rxcui.json?name={intervention}"
# Returns standardized RxCUI
Integration Opportunities
With PubMed
Cross-reference trials with publications:
# ClinicalTrials.gov provides PMID links
# Can correlate trial results with published papers
With DrugBank/ChEMBL
Map interventions to:
- Mechanism of action
- Known targets
- Adverse effects
- Drug-drug interactions
Python Libraries to Consider
| Library | Purpose | Notes |
|---|---|---|
| pytrials | CT.gov wrapper | V2 API support unclear |
| clinicaltrials | Data tracking | More for analysis |
| drugbank-downloader | Drug mapping | Requires license |
API Quirks & Gotchas
- Rate Limiting: Undocumented, be conservative
- Pagination: Max 1000 results per request
- Field Names: Case-sensitive, camelCase
- Empty Results: Some fields may be null even if requested
- Status Changes: Trials change status frequently
Example Enhanced Query
async def search_drug_repurposing_trials(
drug_name: str,
condition: str,
include_completed: bool = True,
) -> list[Evidence]:
"""Search for trials repurposing a drug for a new condition."""
statuses = ["RECRUITING", "ACTIVE_NOT_RECRUITING"]
if include_completed:
statuses.append("COMPLETED")
params = {
"query.intr": drug_name,
"query.cond": condition,
"filter.overallStatus": ",".join(statuses),
"filter.studyType": "INTERVENTIONAL",
"fields": ",".join(EXTENDED_FIELDS),
"pageSize": 50,
}