title: ClinicalMatch AI
emoji: 🧬
colorFrom: indigo
colorTo: purple
sdk: docker
app_port: 7860
pinned: true
ClinicalMatch AI: Precision Clinical Trial Matching & Recruitment Agent
"Agents Assemble: Healthcare AI Endgame Challenge" (Prompt Opinion platform)
Standards: FHIR R4 · MCP · A2A
80% of clinical trials fail to meet enrollment deadlines. 85% of eligible patients are never identified. This agent directly addresses that.
What it does
ClinicalMatch AI is a full-stack AI agent that matches patients to recruiting clinical trials using a knowledge graph, real-time data from ClinicalTrials.gov, and structured clinical eligibility scoring.
Key capabilities:
| Feature | Description |
|---|---|
| Eligibility Check | An individual enters raw clinical data (age, labs in SI units, biomarkers), with no patient ID required, and receives ranked, explainable trial matches |
| Trial Finder | Real-time search of ClinicalTrials.gov sorted by most recently updated; results auto-ingest into the knowledge graph |
| Graph Intelligence | Per-trial: eligible patient count, top biomarkers among matches, similar trials via graph-neighborhood walk |
| A2A Pipeline | 5-state orchestration (INGEST → PARSE → MATCH → SCORE → RECRUIT) for FHIR patient profiles |
| Recruitment Hub | Kanban board tracking patients from IDENTIFIED through ENROLLED; generates personalized outreach (PCP letter, patient email, social post) |
| GraphRAG | Natural language queries over the knowledge graph ("which patients are eligible for breast cancer trials?") |
| MCP Server | 6 tools callable by Prompt Opinion directly via stdio transport |
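The five A2A states listed in the table run in a fixed order. A minimal sketch of that ordering follows; `PipelineState`, `next_state`, and `run_pipeline` are illustrative names rather than the project's actual code, and the per-state handlers are placeholders supplied by the caller.

```python
from enum import Enum
from typing import Callable, Dict, List, Optional, Tuple


class PipelineState(Enum):
    """The five A2A states named in the table, in execution order."""
    INGEST = 1
    PARSE = 2
    MATCH = 3
    SCORE = 4
    RECRUIT = 5


def next_state(state: PipelineState) -> Optional[PipelineState]:
    """Return the state that follows `state`, or None after RECRUIT."""
    members = list(PipelineState)
    index = members.index(state)
    return members[index + 1] if index + 1 < len(members) else None


def run_pipeline(
    handlers: Dict[PipelineState, Callable[[dict], dict]],
    context: dict,
) -> Tuple[List[str], dict]:
    """Visit every state in order, passing a context dict through the handlers.

    Hypothetical driver: a state without a handler is still visited and logged.
    """
    visited: List[str] = []
    state: Optional[PipelineState] = PipelineState.INGEST
    while state is not None:
        handler = handlers.get(state)
        if handler is not None:
            context = handler(context)
        visited.append(state.name)
        state = next_state(state)
    return visited, context
```

Keeping the order in the enum itself means a state tracker in the UI and the backend driver can share one definition of the sequence.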
Architecture
                Prompt Opinion Platform
                           │  MCP Protocol (stdio)
                           ▼
    ┌────────────────────────────────────────────────────┐
    │  MCP Server (mcp_server.py)                        │
    │  find_trials · screen_patient · match_patient      │
    │  generate_outreach · get_analytics · summarize     │
    └──────────────────────┬─────────────────────────────┘
                           │  A2A Orchestration
                           ▼
    ┌────────────────────────────────────────────────────┐
    │  FastAPI Backend (main.py, port 8000)              │
    │  30+ REST endpoints                                │
    ├──────────┬────────────┬────────────┬───────────────┤
    │ CT.gov   │ FHIR R4    │ Claude     │ Neo4j Graph   │
    │ live API │ adapter    │ LLM        │ RAG + match   │
    └──────────┴────────────┴────────────┴───────────────┘
                           │
                           ▼
    ┌────────────────────────────────────────────────────┐
    │  Next.js 16 Frontend (port 3000)                   │
    │  Trial Finder · Eligibility Check · Screening      │
    │  Recruitment Hub · Dashboard · Map · GraphRAG      │
    └────────────────────────────────────────────────────┘
                           │  Nginx (port 7860)
                           ▼
                  HuggingFace Spaces
Data sources (all free, no auth):
| Source | Data |
|---|---|
| ClinicalTrials.gov v2 | Real recruiting NCT trials, sorted by recency |
| RxNorm (NIH) | Medication RxCUI codes |
| ICD-10 CM (NLM) | Cancer diagnosis codes |
| PubMed (NCBI) | Supporting literature PMIDs |
| OpenFDA | Drug labels and adverse events |
| Synthetic | 500 realistic patient profiles matched to real trials |
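As an illustration of the "recruiting, sorted by recency" query against ClinicalTrials.gov v2, the sketch below only builds the request URL. The parameter names (`query.cond`, `filter.overallStatus`, `sort`, `pageSize`) reflect the public v2 API as understood here and should be checked against its documentation; `build_trial_search_url` is a hypothetical helper, and the project's own query may differ.

```python
from urllib.parse import urlencode

# Public ClinicalTrials.gov v2 studies endpoint.
CTGOV_STUDIES_URL = "https://clinicaltrials.gov/api/v2/studies"


def build_trial_search_url(condition: str, page_size: int = 20) -> str:
    """Build a URL for recruiting trials on `condition`, newest update first."""
    params = {
        "query.cond": condition,                # free-text condition search
        "filter.overallStatus": "RECRUITING",   # only trials currently recruiting
        "sort": "LastUpdatePostDate:desc",      # most recently updated first
        "pageSize": page_size,
    }
    return f"{CTGOV_STUDIES_URL}?{urlencode(params)}"
```

Calling `build_trial_search_url("breast cancer")` returns a URL that can be fetched with any HTTP client; no API key is needed for this source.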
Graph Knowledge Base
After seeding, the Neo4j graph contains:
| Node type | Count | Key properties |
|---|---|---|
| Patient | 500 | age, sex, ECOG, condition, city, biomarkers[], medications[] |
| Trial | ~250 | NCT ID, eligibility criteria, phase, last_updated |
| Diagnosis | ~130 | ICD-10 codes across 10 oncology conditions |
| Biomarker | 20 | HER2+/−, EGFR, ALK, BRCA1/2, MSI-H, FLT3, etc. |
| Medication | 16 | Trastuzumab, Pembrolizumab, Olaparib, etc. |
| StudySite | ~200 | lat/lon coordinates |
| ELIGIBLE_FOR edges | ~9,100 | score, linking patients to trials |
The graph grows passively: every Trial Finder search automatically upserts new Trial and StudySite nodes, and every Eligibility Check submission (with "Save to graph" enabled) adds a new Patient node with biomarker edges.
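An upsert of this kind is typically written in Cypher with `MERGE`, which creates a node only when no match exists. The sketch below builds such a query and its parameters without connecting to a database. The `Trial` and `StudySite` labels come from the table above; the property names (`nct_id`, `name`, `city`) and the `CONDUCTED_AT` relationship are assumptions made for illustration, not the project's actual schema.

```python
from typing import Dict, List, Tuple

# Hypothetical upsert query: MERGE keeps repeated searches from duplicating nodes.
UPSERT_TRIAL_CYPHER = """
MERGE (t:Trial {nct_id: $nct_id})
SET t.phase = $phase, t.last_updated = $last_updated
WITH t
UNWIND $sites AS site
MERGE (s:StudySite {name: site.name, city: site.city})
SET s.lat = site.lat, s.lon = site.lon
MERGE (t)-[:CONDUCTED_AT]->(s)
"""


def build_trial_upsert(trial: Dict, sites: List[Dict]) -> Tuple[str, Dict]:
    """Return the Cypher text and parameter map for one trial and its sites."""
    params = {
        "nct_id": trial["nct_id"],
        "phase": trial.get("phase"),
        "last_updated": trial.get("last_updated"),
        "sites": sites,
    }
    return UPSERT_TRIAL_CYPHER, params
```

With the official Neo4j Python driver, the returned pair could be passed to `session.run(query, params)`; running the same search twice then leaves the node counts unchanged.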
Clinical Eligibility Check (SI Units)
The /intake page accepts raw clinical data; no patient ID or account is required. Fields:
Demographics: Age (years), Sex, ECOG performance status (0–4), Disease stage (I–IV)
Biomarker status (toggles):
- Breast/Gynecologic: HER2+/−, ER+, PR+, BRCA1/2 mutation, Triple-Negative
- Lung (NSCLC): EGFR mutation, ALK, ROS1 rearrangement, PD-L1
- GI/Colorectal: MSI-High, KRAS wild-type, BRAF V600E
- Hematology: FLT3, IDH1/2, BCR-ABL
Lab values (SI units):
| Field | Unit | Conversion |
|---|---|---|
| Haemoglobin | g/dL | none |
| WBC | ×10⁹/L | none |
| ANC | ×10⁹/L | none |
| Platelets | ×10⁹/L | none |
| Creatinine | μmol/L | auto-converted ÷88.4 → mg/dL for trial text |
| eGFR | mL/min/1.73 m² | none |
| Bilirubin | μmol/L | auto-converted ÷17.1 → mg/dL for trial text |
| ALT / AST | U/L | none |
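The two conversions in the table amount to a single division each. A minimal sketch, using the divisors quoted above; the function names and the upper-limit check are illustrative and not taken from the project's code.

```python
# Divisors from the table: μmol/L divided by this value gives mg/dL.
UMOL_PER_L_TO_MG_PER_DL = {
    "creatinine": 88.4,
    "bilirubin": 17.1,
}


def si_to_mg_per_dl(analyte: str, value_umol_per_l: float) -> float:
    """Convert a creatinine or bilirubin value from μmol/L to mg/dL."""
    try:
        divisor = UMOL_PER_L_TO_MG_PER_DL[analyte.lower()]
    except KeyError:
        raise ValueError(f"no conversion defined for {analyte!r}") from None
    return value_umol_per_l / divisor


def within_upper_limit(analyte: str, value_umol_per_l: float,
                       limit_mg_per_dl: float) -> bool:
    """Check an SI patient value against a trial threshold stated in mg/dL."""
    return si_to_mg_per_dl(analyte, value_umol_per_l) <= limit_mg_per_dl
```

For example, a creatinine of 120 μmol/L is about 1.36 mg/dL, so it would satisfy a trial criterion of "creatinine ≤ 1.5 mg/dL".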
Matching score breakdown:
- Age (25 pts): compared against trial min/max age
- Sex (15 pts): compared against trial sex restriction
- ECOG (15 pts): extracted via regex from eligibility criteria text
- Biomarkers (30 pts): checks whether biomarker terms appear in trial eligibility text
- Lab values (15 pts): parses thresholds from text, converts SI units, checks patient values
Results are ranked by score with pass/fail/uncertain per criterion and direct ClinicalTrials.gov links.
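The weights above sum to 100, so a trial that passes every criterion scores 100. The sketch below shows one way to combine per-criterion outcomes into a score and a ranking. How the real scorer treats an "uncertain" criterion is not stated here; the half-credit default is an assumption, as are the function names.

```python
from typing import Dict, List, Optional, Tuple

# Point weights from the score breakdown (25 + 15 + 15 + 30 + 15 = 100).
WEIGHTS: Dict[str, int] = {
    "age": 25, "sex": 15, "ecog": 15, "biomarkers": 30, "labs": 15,
}


def score_match(
    checks: Dict[str, Optional[bool]],
    uncertain_credit: float = 0.5,  # ASSUMPTION: partial credit when uncertain
) -> Tuple[float, Dict[str, str]]:
    """Score one trial. True = pass, False = fail, None or missing = uncertain."""
    score = 0.0
    statuses: Dict[str, str] = {}
    for criterion, weight in WEIGHTS.items():
        outcome = checks.get(criterion)
        if outcome is True:
            score += weight
            statuses[criterion] = "pass"
        elif outcome is False:
            statuses[criterion] = "fail"
        else:
            score += weight * uncertain_credit
            statuses[criterion] = "uncertain"
    return score, statuses


def rank_trials(
    per_trial_checks: Dict[str, Dict[str, Optional[bool]]],
) -> List[Tuple[str, float]]:
    """Return (trial ID, score) pairs sorted from best to worst match."""
    scored = [(nct_id, score_match(checks)[0])
              for nct_id, checks in per_trial_checks.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)
```

Returning the per-criterion statuses alongside the number is what makes a match explainable: the UI can show which criterion cost the points.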
Running Locally (without Docker Compose)
# 1. Start Neo4j
docker run -d --name neo4j -p 7474:7474 -p 7687:7687 -e NEO4J_AUTH=neo4j/clinicalmatch2024 neo4j:5.18-community
# 2. Backend
cd backend
python -m venv venv && source venv/bin/activate && pip install -r requirements.txt
cp ../.env.example ../.env.local # fill in credentials
uvicorn main:app --reload --port 8000
# 3. Schema setup (once)
curl -X POST http://localhost:8000/setup
# 4. Seed graph data from live APIs (~15 min, ~250 real trials + 500 patients)
curl -X POST http://localhost:8000/seed
# 5. Frontend (in a new terminal, from the repo root)
cd frontend
npm install --legacy-peer-deps
npm run dev # http://localhost:3000 (uses --webpack, not Turbopack)
# 6. MCP server (for Prompt Opinion integration; in a new terminal, from the repo root)
cd backend
python mcp_server.py
Running with Docker Compose
cp .env.example .env.local # fill in OPENAI_API_KEY etc.
docker compose up -d
# Wait ~60s for Neo4j to be healthy, then:
curl -X POST http://localhost:7860/setup
curl -X POST http://localhost:7860/seed
Services: app → http://localhost:7860 | API docs → http://localhost:7860/api/docs | Neo4j → http://localhost:7474
Deploying to HuggingFace Spaces
- Create a Space → Docker SDK → blank template
- Push the repo to the Space:
      git remote add hf https://huggingface.co/spaces/<username>/<space-name>
      git push hf main
- Set Repository Secrets:
      OPENAI_API_KEY = <aimlapi.com key>
      OPENAI_BASE_URL = https://ai.aimlapi.com/v1
      OPENAI_MODEL = claude-opus-4-7
      NEO4J_PASSWORD = clinicalmatch2024
- After first boot, seed data:
      POST https://<space>.hf.space/seed
MCP Tools (Prompt Opinion integration)
python backend/mcp_server.py # stdio transport
| Tool | Arguments | Description |
|---|---|---|
| find_trials | condition, phase? | Real-time trial search |
| screen_patient | patient_id, nct_id | Eligibility screening |
| match_patient_to_trials | patient_id | Top-N trial matches |
| generate_recruitment_outreach | patient_id, nct_id, channel | Personalized outreach |
| get_trial_analytics | (none) | Enrollment funnel + KPIs |
| summarize_trial_protocol | nct_id | AI-parsed protocol summary |
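Over the stdio transport, an MCP client invokes a tool by writing a JSON-RPC 2.0 `tools/call` request as a single line on the server's standard input. The sketch below builds such a message for `find_trials`; `build_tool_call` is a hypothetical helper, and a real client would complete the MCP `initialize` handshake before sending it.

```python
import json
from itertools import count

# Monotonically increasing JSON-RPC request IDs for this client session.
_request_ids = count(1)


def build_tool_call(name: str, arguments: dict) -> str:
    """Serialize one MCP tools/call request as a single line of JSON."""
    message = {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }
    # json.dumps emits no raw newlines, so the result is safe to send
    # as one newline-delimited stdio message.
    return json.dumps(message)


# Example: search for trials by condition (the optional `phase` is omitted).
request_line = build_tool_call("find_trials", {"condition": "breast cancer"})
```

In practice an MCP client library handles this framing; the message shape is shown only to make clear what "callable via stdio" means for the six tools.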
Key API Endpoints
| Method | Path | Description |
|---|---|---|
| POST | /api/v1/intake/match | SI-unit intake → ranked trial matches |
| GET | /api/v1/intake/biomarkers | Biomarker registry |
| GET | /api/v1/trials/search | Real-time CT.gov search (recency-sorted, graph-enriched) |
| GET | /api/v1/trials/{nct_id}/intelligence | Graph intelligence per trial |
| GET | /api/v1/graph/patients | Query seeded patient IDs from Neo4j |
| POST | /api/v1/patients/{id}/screen/{nct_id} | Screen FHIR patient against trial |
| POST | /api/v1/workflow/run | Run full A2A pipeline |
| GET | /api/v1/analytics/kpi | Dashboard KPIs |
| GET | /api/v1/map/data | Site coordinates + patient clusters |
| POST | /api/v1/graph/query | GraphRAG natural language query |
| POST | /seed | Seed full graph from live APIs |
| GET | /api/v1/graph/stats | Node and edge counts |
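As a sketch of calling the intake endpoint from Python, the helper below prepares a POST to /api/v1/intake/match using only the standard library. The payload field names (`age`, `sex`, `ecog`, `biomarkers`, `labs`) are assumptions for illustration; the actual request schema is defined by the backend and shown in its interactive docs.

```python
import json
from urllib.request import Request


def build_intake_request(base_url: str, intake: dict) -> Request:
    """Prepare (but do not send) a JSON POST to the intake matching endpoint."""
    return Request(
        url=f"{base_url.rstrip('/')}/api/v1/intake/match",
        data=json.dumps(intake).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical payload: field names are illustrative, not the real schema.
example_intake = {
    "age": 54,
    "sex": "female",
    "ecog": 1,
    "biomarkers": ["HER2+"],
    "labs": {"creatinine_umol_l": 80.0, "bilirubin_umol_l": 12.0},
}
request = build_intake_request("http://localhost:8000", example_intake)
```

Once the backend is running, passing `request` to `urllib.request.urlopen` sends it; the response body is expected to contain the ranked matches.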
Full interactive docs: http://localhost:8000/docs
Environment Variables
| Variable | Description | Default |
|---|---|---|
| NEO4J_URI | Neo4j bolt URI | bolt://localhost:7687 |
| NEO4J_USERNAME | Neo4j username | neo4j |
| NEO4J_PASSWORD | Neo4j password | clinicalmatch2024 |
| NEO4J_DATABASE | Database name | neo4j |
| OPENAI_API_KEY | aimlapi.com API key | (none) |
| OPENAI_BASE_URL | LLM base URL | https://ai.aimlapi.com/v1 |
| OPENAI_MODEL | Model identifier | claude-opus-4-7 |
| NEXT_PUBLIC_API_URL | Frontend API base URL | "" (relative, via Nginx) |
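A minimal sketch of how the backend variables and defaults in the table could be read at startup. The `Settings` dataclass and `load_settings` function are illustrative names, not the project's actual configuration code; `NEXT_PUBLIC_API_URL` is left out because it is consumed by the frontend.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    neo4j_uri: str
    neo4j_username: str
    neo4j_password: str
    neo4j_database: str
    openai_api_key: Optional[str]  # no default: must be supplied
    openai_base_url: str
    openai_model: str


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read backend settings, falling back to the defaults listed in the table."""
    return Settings(
        neo4j_uri=env.get("NEO4J_URI", "bolt://localhost:7687"),
        neo4j_username=env.get("NEO4J_USERNAME", "neo4j"),
        neo4j_password=env.get("NEO4J_PASSWORD", "clinicalmatch2024"),
        neo4j_database=env.get("NEO4J_DATABASE", "neo4j"),
        openai_api_key=env.get("OPENAI_API_KEY"),
        openai_base_url=env.get("OPENAI_BASE_URL", "https://ai.aimlapi.com/v1"),
        openai_model=env.get("OPENAI_MODEL", "claude-opus-4-7"),
    )
```

Accepting the environment as a parameter keeps the loader easy to exercise with a plain dict, for example `load_settings({"NEO4J_URI": "bolt://neo4j:7687"})`.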
Frontend Pages
| Route | Page | Description |
|---|---|---|
| / | Trial Finder | Real-time CT.gov search, recency-sorted, graph intelligence on expand |
| /intake | Eligibility Check | SI-unit clinical intake form, no patient ID required |
| /screening | Patient Screening | FHIR patient + trial combobox, A2A pipeline with state tracker |
| /recruitment | Recruitment Hub | Kanban board, AI outreach generation (PCP / email / social) |
| /dashboard | Dashboard | KPI cards, enrollment funnel, demographics, site performance |
| /map | Site Map | Leaflet map of trial sites and patient density clusters |
| /graph | GraphRAG | Natural language queries over the knowledge graph |