---
title: ClinicalMatch AI
emoji: 🧬
colorFrom: indigo
colorTo: purple
sdk: docker
app_port: 7860
pinned: true
---

# ClinicalMatch AI: Precision Clinical Trial Matching & Recruitment Agent

**"Agents Assemble: Healthcare AI Endgame Challenge"** (Prompt Opinion platform)

Standards: **FHIR R4 · MCP · A2A**

> 80% of clinical trials fail to meet enrollment deadlines. 85% of eligible patients are never identified. This agent directly addresses that.

---

## What it does

ClinicalMatch AI is a full-stack AI agent that matches patients to recruiting clinical trials using a knowledge graph, real-time data from ClinicalTrials.gov, and structured clinical eligibility scoring.

**Key capabilities:**

| Feature | Description |
|---|---|
| **Eligibility Check** | Individual enters raw clinical data (age, labs in SI units, biomarkers), no patient ID required, and receives ranked, explainable trial matches |
| **Trial Finder** | Real-time search of ClinicalTrials.gov sorted by most recently updated; results auto-ingest into the knowledge graph |
| **Graph Intelligence** | Per-trial: eligible patient count, top biomarkers among matches, similar trials via graph-neighborhood walk |
| **A2A Pipeline** | 5-state orchestration (INGEST → PARSE → MATCH → SCORE → RECRUIT) for FHIR patient profiles |
| **Recruitment Hub** | Kanban board tracking patients through IDENTIFIED → ENROLLED; generates personalized outreach (PCP letter, patient email, social post) |
| **GraphRAG** | Natural language queries over the knowledge graph ("which patients are eligible for breast cancer trials?") |
| **MCP Server** | 6 tools callable by Prompt Opinion directly via stdio transport |
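
The five A2A states named in the table can be pictured as a linear state machine. The sketch below is illustrative only: `PipelineState`, `advance`, and `run_order` are hypothetical names, not taken from the project's code, and the real orchestrator may branch or retry.

```python
from enum import Enum
from typing import Optional


class PipelineState(Enum):
    """The five A2A states from the feature table, in pipeline order."""
    INGEST = 1
    PARSE = 2
    MATCH = 3
    SCORE = 4
    RECRUIT = 5


def advance(state: PipelineState) -> Optional[PipelineState]:
    """Return the next state, or None once RECRUIT has completed."""
    if state is PipelineState.RECRUIT:
        return None
    return PipelineState(state.value + 1)


def run_order() -> list[str]:
    """Walk the pipeline from INGEST to the end and collect state names."""
    names = []
    state: Optional[PipelineState] = PipelineState.INGEST
    while state is not None:
        names.append(state.name)
        state = advance(state)
    return names


print(" -> ".join(run_order()))
```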

---

## Architecture

```
            Prompt Opinion Platform
                       │  MCP Protocol (stdio)
                       ▼
┌────────────────────────────────────────────────────┐
│  MCP Server (mcp_server.py)                        │
│  find_trials · screen_patient · match_patient      │
│  generate_outreach · get_analytics · summarize     │
└──────────────────────┬─────────────────────────────┘
                       │  A2A Orchestration
                       ▼
┌────────────────────────────────────────────────────┐
│  FastAPI Backend (main.py, port 8000)              │
│  30+ REST endpoints                                │
├──────────┬────────────┬────────────┬───────────────┤
│ CT.gov   │ FHIR R4    │ Claude     │ Neo4j Graph   │
│ live API │ adapter    │ LLM        │ RAG + match   │
└──────────┴────────────┴────────────┴───────────────┘
                       │
                       ▼
┌────────────────────────────────────────────────────┐
│  Next.js 16 Frontend (port 3000)                   │
│  Trial Finder · Eligibility Check · Screening      │
│  Recruitment Hub · Dashboard · Map · GraphRAG      │
└────────────────────────────────────────────────────┘
                       │  Nginx (port 7860)
                       ▼
              HuggingFace Spaces
```

**Data sources (all free, no auth):**

| Source | Data |
|---|---|
| ClinicalTrials.gov v2 | Real recruiting NCT trials, sorted by recency |
| RxNorm (NIH) | Medication RxCUI codes |
| ICD-10 CM (NLM) | Cancer diagnosis codes |
| PubMed (NCBI) | Supporting literature PMIDs |
| OpenFDA | Drug labels and adverse events |
| Synthetic | 500 realistic patient profiles matched to real trials |

---

## Graph Knowledge Base

After seeding, the Neo4j graph contains:

| Node type | Count | Key properties |
|---|---|---|
| Patient | 500 | age, sex, ECOG, condition, city, biomarkers[], medications[] |
| Trial | ~250 | NCT ID, eligibility criteria, phase, last_updated |
| Diagnosis | ~130 | ICD-10 codes across 10 oncology conditions |
| Biomarker | 20 | HER2+/−, EGFR, ALK, BRCA1/2, MSI-H, FLT3, etc. |
| Medication | 16 | Trastuzumab, Pembrolizumab, Olaparib, etc. |
| StudySite | ~200 | lat/lon coordinates |
| **ELIGIBLE_FOR edges** | **~9,100** | score, linking patients to trials |

The graph grows passively: every Trial Finder search automatically upserts new Trial and StudySite nodes. Every Eligibility Check submission (with "Save to graph" enabled) adds a new Patient node with biomarker edges.
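
Upserting means a repeated search should not duplicate nodes. One common way to get that behavior in Neo4j is Cypher `MERGE`; the sketch below only builds a parameterized statement and does not run it. The property keys `nct_id` and `site_id`, the `CONDUCTED_AT` relationship, and the function name are assumptions for illustration, not the project's actual schema.

```python
def build_trial_upsert(nct_id: str, last_updated: str, site_id: str,
                       lat: float, lon: float) -> tuple[str, dict]:
    """Return a parameterized Cypher statement and its parameters.

    MERGE matches an existing node on its key property or creates it,
    so running the statement twice still leaves one Trial, one StudySite,
    and one relationship between them.
    """
    query = (
        "MERGE (t:Trial {nct_id: $nct_id}) "
        "MERGE (s:StudySite {site_id: $site_id}) "
        "MERGE (t)-[:CONDUCTED_AT]->(s) "  # hypothetical relationship type
        "SET t.last_updated = $last_updated, s.lat = $lat, s.lon = $lon"
    )
    params = {
        "nct_id": nct_id,
        "last_updated": last_updated,
        "site_id": site_id,
        "lat": lat,
        "lon": lon,
    }
    return query, params


# Placeholder identifiers, not real records.
query, params = build_trial_upsert("NCT00000000", "2024-01-01", "site-1", 0.0, 0.0)
print(query)
```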

---

## Clinical Eligibility Check (SI Units)

The `/intake` page accepts raw clinical data; no patient ID or account is required. Fields:

**Demographics:** Age (years), Sex, ECOG performance status (0–4), Disease stage (I–IV)

**Biomarker status (toggles):**

- Breast/Gynecologic: HER2+/−, ER+, PR+, BRCA1/2 mutation, Triple-Negative
- Lung (NSCLC): EGFR mutation, ALK, ROS1 rearrangement, PD-L1
- GI/Colorectal: MSI-High, KRAS wild-type, BRAF V600E
- Hematology: FLT3, IDH1/2, BCR-ABL

**Lab values (SI units):**

| Field | Unit | Conversion |
|---|---|---|
| Haemoglobin | g/dL | none |
| WBC | ×10⁹/L | none |
| ANC | ×10⁹/L | none |
| Platelets | ×10⁹/L | none |
| Creatinine | **μmol/L** | auto-converted ÷88.4 → mg/dL for trial text |
| eGFR | mL/min/1.73 m² | none |
| Bilirubin | **μmol/L** | auto-converted ÷17.1 → mg/dL for trial text |
| ALT / AST | U/L | none |
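
The two conversions in the table are plain divisions. A minimal sketch, using only the divisors stated above; the function and constant names are hypothetical, and the rounding to two decimals is an assumption.

```python
# Divisors from the lab table: umol/L divided by this value gives mg/dL.
UMOL_L_TO_MG_DL_DIVISOR = {"creatinine": 88.4, "bilirubin": 17.1}


def to_mg_dl(analyte: str, value_umol_l: float) -> float:
    """Convert an SI lab value (umol/L) to mg/dL, rounded to 2 decimals."""
    divisor = UMOL_L_TO_MG_DL_DIVISOR[analyte]
    return round(value_umol_l / divisor, 2)


print(to_mg_dl("creatinine", 88.4))  # 1.0
print(to_mg_dl("bilirubin", 34.2))   # 2.0
```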

Matching score breakdown (100 points total):

- **Age** (25 pts): compared against the trial's min/max age
- **Sex** (15 pts): compared against the trial's sex restriction
- **ECOG** (15 pts): extracted via regex from the eligibility criteria text
- **Biomarkers** (30 pts): checks whether biomarker terms appear in the trial eligibility text
- **Lab values** (15 pts): parses thresholds from the text, converts SI units, and checks patient values

Results are ranked by score, with a pass/fail/uncertain status per criterion and direct ClinicalTrials.gov links.
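
The weights above can be combined into a total and used for ranking as sketched below. The weights come from the list; everything else is an assumption made for illustration. In particular, the README does not say how an "uncertain" criterion is scored, so the half credit used here is a guess, and `total_score` and `rank` are hypothetical names.

```python
# Point weights from the score breakdown (they sum to 100).
WEIGHTS = {"age": 25, "sex": 15, "ecog": 15, "biomarkers": 30, "labs": 15}

# Assumption: "uncertain" earns half credit. The source does not specify this.
CREDIT = {"pass": 1.0, "uncertain": 0.5, "fail": 0.0}


def total_score(results: dict[str, str]) -> float:
    """Sum weighted credit over the per-criterion statuses."""
    return sum(WEIGHTS[name] * CREDIT[status] for name, status in results.items())


def rank(trials: dict[str, dict[str, str]]) -> list[tuple[str, float]]:
    """Return (trial_id, score) pairs sorted from best to worst."""
    scored = [(trial_id, total_score(res)) for trial_id, res in trials.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Placeholder trial identifiers, not real NCT numbers.
trials = {
    "TRIAL-B": {"age": "pass", "sex": "pass", "ecog": "uncertain",
                "biomarkers": "pass", "labs": "fail"},
    "TRIAL-A": {"age": "pass", "sex": "pass", "ecog": "pass",
                "biomarkers": "pass", "labs": "pass"},
}
print(rank(trials))  # [('TRIAL-A', 100.0), ('TRIAL-B', 77.5)]
```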

---

## Running Locally (without Docker Compose)

```bash
# 1. Start Neo4j (the only step here that uses Docker)
docker run -d --name neo4j -p 7474:7474 -p 7687:7687 -e NEO4J_AUTH=neo4j/clinicalmatch2024 neo4j:5.18-community

# 2. Backend (runs in the foreground; keep this terminal open)
cd backend
python -m venv venv && source venv/bin/activate && pip install -r requirements.txt
cp ../.env.example ../.env.local  # fill in credentials
uvicorn main:app --reload --port 8000

# 3. Schema setup (once, from a second terminal)
curl -X POST http://localhost:8000/setup

# 4. Seed graph data from live APIs (~15 min, ~250 real trials + 500 patients)
curl -X POST http://localhost:8000/seed

# 5. Frontend (new terminal, starting from the repository root)
cd frontend
npm install --legacy-peer-deps
npm run dev  # http://localhost:3000 (uses --webpack, not Turbopack)

# 6. MCP server for Prompt Opinion integration
#    (new terminal, starting from the repository root, with the backend venv active)
cd backend
python mcp_server.py
```

---

## Running with Docker Compose

```bash
cp .env.example .env.local  # fill in OPENAI_API_KEY etc.
docker compose up -d
# Wait ~60s for Neo4j to be healthy, then:
curl -X POST http://localhost:7860/setup
curl -X POST http://localhost:7860/seed
```

Services: app → http://localhost:7860 | API docs → http://localhost:7860/api/docs | Neo4j → http://localhost:7474

---

## Deploying to HuggingFace Spaces

1. Create a Space → **Docker SDK** → blank template
2. Push the repo to the Space:
   ```bash
   git remote add hf https://huggingface.co/spaces/<username>/<space-name>
   git push hf main
   ```
3. Set **Repository Secrets**:
   ```
   OPENAI_API_KEY = <aimlapi.com key>
   OPENAI_BASE_URL = https://ai.aimlapi.com/v1
   OPENAI_MODEL = claude-opus-4-7
   NEO4J_PASSWORD = clinicalmatch2024
   ```
4. After first boot, seed data:
   ```
   POST https://<space>.hf.space/seed
   ```

---

## MCP Tools (Prompt Opinion integration)

```bash
python backend/mcp_server.py  # stdio transport
```

| Tool | Arguments | Description |
|---|---|---|
| `find_trials` | `condition, phase?` | Real-time trial search |
| `screen_patient` | `patient_id, nct_id` | Eligibility screening |
| `match_patient_to_trials` | `patient_id` | Top-N trial matches |
| `generate_recruitment_outreach` | `patient_id, nct_id, channel` | Personalized outreach |
| `get_trial_analytics` | none | Enrollment funnel + KPIs |
| `summarize_trial_protocol` | `nct_id` | AI-parsed protocol summary |
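
For orientation, MCP clients invoke tools with a JSON-RPC 2.0 `tools/call` request, and the stdio transport carries one JSON message per line. The sketch below only builds such a message for `find_trials`; it does not start the server or perform the `initialize` handshake that a real client must complete first. The helper name and the argument value are illustrative, and the exact argument types this server expects are an assumption.

```python
import json


def tools_call_line(request_id: int, tool: str, arguments: dict) -> str:
    """Serialize an MCP tools/call request as one newline-terminated JSON line."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }
    # json.dumps emits no raw newlines by default, so the message stays on one line.
    return json.dumps(message) + "\n"


line = tools_call_line(1, "find_trials", {"condition": "breast cancer"})
print(line, end="")
```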

---

## Key API Endpoints

| Method | Path | Description |
|---|---|---|
| POST | `/api/v1/intake/match` | SI-unit intake → ranked trial matches |
| GET | `/api/v1/intake/biomarkers` | Biomarker registry |
| GET | `/api/v1/trials/search` | Real-time CT.gov search (recency-sorted, graph-enriched) |
| GET | `/api/v1/trials/{nct_id}/intelligence` | Graph intelligence per trial |
| GET | `/api/v1/graph/patients` | Query seeded patient IDs from Neo4j |
| POST | `/api/v1/patients/{id}/screen/{nct_id}` | Screen FHIR patient against trial |
| POST | `/api/v1/workflow/run` | Run full A2A pipeline |
| GET | `/api/v1/analytics/kpi` | Dashboard KPIs |
| GET | `/api/v1/map/data` | Site coordinates + patient clusters |
| POST | `/api/v1/graph/query` | GraphRAG natural language query |
| POST | `/seed` | Seed full graph from live APIs |
| GET | `/api/v1/graph/stats` | Node and edge counts |

Full interactive docs: `http://localhost:8000/docs`
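
As a rough illustration of calling the intake endpoint from a script, the sketch below builds (but does not send) a JSON `POST` to `/api/v1/intake/match` on the local backend port. The payload field names are hypothetical placeholders; the real request schema should be taken from the interactive docs.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # local backend port from the run instructions


def build_intake_request(payload: dict) -> urllib.request.Request:
    """Build a JSON POST request for the intake matching endpoint."""
    return urllib.request.Request(
        url=f"{BASE_URL}/api/v1/intake/match",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical field names, for illustration only.
example_payload = {"age": 54, "sex": "female", "ecog": 1, "creatinine_umol_l": 80.0}
request = build_intake_request(example_payload)
print(request.get_method(), request.full_url)

# To send it against a running backend:
# with urllib.request.urlopen(request) as response:
#     matches = json.load(response)
```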

---

## Environment Variables

| Variable | Description | Default |
|---|---|---|
| `NEO4J_URI` | Neo4j bolt URI | `bolt://localhost:7687` |
| `NEO4J_USERNAME` | Neo4j username | `neo4j` |
| `NEO4J_PASSWORD` | Neo4j password | `clinicalmatch2024` |
| `NEO4J_DATABASE` | Database name | `neo4j` |
| `OPENAI_API_KEY` | aimlapi.com API key | none |
| `OPENAI_BASE_URL` | LLM base URL | `https://ai.aimlapi.com/v1` |
| `OPENAI_MODEL` | Model identifier | `claude-opus-4-7` |
| `NEXT_PUBLIC_API_URL` | Frontend API base URL | `""` (relative, via Nginx) |
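
A minimal sketch of how backend code could resolve these variables with the documented defaults. The `load_settings` helper is hypothetical, and treating a missing `OPENAI_API_KEY` as an error is an assumption (the table only says it has no default).

```python
import os
from typing import Mapping, Optional

# Defaults copied from the table above (backend variables only).
DEFAULTS = {
    "NEO4J_URI": "bolt://localhost:7687",
    "NEO4J_USERNAME": "neo4j",
    "NEO4J_PASSWORD": "clinicalmatch2024",
    "NEO4J_DATABASE": "neo4j",
    "OPENAI_BASE_URL": "https://ai.aimlapi.com/v1",
    "OPENAI_MODEL": "claude-opus-4-7",
}


def load_settings(env: Optional[Mapping[str, str]] = None) -> dict:
    """Read settings from the environment, falling back to documented defaults."""
    env = os.environ if env is None else env
    settings = {name: env.get(name, default) for name, default in DEFAULTS.items()}
    api_key = env.get("OPENAI_API_KEY")
    if not api_key:
        # Assumption: the key is mandatory because no default is documented.
        raise RuntimeError("OPENAI_API_KEY is not set and has no default")
    settings["OPENAI_API_KEY"] = api_key
    return settings


# Explicit mapping instead of os.environ so the example runs anywhere.
settings = load_settings({"OPENAI_API_KEY": "example-key", "NEO4J_DATABASE": "trials"})
print(settings["NEO4J_URI"], settings["NEO4J_DATABASE"])
```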

---

## Frontend Pages

| Route | Page | Description |
|---|---|---|
| `/` | Trial Finder | Real-time CT.gov search, recency-sorted, graph intelligence on expand |
| `/intake` | Eligibility Check | SI-unit clinical intake form, no patient ID required |
| `/screening` | Patient Screening | FHIR patient + trial combobox, A2A pipeline with state tracker |
| `/recruitment` | Recruitment Hub | Kanban board, AI outreach generation (PCP / email / social) |
| `/dashboard` | Dashboard | KPI cards, enrollment funnel, demographics, site performance |
| `/map` | Site Map | Leaflet map of trial sites and patient density clusters |
| `/graph` | GraphRAG | Natural language queries over the knowledge graph |