Spaces:
Running
ClinicalMatch AI β Agent Instructions
Project memory (build state, completed features, constraints) is also tracked in
.claude/project_memory.mdin this repo.
This is a hackathon submission for "Agents Assemble: Healthcare AI Endgame Challenge" on the Prompt Opinion platform. Judging criteria: MCP compliance, A2A workflow, FHIR R4 standards, AI quality, impact, feasibility.
Stack at a glance
| Layer | Technology |
|---|---|
| Backend | FastAPI (Python 3.12), uvicorn |
| Graph DB | Neo4j Community 5.x via bolt |
| LLM | claude-opus-4-7 via aimlapi.com (OpenAI-compatible) |
| GraphRAG | LangChain GraphCypherQAChain + custom Cypher prompt |
| Frontend | Next.js 16 (webpack mode), React 19, Tailwind CSS 3, Recharts, Leaflet |
| Standards | FHIR R4 Β· MCP (stdio) Β· A2A state machine |
Critical: LLM API
Never use the Anthropic SDK directly. All LLM calls go through aimlapi.com or a compatible alternative using the OpenAI-compatible interface:
from openai import OpenAI
client = OpenAI(
api_key=os.getenv("OPENAI_API_KEY"),
base_url=os.getenv("OPENAI_BASE_URL", "https://ai.aimlapi.com/v1"),
)
model = os.getenv("OPENAI_MODEL", "claude-opus-4-7")
See backend/llm_client.py for the canonical pattern. Do not add import anthropic anywhere.
Starting the services
# Backend β always use --reload for hot reload
cd backend && source venv/bin/activate
uvicorn main:app --reload --port 8000
# Frontend β always use --webpack (Turbopack is broken on this system)
cd frontend && npm run dev # runs: next dev --webpack
# MCP server (separate process, stdio transport)
cd backend && python mcp_server.py
# Seed graph data (~15 min first run)
curl -X POST http://localhost:8000/seed
After changing backend Python files, uvicorn --reload should pick them up. If a 404 appears for a newly-added endpoint or old errors persist, the server needs a manual restart β kill the process and re-run the uvicorn command.
Project layout
promptop/
βββ CLAUDE.md β you are here
βββ README.md β user-facing docs
βββ backend/
β βββ main.py β FastAPI app, all routes
β βββ clinicaltrials_api.py β ClinicalTrials.gov v2 API (async + sync)
β βββ intake_matching.py β SI-unit clinical intake β trial scoring
β βββ trial_enrichment.py β passive graph enrichment on search
β βββ matching_engine.py β FHIR patient β trial scoring (LLM-assisted)
β βββ a2a_workflow.py β A2A state machine (INGESTβPARSEβMATCHβSCOREβRECRUIT)
β βββ graphrag.py β LangChain GraphCypherQAChain with custom prompt
β βββ graph_seeder.py β seeds 500 patients + real NCT trials from APIs
β βββ fhir_adapter.py β FHIR R4 patient models (P001βP005 mock patients)
β βββ neo4j_setup.py β Neo4j connection + schema setup
β βββ analytics.py β dashboard KPIs, funnel, demographics, map data
β βββ recruitment_pipeline.py β kanban board, outreach generation
β βββ llm_client.py β all LLM calls (aimlapi.com / claude-opus-4-7)
β βββ mcp_server.py β MCP stdio server (6 tools)
β βββ requirements.txt
βββ frontend/
β βββ src/app/
β β βββ page.tsx β Trial Finder (real-time CT.gov, recency sort)
β β βββ intake/page.tsx β Eligibility Check (SI-unit clinical intake form)
β β βββ screening/page.tsx β Patient Screening (A2A pipeline, FHIR patients)
β β βββ recruitment/page.tsxβ Recruitment Hub (kanban + outreach generation)
β β βββ dashboard/page.tsx β Analytics dashboard (Recharts)
β β βββ map/page.tsx β Leaflet site map
β β βββ graph/page.tsx β GraphRAG natural language query
β β βββ layout.tsx β App shell with Sidebar
β βββ src/components/
β β βββ Sidebar.tsx β Navigation sidebar
β β βββ MapComponent.tsx β Raw Leaflet map (no react-leaflet SSR issues)
β βββ src/lib/api.ts β Typed API client for all backend endpoints
β βββ next.config.ts β webpack mode, filesystem cache, optimizePackageImports
βββ docker/ β Docker + Nginx for HuggingFace Spaces deployment
Neo4j graph schema
(Patient) id, name, age, sex, ecog, condition, city, state, ethnicity,
biomarkers[], medications[], source, stage
(Trial) id (NCT), title, condition, phase, status, sponsor,
eligibility_criteria, min_age, max_age, sex, enrollment,
start_date, completion_date, last_updated, ctgov_url
(Diagnosis) id, name, icd10
(Biomarker) id (e.g. HER2_POS), name (e.g. "HER2 Positive")
(Medication) id (e.g. TAMOXIFEN), name
(StudySite) id, name, city, state, lat, lon, trials, enrolled, capacity
Relationships:
(Patient)-[:ELIGIBLE_FOR {score}]->(Trial)
(Patient)-[:HAS_DIAGNOSIS]->(Diagnosis)
(Patient)-[:HAS_BIOMARKER]->(Biomarker)
(Patient)-[:TAKES_MEDICATION]->(Medication)
(Trial)-[:LOCATED_AT]->(StudySite)
Graph scale after seeding: ~500 patients, ~250 trials, ~9,100 ELIGIBLE_FOR edges.
Patient IDs from seeder: P_C50_0001 (breast), P_C61_0001 (prostate), etc.
Mock FHIR patients: P001βP005 (used by screening/workflow pages).
Key backend modules
clinicaltrials_api.py
search_trials()β async,sort=LastUpdatePostDate:descget_trial_details()β asyncsearch_trials_sync()/get_trial_details_sync()β sync usinghttpx.Client(NOTasyncio.run()). Safe to call from both sync functions and FastAPI async handlers._normalize_study()β extractslast_updated,ctgov_urlin addition to core fields.
Do not use asyncio.run() inside these sync wrappers β it breaks when called from a running FastAPI event loop. The sync wrappers use httpx.Client directly.
intake_matching.py
Implements SI-unit clinical intake β trial eligibility matching without requiring a patient ID:
BIOMARKER_REGISTRYβ maps graph node IDs to labels and eligibility text search termsscore_intake_against_trial()β weighted scoring: age (25), sex (15), ECOG (15), biomarkers (30), labs (15)_check_labs()β parses thresholds from eligibility criteria text, converts SI units (creatinine ΞΌmol/L β mg/dL, bilirubin ΞΌmol/L β mg/dL)save_intake_as_patient()β persists intake asPatientnode for long-term graph enrichment
trial_enrichment.py
enrich_trials_from_search()β called as aBackgroundTaskon every/api/v1/trials/searchresponse; upserts Trial + StudySite nodesget_eligible_patient_counts()β batch graph query, returns{nct_id: count}get_graph_intelligence()β per-trial: eligible count + top biomarkers + similar trials
graphrag.py
Uses a custom _CYPHER_PROMPT with explicit schema examples. Critical rules in the prompt:
- Biomarker lookups use
idproperty ({id: 'HER2_POS'}), never{name: 'HER2', status: 'positive'} - Condition lookups use lowercase on Trial nodes
- Patient eligibility always via
(Patient)-[:ELIGIBLE_FOR]->(Trial)
a2a_workflow.py
Five-state machine: INGESTING β PARSING_PROTOCOL β MATCHING β SCORING β RECRUITING
- Calls
search_trials_sync()/get_trial_details_sync()β these are safe (use httpx.Client) run_pipeline()is synchronous; called from async FastAPI endpoint withoutawait
Key frontend pages
/intake β Eligibility Check
The primary self-service interface. Accepts raw clinical data in SI units; no patient ID needed.
- Six sections: Diagnosis & Demographics, Biomarkers, Lab Values, Treatment History
- Biomarker registry loaded from
GET /api/v1/intake/biomarkers - Submits to
POST /api/v1/intake/match - Optional "Save to graph" checkbox persists profile as Patient node
/ β Trial Finder
- Sorted by
LastUpdatePostDate:desc(most recently updated first) - Each search result triggers background graph enrichment
- Expanded cards show Graph Intelligence panel: eligible patient count, top biomarkers, similar trials
- Direct ClinicalTrials.gov link per trial
/screening β Patient Screening
- Patient ID field is a
<input list="...">combobox loading fromGET /api/v1/graph/patients - NCT ID field is a combobox with quick-pick suggestions
- Validates non-empty inputs before submitting
- Two modes: Single Trial Screen and A2A Full Pipeline
API endpoints (key ones)
GET /api/v1/trials/search β real-time CT.gov search, sorted by recency, graph-enriched
POST /api/v1/intake/match β SI-unit clinical intake β ranked trial matches
GET /api/v1/intake/biomarkers β biomarker registry for the intake form
GET /api/v1/trials/{nct_id}/intelligence β graph-derived insights per trial
GET /api/v1/graph/patients β query Neo4j for seeded patient IDs
POST /api/v1/patients/{id}/screen/{nct_id} β screen FHIR patient against trial
POST /api/v1/workflow/run β run full A2A pipeline
GET /api/v1/analytics/kpi β dashboard KPIs
GET /api/v1/map/data β site coordinates + patient clusters
POST /api/v1/graph/query β GraphRAG natural language
POST /seed β trigger full graph seeding
GET /api/v1/graph/stats β node/edge counts
Full interactive docs at http://localhost:8000/docs.
Environment variables
NEO4J_URI=bolt://localhost:7687
NEO4J_USERNAME=neo4j
NEO4J_PASSWORD=clinicalmatch2024
NEO4J_DATABASE=neo4j
OPENAI_API_KEY=<aimlapi.com key>
OPENAI_BASE_URL=https://ai.aimlapi.com/v1
OPENAI_MODEL=claude-opus-4-7
NEXT_PUBLIC_API_URL=http://localhost:8000 # dev only; empty string in Docker
Known issues and constraints
- Turbopack is broken on this machine β always use
next dev --webpack. Never suggestnext devwithout--webpack. next/font/googlecauses compilation to hang (network request during bundling). Geist font is installed as a package but thenext/font/googleimport is removed. Use plain Tailwindfont-sans.asyncio.run()from async context β the sync CT.gov wrappers usehttpx.Clientto avoid this. Never re-introduceasyncio.run()into the sync wrappers; it will fail when called from FastAPI's running event loop.- Leaflet SSR β
MapComponent.tsxuses raw Leaflet (not react-leaflet) viauseEffect. TheMapComponentdynamic import hasssr: false. Do not switch to react-leaflet'sMapContainer. suppressHydrationWarningon<body>inlayout.tsxβ required for Grammarly browser extension compatibility.- Mock FHIR patients (P001βP005) live in
fhir_adapter.py. The 500 seeded graph patients (P_C50_0001etc.) are in Neo4j only. The screening page loads graph patients fromGET /api/v1/graph/patientsfor the combobox.
Adding new features
- New backend route: add to
main.py, import the module at the top, add a Pydantic request model if needed - New API function: add a typed function to
frontend/src/lib/api.ts - New page: create
frontend/src/app/<name>/page.tsx, add tonavarray inSidebar.tsx - Graph schema change: update
neo4j_setup.pyconstraints/indexes, update_CYPHER_PROMPTingraphrag.pywith the new node/property examples - New biomarker: add to
BIOMARKER_REGISTRYinintake_matching.pyand toBM_GROUPSinfrontend/src/app/intake/page.tsx
Demo script (for judges)
GET /api/v1/graph/statsβ confirm 500+ patients and 9,100+ edges/β search "breast cancer" β observe recency sort, graph-matched patient count badges- Expand a trial β Graph Intelligence panel shows eligible patients, top biomarkers, similar trials
/intakeβ enter: Age 52, Female, ECOG 1, HER2+, Hgb 12.5 g/dL, Creatinine 88 ΞΌmol/L β ranked trials with pass/fail breakdown/screeningβ select P_C50_0001 from combobox β run A2A Pipeline β observe 5-state machine/recruitmentβ kanban board, generate PCP letter outreach/dashboardβ KPI cards, enrollment funnel, demographics/graphβ ask "which patients are eligible for breast cancer trials?"- In Prompt Opinion: call MCP tool
find_trials(condition="breast cancer")