Commit · b40cc1f
Parent(s): 89f66fe
Fix: Neo4j 5.26.0 (APOC available) + correct graphrag schema from seeder
- Dockerfile +1 -1
- backend/graphrag.py +29 -13
Dockerfile
CHANGED
@@ -44,7 +44,7 @@ RUN curl -fsSL https://deb.nodesource.com/setup_20.x | bash - \
 && rm -rf /var/lib/apt/lists/*
 
 # ── Neo4j Community 2026.04.0 ─────────────────────────────────────────────────
-ENV NEO4J_VERSION=
+ENV NEO4J_VERSION=5.26.0
 ENV NEO4J_HOME=/opt/neo4j
 ENV PATH="${NEO4J_HOME}/bin:${PATH}"
 
backend/graphrag.py
CHANGED
@@ -57,25 +57,41 @@ _CYPHER_GENERATION_TEMPLATE = """You are an expert Neo4j Cypher query writer for
 Schema:
 {schema}
 
-Node
-- Patient: id (e.g. "
-
-
-
-
-
+Node labels and their exact property names:
+- Patient: id (e.g. "P_C50_000001"), name, age (integer), sex ("MALE"/"FEMALE"), ecog (integer 0-3),
+condition (lowercase, e.g. "breast cancer"), stage ("I"/"II"/"III"/"IV"),
+city, state, ethnicity, insurance, icd10_prefix,
+biomarkers (list of biomarker ids), medications (list of drug names),
+comorbidities (list), prior_chemo (boolean), prior_radiation (boolean),
+prior_surgery (boolean), prior_lines_of_therapy (integer), source
+- Trial: id (NCT id, e.g. "NCT04567890"), title, condition (lowercase), phase, status,
+brief_summary, eligibility_criteria, min_age, max_age, sex, enrollment,
+start_date, completion_date, sponsor, location_count, source
+- Diagnosis: code (ICD-10, e.g. "C50.919"), name (e.g. "Malignant neoplasm of breast"), source
+- Biomarker: id (e.g. "HER2_POS"), name (e.g. "HER2 Positive"), gene (e.g. "ERBB2"), loinc, source
+- Medication: rxcui, name, tty, generic_name, source
+- StudySite: facility, city, state, country, lat, lon, source
+- ConditionNode: name (e.g. "breast cancer")
+- Publication: pmid, title, journal, pub_date, authors, source
 
 Relationships:
-- (Patient)-[:ELIGIBLE_FOR {{score: float}}]->(Trial)
+- (Patient)-[:ELIGIBLE_FOR {{score: float, matched_at: datetime}}]->(Trial)
 - (Patient)-[:HAS_DIAGNOSIS]->(Diagnosis)
 - (Patient)-[:HAS_BIOMARKER]->(Biomarker)
-- (
-- (
+- (Trial)-[:CONDUCTED_AT]->(StudySite)
+- (ConditionNode)-[:HAS_TRIAL]->(Trial)
+- (Diagnosis)-[:MAPS_TO_CONDITION]->(ConditionNode)
+- (Biomarker)-[:RELEVANT_TO]->(ConditionNode)
+- (Biomarker)-[:MAY_QUALIFY_FOR]->(Trial)
+- (Publication)-[:SUPPORTS_RESEARCH_ON]->(ConditionNode)
 
 Rules:
--
--
--
+- Biomarker lookups use the `id` property: `{{id: 'HER2_POS'}}`
+- Diagnosis lookups use `code` (not `id`): `{{code: 'C50.919'}}`
+- Medication lookups use `rxcui` or `name` (not `id`)
+- Condition lookups on Trial nodes use lowercase: `t.condition = 'breast cancer'`
+- Patient-to-trial eligibility: `(p:Patient)-[:ELIGIBLE_FOR]->(t:Trial)`
+- ecog property on Patient is `ecog` (integer), NOT `ecog_score`
 - Limit results to 25 unless asked for more
 
 Question: {question}
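The sketch below illustrates two points in the updated prompt. It is not code from backend/graphrag.py. First, the doubled braces in `{{score: float, matched_at: datetime}}` come out as literal braces when the template is formatted, assuming it is rendered with `str.format`-style substitution. Second, it shows the kind of query the new rules steer the model toward. `TEMPLATE_EXCERPT`, `eligible_trials_query`, and the `max_ecog` parameter are hypothetical names introduced only for this example.

```python
# Illustrative sketch only; none of these names come from backend/graphrag.py.

# 1) Doubled braces become literal braces once the template is formatted.
#    Assumption: the template is rendered with str.format-style substitution.
TEMPLATE_EXCERPT = (
    "Relationships:\n"
    "- (Patient)-[:ELIGIBLE_FOR {{score: float, matched_at: datetime}}]->(Trial)\n"
    "\n"
    "Question: {question}"
)
rendered = TEMPLATE_EXCERPT.format(
    question="Which trials fit HER2-positive patients?"
)


# 2) A hypothetical query of the kind the rules ask the model to produce.
def eligible_trials_query(limit: int = 25) -> str:
    """Return parameterized Cypher using the property names listed in the prompt."""
    return (
        # Biomarker lookups use `id`; Patient uses `ecog`, not `ecog_score`.
        "MATCH (p:Patient)-[:HAS_BIOMARKER]->(:Biomarker {id: $biomarker_id})\n"
        "MATCH (p)-[e:ELIGIBLE_FOR]->(t:Trial)\n"
        # Trial.condition is stored in lowercase, so pass a lowercase value.
        "WHERE t.condition = $condition AND p.ecog <= $max_ecog\n"
        "RETURN p.id AS patient, t.id AS trial, e.score AS score\n"
        "ORDER BY score DESC\n"
        # Default cap of 25 rows, per the final rule in the prompt.
        f"LIMIT {int(limit)}"
    )


# "HER2_POS" and "breast cancer" are the prompt's own example values;
# the max_ecog value is arbitrary.
params = {"biomarker_id": "HER2_POS", "condition": "breast cancer", "max_ecog": 1}
query = eligible_trials_query()

if __name__ == "__main__":
    print(rendered)
    print(query)
```

Passing values through `params` rather than interpolating them into the query string keeps user input out of the Cypher text. Only the row limit is formatted in, and it is coerced to an integer first.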