Update test.py
test.py
CHANGED
@@ -1,3 +1,29 @@
+
+You are an expert oncology clinical data curator. Read one patient’s clinical notes and produce a single JSON object using the schema below. Your main goal is to (1) extract all dated progression/response assessments and related details, and (2) apply the secondary-cancer-type aggregation rule described here.
+
+AGGREGATION RULE (30-day window, same evidence source)
+- Consider assessments that occur within 30 calendar days of each other AND share the same evidence_source (e.g., imaging, physician).
+- Collapse those assessments into one event and update fields as follows:
+• secondary_cancer_type → set to a **comma-separated list of unique values** observed across the collapsed assessments (leave "" if none are stated).
+• date_of_disease_progression_assessment → use the **latest date** among the collapsed assessments.
+• disease_progression_status → if multiple statuses appear, keep the most definitive using this priority:
+progression > complete response > partial response > stable disease > no evidence of disease > indeterminate.
+• treatment_change → if present in any collapsed assessment, keep the details tied to the **latest date**; if conflicting across notes, prefer the latest.
+• evidence_for_disease_progression_assessment / exact_evidence_span_for_disease_progression_assessment → use the clearest/most definitive phrasing from the latest assessment.
+
+DATE & VALUE RULES
+- Use ISO-8601 where available: YYYY-MM-DD; if only month is known use YYYY-MM; if only year is known use YYYY.
+- If a note says “today” and the note date is known, resolve it to that date; otherwise leave the date field as "".
+- Keep drug/regimen names verbatim.
+- Do not guess. If a field is not stated, set it to "".
+
+OUTPUT
+- Return **only** valid JSON (no prose, no Markdown fences).
+- Keep field order and names exactly as in the schema.
+- All string fields must be strings; if unknown, use "".
+
+
+
 import pandas as pd
 import re
 
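The aggregation rule in the added prompt text can be checked against model output with a small reference implementation. The following is a minimal sketch, not code from this commit. It assumes each assessment is a dict keyed by the field names used in the prompt, that every date is a full YYYY-MM-DD string (partial dates are not handled), that "within 30 calendar days of each other" means within 30 days of the earliest assessment in a group, and that the list separator is a comma followed by a space. The function names `collapse_assessments` and `_merge` are hypothetical.

```python
from datetime import date

# Priority order copied from the prompt: earlier entries are more definitive.
STATUS_PRIORITY = [
    "progression",
    "complete response",
    "partial response",
    "stable disease",
    "no evidence of disease",
    "indeterminate",
]
DATE_KEY = "date_of_disease_progression_assessment"


def _status_rank(status):
    # Statuses outside the known list sort last instead of raising an error.
    if status in STATUS_PRIORITY:
        return STATUS_PRIORITY.index(status)
    return len(STATUS_PRIORITY)


def _merge(group):
    """Merge one group of assessments (sorted by date, oldest first)."""
    # Start from the latest assessment, so the date and the evidence fields
    # are taken from it.
    merged = dict(group[-1])

    # Unique secondary cancer types, in first-seen order.
    seen = []
    for item in group:
        value = item.get("secondary_cancer_type", "")
        if value and value not in seen:
            seen.append(value)
    merged["secondary_cancer_type"] = ", ".join(seen)

    # Most definitive status according to STATUS_PRIORITY.
    statuses = [
        item["disease_progression_status"]
        for item in group
        if item.get("disease_progression_status")
    ]
    merged["disease_progression_status"] = min(
        statuses, key=_status_rank, default=""
    )

    # Treatment change from the latest assessment that states one.
    merged["treatment_change"] = next(
        (
            item["treatment_change"]
            for item in reversed(group)
            if item.get("treatment_change")
        ),
        "",
    )
    return merged


def collapse_assessments(assessments, window_days=30):
    """Group by evidence_source within the window, then merge each group."""
    ordered = sorted(
        assessments, key=lambda a: (a["evidence_source"], a[DATE_KEY])
    )
    groups = []
    for item in ordered:
        if groups:
            first = groups[-1][0]
            gap = (
                date.fromisoformat(item[DATE_KEY])
                - date.fromisoformat(first[DATE_KEY])
            ).days
            same_source = first["evidence_source"] == item["evidence_source"]
            if same_source and gap <= window_days:
                groups[-1].append(item)
                continue
        groups.append([item])
    return [_merge(group) for group in groups]
```

Anchoring the window on the earliest assessment in a group is one reading of the rule. A chained reading, where each assessment only needs to be within 30 days of the previous one, would produce longer groups, so the prompt may need to state which is intended.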