{
"cells": [
{
"cell_type": "code",
"execution_count": null,
"id": "e1eea264",
"metadata": {},
"outputs": [],
"source": [
"def training_prompt(medical_text, subclaims):\n",
" system_prompt = f\"\"\"\n",
"You are an expert medical annotator. Your task is to extract granular, factual subclaims from medical text.\n",
"A subclaim is the smallest standalone factual unit that can be independently verified.\n",
"\n",
"Instructions:\n",
"1. Read the provided medical text.\n",
"2. Break it into clear, objective subclaims.\n",
"3. Each subclaim must be directly derived from the text.\n",
"4. Do not add, guess, infer, or combine multiple facts.\n",
"5. Each subclaim should be short, specific, and verifiable.\n",
"\n",
"Medical Text:\n",
"{medical_text}\n",
"\"\"\"\n",
"\n",
" conversation = {}\n",
" conversation['conversations'] = (\n",
" {'from': \"user\", 'content': system_prompt},\n",
" {'from': \"assistant\", 'content': str(subclaims)},\n",
" )\n",
" return conversation\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "72fbae33",
"metadata": {},
"outputs": [],
"source": [
"import json\n",
"\n",
"# Convert the subclaim-extraction dataset into conversation-format training data.\n",
"with open('/home/mshahidul/readctrl/data/finetuning_data/finetune_dataset_extract-subclaim.json', 'r') as f:\n",
"    data = json.load(f)\n",
"prompts = []\n",
"for item in data:\n",
"    medical_text = item['medical_text']\n",
"    subclaims = item['subclaims']\n",
"    prompt = training_prompt(medical_text, subclaims)\n",
"    prompts.append(prompt)\n",
"with open('/home/mshahidul/readctrl/data/finetuning_data/finetune_dataset_extract-subclaim_conversation.json', 'w') as f:\n",
"    json.dump(prompts, f, indent=2)"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "118e5fce",
"metadata": {},
"outputs": [],
"source": [
"# Print one completeness_reasoning_v3.py command per JSON file in the data directory, e.g.:\n",
"# python /home/mshahidul/readctrl/code/finetune-inference/completeness_reasoning_v3.py --data_path /home/mshahidul/readctrl/data/concise_complete_attr_cal_v3/evaluated_metrics_0_100.json\n",
"import os\n",
"for x in os.listdir('/home/mshahidul/readctrl/data/concise_complete_attr_cal_v3/'):\n",
"    if x.endswith('.json'):\n",
"        dat = f'python /home/mshahidul/readctrl/code/finetune-inference/completeness_reasoning_v3.py --data_path /home/mshahidul/readctrl/data/concise_complete_attr_cal_v3/{x}'\n",
"        print(dat)\n",
"        print('\\n')"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "cb108a11",
"metadata": {},
"outputs": [],
"source": [
"import zipfile\n",
"\n",
"# /home/mshahidul/readctrl/data/testing_data/multiclinsum_test_es.zip\n",
"with zipfile.ZipFile('/home/mshahidul/readctrl/data/testing_data/multiclinsum_test_es.zip', 'r') as zip_ref:\n",
" zip_ref.extractall('/home/mshahidul/readctrl/data/testing_data/es_data/')\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "6ea249db",
"metadata": {},
"outputs": [],
"source": [
"def training_prompt(text, subclaim, label):\n",
" system_prompt = f\"\"\"\n",
"You are a medical evidence evaluator.\n",
"\n",
"Your task is to determine the relationship between a medical text and a subclaim.\n",
"\n",
"Definitions:\n",
"- 1 = supported (the text directly supports the subclaim)\n",
"- 0 = refuted (the text contradicts the subclaim)\n",
"- 2 = not_supported (the text is related but provides no evidence for the subclaim)\n",
"\n",
"Medical Text:\n",
"{text}\n",
"\n",
"Subclaim:\n",
"{subclaim}\n",
"\n",
"Respond ONLY with a single number: 1, 0, or 2.\n",
"\"\"\"\n",
"\n",
" conversation = {}\n",
" conversation['conversations'] = (\n",
" {'from': \"user\", 'content': system_prompt},\n",
" {'from': \"assistant\", 'content': str(label)},\n",
" )\n",
" return conversation\n",
"\n",
"import json\n",
"\n",
"# Convert the subclaim-support dataset into conversation-format training data.\n",
"with open('/home/mshahidul/readctrl/data/finetuning_data/processed_subclaim_support_data.json', 'r') as f:\n",
"    data = json.load(f)\n",
"prompts = []\n",
"for item in data:\n",
" text = item['text']\n",
" subclaim = item['subclaim']\n",
" label = item['label']\n",
" prompt = training_prompt(text, subclaim, label)\n",
" prompts.append(prompt)\n",
"with open('/home/mshahidul/readctrl/data/finetuning_data/processed_subclaim_support_data_conversation.json', 'w') as f:\n",
" json.dump(prompts, f, indent=2)"
]
},
{
"cell_type": "markdown",
"id": "fcc9cec9",
"metadata": {},
"source": [
"## classifier design for readability test"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "6a5690f1",
"metadata": {},
"outputs": [],
"source": [
"def readability_training_prompt_with_human(full_text, generated_text, human_score):\n",
" \"\"\"\n",
" Modified training prompt: Evaluates readability by comparing \n",
" generated text against the original source (Full Text) only.\n",
" \"\"\"\n",
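"    # NOTE: the Output Format section below interpolates human_score, so the\n",
"    # target label also appears in the user prompt, not only in the assistant turn.\n",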
" system_prompt = f\"\"\"You are a medical readability evaluator.\n",
"\n",
"### Task\n",
"Compare the \"GENERATED TEXT\" against the \"FULL TEXT\" to determine its readability for a general, non-medical audience.\n",
"\n",
"### Input Data\n",
"- **FULL TEXT:** {full_text}\n",
"- **GENERATED TEXT (Evaluate this):** {generated_text}\n",
"\n",
"### Readability Scale\n",
"1: Very Easy - Minimal medical language, uses simple terms.\n",
"2: Easy - Accessible to most, minor jargon explained.\n",
"3: Medium - Some technical terms, moderate complexity.\n",
"4: Hard - Clinical tone, assumes some prior knowledge.\n",
"5: Very Hard - Extremely technical, requires medical expertise.\n",
"\n",
"### Constraints\n",
"- Evaluate ONLY the \"GENERATED TEXT\".\n",
"- Use \"FULL TEXT\" only for context of the subject matter.\n",
"- Do NOT assess factual accuracy.\n",
"\n",
"### Output Format\n",
"Return ONLY the following JSON object:\n",
"{{\n",
" \"readability_score\": {human_score}\n",
"}}\"\"\"\n",
"\n",
" # Structured for standard SFT (Supervised Fine-Tuning) formats\n",
" conversation = {\n",
" \"conversations\": [\n",
" {\"role\": \"user\", \"content\": system_prompt},\n",
" {\"role\": \"assistant\", \"content\": f\"{{\\\"readability_score\\\": {human_score}}}\"}\n",
" ]\n",
" }\n",
" \n",
" return conversation"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "63b469ef",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"dict_keys(['low_health_literacy', 'intermediate_health_literacy', 'proficient_health_literacy'])\n"
]
}
],
"source": [
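"import json\n",
"\n",
"# Inspect the per-label texts of the synthetic dataset.\n",
"# NOTE: the later loop that builds full_data reads doc_id, health_literacy_label and doc_rating from anno_data,\n",
"# i.e. it expects annotation records such as\n",
"# /home/mshahidul/readctrl/data/annotators_validate_data/Sharmin Sultana_2025-12-31_14-19-30/annotation_results.json\n",
"with open('/home/mshahidul/readctrl/data/synthetic_dataset_diff_labels/syn_data_diff_labels_en_v1.json', 'r') as f:\n",
"    anno_data = json.load(f)\n",
"print(anno_data[0]['diff_label_texts'].keys())"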
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "ea10b2cb",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Merge Complete.\n",
"Original keys preserved: ['index', 'fulltext', 'diff_label_texts', 'summary']\n",
"Sample 'diff_label_texts' keys check: dict_keys(['low_health_literacy', 'intermediate_health_literacy', 'proficient_health_literacy'])\n"
]
}
],
"source": [
"import json\n",
"import pandas as pd\n",
"\n",
"# Define file paths\n",
"gs_path = '/home/mshahidul/readctrl/data/testing_data_gs/multiclinsum_gs_train_en.json'\n",
"syn_path = '/home/mshahidul/readctrl/data/synthetic_dataset_diff_labels/syn_data_diff_labels_en_v1.json'\n",
"output_path = '/home/mshahidul/readctrl/data/synthetic_dataset_diff_labels/syn_data_with_gs_summary_en.json'\n",
"\n",
"# 1. Load Ground Truth Data\n",
"with open(gs_path, 'r', encoding='utf-8') as f:\n",
" gs_data = json.load(f)\n",
"\n",
"# 2. Load Synthetic Data (Preserving all keys: index, fulltext, diff_label_texts)\n",
"with open(syn_path, 'r', encoding='utf-8') as f:\n",
" syn_data = json.load(f)\n",
"\n",
"# Convert to DataFrames\n",
"# We only need 'fulltext' and 'summary' from the GS file for the mapping\n",
"df_gs = pd.DataFrame(gs_data)[['fulltext', 'summary']]\n",
"df_gs = df_gs.drop_duplicates(subset=['fulltext'])\n",
"\n",
"# Create the Synthetic DataFrame (contains index, fulltext, diff_label_texts)\n",
"df_syn = pd.DataFrame(syn_data)\n",
"\n",
"# 3. Perform Left Join\n",
"# This keeps every column in df_syn and adds 'summary' where fulltext matches\n",
"merged_df = pd.merge(df_syn, df_gs, on='fulltext', how='left')\n",
"\n",
"# 4. Save and Verify\n",
"merged_data = merged_df.to_dict(orient='records')\n",
"\n",
"with open(output_path, 'w', encoding='utf-8') as f:\n",
" json.dump(merged_data, f, indent=4, ensure_ascii=False)\n",
"\n",
"print(\"Merge Complete.\")\n",
"print(f\"Original keys preserved: {list(merged_df.columns)}\")\n",
"print(f\"Sample 'diff_label_texts' keys check: {merged_df.iloc[0]['diff_label_texts'].keys()}\")"
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "1b3c848f",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"dict_keys(['id', 'fulltext', 'summary'])\n"
]
}
],
"source": [
"import json\n",
"\n",
"# Map each gold-standard full text to its summary.\n",
"with open('/home/mshahidul/readctrl/data/testing_data_gs/multiclinsum_gs_train_en.json', 'r') as f:\n",
"    _data = json.load(f)\n",
"print(_data[0].keys())\n",
"a_dict = {item['fulltext']: item['summary'] for item in _data}"
]
},
{
"cell_type": "code",
"execution_count": 8,
"id": "bb68d61b",
"metadata": {},
"outputs": [],
"source": [
"import json\n",
"\n",
"# Index the synthetic texts as data[index][label] -> {'fulltext', 'generated_text'}.\n",
"with open('/home/mshahidul/readctrl/data/synthetic_dataset_diff_labels/syn_data_diff_labels_en_v1.json', 'r') as f:\n",
"    gen_data = json.load(f)\n",
"data = {}\n",
"for item in gen_data:\n",
"    for label, generated_text in item['diff_label_texts'].items():\n",
"        data.setdefault(item['index'], {})[label] = {\n",
"            'fulltext': item['fulltext'],\n",
"            # 'gold_summary': a_dict[item['fulltext']],\n",
"            'generated_text': generated_text\n",
"        }\n"
]
},
{
"cell_type": "code",
"execution_count": 9,
"id": "7fd3115c",
"metadata": {},
"outputs": [],
"source": [
"import math\n",
"\n",
"def convert_score(score: int) -> int:\n",
"    \"\"\"Map a 1-10 rating onto the 1-5 readability scale (1-2 -> 1, ..., 9-10 -> 5).\"\"\"\n",
"    if not 1 <= score <= 10:\n",
"        raise ValueError(\"Score must be between 1 and 10\")\n",
"    return math.ceil(score / 2)\n"
]
},
{
"cell_type": "code",
"execution_count": 11,
"id": "36ebb028",
"metadata": {},
"outputs": [],
"source": [
"full_data=[]\n",
"for item in anno_data:\n",
" label=item['health_literacy_label']\n",
" full_text = data[item['doc_id']][label]['fulltext']\n",
" # gold_summary = data[item['doc_id']][label]['gold_summary']\n",
" generated_text = data[item['doc_id']][label]['generated_text']\n",
" human_score = convert_score(item['doc_rating'])\n",
" res=readability_training_prompt_with_human(full_text,generated_text,human_score)\n",
" full_data.append(res)"
]
},
{
"cell_type": "code",
"execution_count": 12,
"id": "8b8df130",
"metadata": {},
"outputs": [],
"source": [
"with open(\"/home/mshahidul/readctrl/data/finetuning_data/classifier_en_data.json\", \"w\") as f:\n",
" json.dump(full_data, f, indent=4)"
]
},
{
"cell_type": "code",
"execution_count": 13,
"id": "3dfb6a3c",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"{'conversations': [{'role': 'user',\n",
" 'content': 'You are a medical readability evaluator.\\n\\n### Task\\nCompare the \"GENERATED TEXT\" against the \"FULL TEXT\" to determine its readability for a general, non-medical audience.\\n\\n### Input Data\\n- **FULL TEXT:** The patient was a 59-year-old Japanese man with a 28-year history of type 1 diabetes. He visited our hospital monthly for management of diabetes with intensive therapy employing multiple-dose insulin injections. His height and body weight were 168 cm and 52 kg (body mass index: 18.4 kg/m2), respectively. He showed depleted insulin secretion (serum C-peptide level was below the limit of detection), such that his blood glucose levels fluctuated severely, and his hemoglobin A1c (HbA1c) level was around 9.0% despite intensive insulin therapy. He had been diagnosed with asymptomatic chronic severe (grade III) aortic regurgitation (AR) 16 years before the current presentation but had declined follow-up for the AR. He had never undergone surgery nor the implantation of any prosthetic devices.\\n\\nEight days after his regular hospital visit, he visited an emergency clinic complaining of breathing difficulty and had a fever above 38℃. Until that day, he had not noticed any fever, chills, weakness, or any other symptoms. His blood pressure and pulse rate were 192/82 mmHg and 118/min, respectively. He showed orthopnea, and his oxygen saturation (SpO2) was 80%. He was transported to the emergency department of our hospital. A physical examination revealed a Levine 3/6 systolic murmur, although his cardiac murmur had not been checked at regular hospital visits. No physical findings suggesting IE, such as Osler nodes, Janeway lesions, or conjunctival petechiae, were recognized. His white blood cell (WBC) count was markedly increased to 20,800 /μL, and his C-reactive protein (CRP) was elevated to 6.06 mg/dL. Serum creatine phosphokinase MB was within the normal range, at 6.0 IU/L, and troponin T was negative. Chest X-ray showed pulmonary congestion with cardiac enlargement (cardiothoracic ratio: 55%). Electrocardiography revealed ST elevation on V1-V4, but emergency echocardiography showed no dysfunction of cardiac contractility. He was diagnosed with acute heart failure due to valvular disease, and treatment with non-invasive positive pressure ventilation and nitrates was initiated.\\n\\nAfter hospital admission, a detailed examination by transthoracic echocardiography showed severe aortic regurgitation, severe mitral regurgitation, and a mobile vegetation on the mitral valve. Transesophageal echocardiography revealed a 16.5×6-mm mobile vegetation on the anterior leaflet of the mitral valve and an 11.2×5-mm nonmobile vegetation on the noncoronary cusp of the aortic valve. These findings raised strong suspicion of NVE. In this case, head computed tomography (CT) and magnetic resonance imaging revealed no cerebral infarction or hemorrhaging, although a mobile vegetation was detected.\\n\\nOn reviewing the clinical course until hospitalization, we noted that at the visit four months before admission, his WBC count had been slightly elevated. The following month, his albumin (Alb) level decreased to 3.0 g/dL, and his hemoglobin (Hb) level had shown a gradual decline over the 2 months prior to admission. During this period, he had experienced a 4-kg weight loss. Esophagogastroduodenoscopy and whole-body CT were performed, but no abnormalities were detected. One month later, he had regained some weight, and the laboratory findings had nearly normalized, except for a slightly elevated CRP level (0.54 mg/dL). At the last visit (8 days before admission), his WBC count had again risen to 9,300 /μL, while his Hb and Alb levels had again decreased to 13.1 g/dL and 3.0 g/dL, respectively. Furthermore, his CRP level had increased to 4.18 mg/dL. At that time, his diastolic blood pressure has shown an obvious decrease. Thus far, he had not experienced a fever or any symptoms other than weight loss. We suspected diseases of infectious and/or malignant origin and initiated comprehensive examinations to identify the source of his clinical findings.\\n\\nAfter heart failure treatment had been started, his clinical symptoms showed rapid improvement, and his hemodynamic stability was maintained during the first six hours. He initially received empirical intravenous antibiotic therapy consisting of 12 g/day of ampicillin sulbactam (ABPC/S) and 120 mg/day of gentamycin (GM). Three blood culture sets were obtained on the admission, and all were positive for S. warneri [minimum inhibitory concentration (MIC) to ABPC/S ≤8 μg/mL; MIC to GM ≤1 μg/mL; MIC to cefazolin (CEZ) ≤2 μg/mL]. Thus, IE caused by this organism was diagnosed.\\n\\nAccording to the clinical guideline established by the Japanese Circulation Society, emergency surgery is generally recommended for heart failure of NYHA III to IV or urgent surgery for NVE mobile vegetation exceeding 10 mm and severe valve dysfunction. In this case, however, his heart failure was successfully improved. Based on the guideline, the risk of embolism was considered to have been reduced by the administration of appropriate antibiotic therapy. In addition, the patient had type 1 diabetes, and his glycemic control was so poor that we were concerned that double-valve surgery would be a high-risk procedure. Therefore, we planned elective surgery after sufficient control of both infection and diabetes.\\n\\nBased on the blood culture results, the antibiotic regimen was switched to 6 g/day of CEZ. A detailed dental examination revealed no abnormalities, such as periodontitis. After four weeks of antibiotic therapy, he underwent surgical therapy. His aortic valve was found to be bicuspid, and the aortic and mitral annuli were intact without abscess formation. Large vegetations were exenterated, and the mitral and aortic valves were both replaced with mechanical valves. He experienced no postoperative complications and was discharged on the 22nd day after the operation without apparent embolism. He has not had any recurrence in over two years since the operation.\\n- **GENERATED TEXT (Evaluate this):** A 59-year-old Japanese man with a 28-year history of type 1 diabetes on intensive multiple-dose insulin therapy (BMI 18.4 kg/m2, undetectable C‑peptide, HbA1c ~9.0%) and remote, asymptomatic chronic severe (grade III) aortic regurgitation (diagnosed 16 years earlier without subsequent follow‑up) presented with acute decompensated heart failure. He had never undergone surgery or prosthetic device implantation and had no history of immunosuppressive therapies.\\n\\nEight days after a routine visit, he developed dyspnea and fever >38℃. On arrival: BP 192/82 mmHg, HR 118/min, orthopnea, SpO2 80%. Exam: Levine 3/6 systolic murmur; no Osler nodes, Janeway lesions, or conjunctival petechiae. Labs: WBC 20,800/μL, CRP 6.06 mg/dL, CK‑MB 6.0 IU/L, troponin T negative. CXR showed pulmonary congestion with cardiomegaly (CTR 55%). ECG had ST elevation in V1–V4, but emergent echocardiography showed no systolic dysfunction. He was diagnosed with acute heart failure due to valvular disease and treated with non‑invasive positive pressure ventilation and nitrates.\\n\\nTransthoracic echocardiography demonstrated severe aortic regurgitation and severe mitral regurgitation with a mobile mitral vegetation. Transesophageal echocardiography identified a 16.5×6‑mm mobile vegetation on the anterior leaflet of the mitral valve and an 11.2×5‑mm nonmobile vegetation on the noncoronary cusp of the aortic valve, raising strong suspicion for native valve endocarditis (NVE). Head CT and MRI showed no cerebral infarction or hemorrhage.\\n\\nRetrospective review revealed subtle abnormalities starting four months pre‑admission: mildly elevated WBC, albumin decreased to 3.0 g/dL the following month, and gradual hemoglobin decline over two months, with a 4‑kg weight loss. EGD and whole‑body CT were unrevealing. He partially regained weight and labs nearly normalized except for a CRP of 0.54 mg/dL. At the last pre‑admission visit (8 days prior), WBC was 9,300/μL, Hb 13.1 g/dL, Alb 3.0 g/dL, CRP 4.18 mg/dL, and diastolic BP had fallen; he remained afebrile and asymptomatic aside from weight loss.\\n\\nEmpiric antibiotics were initiated with ampicillin–sulbactam 12 g/day plus gentamicin 120 mg/day. Three admission blood culture sets all grew Staphylococcus warneri, a coagulase‑negative staphylococcus (CoNS) and resident skin flora (MICs: ABPC/S ≤8 μg/mL; GM ≤1 μg/mL; CEZ ≤2 μg/mL), confirming S. warneri IE. Per Japanese Circulation Society guidance, emergency surgery is generally recommended for NYHA III–IV heart failure or urgent surgery for NVE with mobile vegetation >10 mm and severe valve dysfunction. Because heart failure improved rapidly and appropriate antibiotics were started (reducing embolic risk), and given poorly controlled type 1 diabetes increasing operative risk, elective surgery was planned after stabilization of infection and glycemia. Antibiotics were narrowed to cefazolin 6 g/day; dental evaluation showed no periodontitis.\\n\\nAfter four weeks of antibiotics, surgery revealed a bicuspid aortic valve with intact aortic and mitral annuli and no abscess. Large vegetations were exenterated, and both valves were replaced with mechanical prostheses. The postoperative course was uneventful; he was discharged on postoperative day 22 without apparent embolism and has remained recurrence‑free for over two years. This case represents NVE due to the resident CoNS S. warneri in a patient without prosthetic material or immunosuppression, with prodromal laboratory abnormalities and weight loss evident up to four months before presentation.\\n\\n### Readability Scale\\n1: Very Easy - Minimal medical language, uses simple terms.\\n2: Easy - Accessible to most, minor jargon explained.\\n3: Medium - Some technical terms, moderate complexity.\\n4: Hard - Clinical tone, assumes some prior knowledge.\\n5: Very Hard - Extremely technical, requires medical expertise.\\n\\n### Constraints\\n- Evaluate ONLY the \"GENERATED TEXT\".\\n- Use \"FULL TEXT\" only for context of the subject matter.\\n- Do NOT assess factual accuracy.\\n\\n### Output Format\\nReturn ONLY the following JSON object:\\n{\\n \"readability_score\": 5\\n}'},\n",
" {'role': 'assistant', 'content': '{\"readability_score\": 5}'}]}"
]
},
"execution_count": 13,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"full_data[0]"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "6dfc340b",
"metadata": {},
"outputs": [],
"source": [
"# Keys seen in the annotation records (pasted dict_keys output, kept as a comment so the cell runs):\n",
"# ['queue_position', 'doc_id', 'health_literacy_label', 'wiki_id', 'doc_snippet', 'wiki_snippet', 'doc_rating', 'wiki_rating', 'is_duplicate', 'timestamp']"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "un",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.14"
}
},
"nbformat": 4,
"nbformat_minor": 5
}