Leveraging Large Language Models to Predict Unplanned ICU Readmissions from Electronic Health Records
This repository contains the fine-tuned Large Language Models (LLMs) developed for predicting unplanned Intensive Care Unit (ICU) readmissions from Electronic Health Records (EHRs).
The models were introduced in the paper:
Leveraging Large Language Models to Predict Unplanned ICU Readmissions from Electronic Health Records
Hoda Helmy, Ahmed Ibrahim, Maryam Arabi, Aamenah Sattar, and Ahmed Serag
Natural Language Processing Journal (2025)
Paper: https://doi.org/10.1016/j.nlp.2025.100182
Abstract
Unplanned ICU readmissions are associated with increased mortality, prolonged hospital stays, and higher healthcare costs. Early identification of high-risk patients can support clinical decision-making and improve patient outcomes.
This work investigates the use of Large Language Models (LLMs) for ICU readmission prediction by transforming structured Electronic Health Records (EHRs) into a textual representation suitable for language model processing. We evaluate both explicit classification and implicit text-generation approaches using fine-tuned Gemma 2B and Apollo 2B models.
The proposed framework not only predicts readmission risk but also generates interpretable clinical explanations that can assist healthcare professionals in understanding the reasoning behind model predictions.
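The transformation of structured EHR fields into text, mentioned above, can be illustrated with a minimal sketch. The helper `record_to_prompt`, the field names (including `Lactate`), and the template wording are all hypothetical and chosen for illustration; they are not the serialization format used in the paper.

```python
from typing import Mapping


def record_to_prompt(record: Mapping[str, object]) -> str:
    """Serialize one structured record into a plain-text prompt.

    Hypothetical helper: the field names and template are illustrative,
    not the format used to train the released models.
    """
    lines = ["Patient Summary:"]
    for field, value in record.items():
        if value is None:
            continue  # skip missing measurements instead of printing "None"
        if isinstance(value, bool):
            value = "Yes" if value else "No"  # render flags as Yes/No
        lines.append(f"{field}: {value}")
    lines.append("Predict ICU readmission risk.")
    return "\n".join(lines)


# Illustrative record; "Lactate" is missing and is therefore omitted.
example = {
    "Age": 72,
    "Gender": "Male",
    "Heart Rate": "118 bpm",
    "Lactate": None,
    "Mechanical Ventilation": True,
}
prompt_text = record_to_prompt(example)
print(prompt_text)
```

Skipping missing values keeps the prompt short; whether to omit them or mark them explicitly is a design choice that should match how the model was fine-tuned.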
Dataset
This work utilizes the publicly available eICU Collaborative Research Database (eICU-CRD).
Dataset Access:
https://eicu-crd.mit.edu/gettingstarted/access/
Please ensure compliance with all data usage agreements before accessing or using the dataset.
Loading the Model
PEFT / LoRA Models
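from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftConfig, PeftModel

peft_model_id = "serag-ai/ICU-APOLLO-EXP"

# The adapter config records which base model the LoRA weights were trained on.
config = PeftConfig.from_pretrained(peft_model_id)

# Load the base model, then attach the LoRA adapter weights to it.
model = AutoModelForCausalLM.from_pretrained(config.base_model_name_or_path)
model = PeftModel.from_pretrained(model, peft_model_id)
model.eval()  # inference mode (disables dropout)

# The tokenizer comes from the base model.
tokenizer = AutoTokenizer.from_pretrained(config.base_model_name_or_path)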
Example Usage
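import torch

prompt = """
Patient Summary:
Age: 72
Gender: Male
Heart Rate: 118 bpm
White Blood Cell Count: Elevated
Mechanical Ventilation: Yes
Predict ICU readmission risk.
"""

# Tokenize and move the tensors to the same device as the model.
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Gradients are not needed for inference.
with torch.no_grad():
    outputs = model.generate(**inputs, max_new_tokens=128)

# The decoded text contains the prompt followed by the generated continuation.
print(tokenizer.decode(outputs[0], skip_special_tokens=True))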
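For a causal language model, `generate` returns the prompt tokens followed by the newly generated ones, so the printed text above repeats the prompt. The sketch below shows one way to keep only the continuation. The helper `new_token_ids` is a hypothetical name introduced here for illustration and is not part of the repository or of the `transformers` API.

```python
from typing import List, Sequence


def new_token_ids(output_ids: Sequence[int], prompt_length: int) -> List[int]:
    """Return only the token ids generated after the prompt.

    Hypothetical helper: assumes output_ids is the prompt ids followed
    by the continuation, as a causal LM's generate() returns them.
    """
    if prompt_length < 0 or prompt_length > len(output_ids):
        raise ValueError("prompt_length must be between 0 and len(output_ids)")
    return list(output_ids[prompt_length:])


# Usage with the objects from the example above (not executed here):
# prompt_length = inputs["input_ids"].shape[1]
# answer_ids = new_token_ids(outputs[0].tolist(), prompt_length)
# print(tokenizer.decode(answer_ids, skip_special_tokens=True))
```

Slicing by token count is generally more robust than removing the prompt string from the decoded text, because decoding does not always reproduce the prompt character for character.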
Citation
If you use this work in your research, please cite:
@article{HELMY2025100182,
  title = {Leveraging large language models to predict unplanned ICU readmissions from electronic health records},
  journal = {Natural Language Processing Journal},
  volume = {13},
  pages = {100182},
  year = {2025},
  issn = {2949-7191},
  doi = {10.1016/j.nlp.2025.100182},
  url = {https://www.sciencedirect.com/science/article/pii/S2949719125000585},
  author = {Hoda Helmy and Ahmed Ibrahim and Maryam Arabi and Aamenah Sattar and Ahmed Serag}
}
Base model: google/gemma-2b-it