---
library_name: transformers
pipeline_tag: summarization
tags:
- seq2seq
- summarization
- clinical
- patient-friendly
- mimic-iv-bhc
language:
- en
license: apache-2.0
datasets:
- mimic-iv-bhc
base_model: google/t5-small
model-index:
- name: Patient-Friendly Clinical Discharge Summarizer (T5-small)
results:
- task:
type: summarization
name: Abstractive Summarization
dataset:
      name: MIMIC-IV-BHC (Brief Hospital Course)
type: physionet
split: test
metrics:
- type: rouge1
value: 0.2126
- type: rouge2
value: 0.0958
- type: rougeL
value: 0.1547
- type: bleu
value: 0.0042
- type: bertscore_f1
value: 0.8339
---
# Model Card for Patient-Friendly Clinical Discharge Summarizer (T5-small)
## Model Details
### Model Description
Fine-tuned `google/t5-small` to rewrite **brief hospital course (BHC) discharge notes** from MIMIC-IV-BHC into patient-friendly summaries. Part of the *Patient-Friendly Summarization of Clinical Discharge Notes* project; a retrieval-augmented generation (RAG) variant was also explored for added factual grounding.
- **Developed by:** Dhyan Patel & Vidit Gandhi
- **Model type:** Encoder–decoder transformer (seq2seq)
- **Language(s):** English
- **License:** Apache-2.0
- **Finetuned from:** `google/t5-small`
### Model Sources
- **Repository:** <your model URL>
- **Dataset:** MIMIC-IV-BHC (PhysioNet)
## Uses
### Direct Use
- Generate lay summaries from brief hospital course (BHC) discharge notes.
### Downstream Use
- Embed in EHR/patient portals; pair with RAG to ground definitions and instructions.
### Out-of-Scope
- Automated diagnosis/prescribing or any unsupervised clinical decision-making.
## Bias, Risks, and Limitations
- May omit subtle clinical nuance, and discharge notes can include sensitive patient information; human review of every generated summary is required.
- Risk of hallucinations if fed incomplete context.
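
One source of incomplete context is truncation: T5-small inputs are capped at 512 tokens, so longer notes are silently cut off. A common mitigation is to split the note into tokenizer-sized chunks, summarize each chunk, and join the partial summaries. The sketch below is illustrative only; `chunk_ids` and `summarize_long_note` are hypothetical helpers, not part of the released model.

```python
from typing import List

def chunk_ids(ids: List[int], size: int) -> List[List[int]]:
    """Split a token-id sequence into consecutive chunks of at most `size` ids."""
    return [ids[i:i + size] for i in range(0, len(ids), size)]

def summarize_long_note(text: str, tok, model, chunk_size: int = 512) -> str:
    """Summarize each chunk independently and join the partial summaries.

    `tok` and `model` are the tokenizer and seq2seq model loaded as in the
    quick-start below; this wrapper is an assumption, not a shipped API.
    """
    import torch
    ids = tok(text, truncation=False)["input_ids"]
    parts = []
    for chunk in chunk_ids(ids, chunk_size):
        input_ids = torch.tensor([chunk])
        out = model.generate(input_ids=input_ids, max_length=128)
        parts.append(tok.decode(out[0], skip_special_tokens=True))
    return " ".join(parts)
```

Chunked summaries can still drop cross-chunk context, so the human-review caveat above applies with extra force here.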
## How to Get Started
```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_id = "your-username/patient-friendly-mimic-iv-bhc-t5-small"
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

text = "Paste a BHC discharge note..."
# Inputs longer than 512 tokens are truncated (see Limitations above).
inputs = tok(text, return_tensors="pt", truncation=True, max_length=512)
summary_ids = model.generate(**inputs, max_length=128, num_beams=4, early_stopping=True)
print(tok.decode(summary_ids[0], skip_special_tokens=True))
```
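
For intuition about the reported metrics, ROUGE-1 F1 is the F-measure over clipped unigram overlap between a generated summary and its reference. A minimal re-implementation is shown below for sanity-checking only; the numbers in the metadata come from a standard ROUGE package, which may additionally apply stemming and other normalization.

```python
from collections import Counter

def rouge1_f1(prediction: str, reference: str) -> float:
    """ROUGE-1 F1: clipped unigram-overlap F-measure (no stemming)."""
    pred = prediction.lower().split()
    ref = reference.lower().split()
    if not pred or not ref:
        return 0.0
    # Clipped overlap: each reference unigram can be matched at most as many
    # times as it occurs in the reference.
    overlap = sum((Counter(pred) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)
```

For example, `rouge1_f1("the cat", "the cat sat")` gives 0.8 (precision 1.0, recall 2/3).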