---
library_name: transformers
pipeline_tag: summarization
tags:
  - seq2seq
  - summarization
  - clinical
  - patient-friendly
  - mimic-iv-bhc
language:
  - en
license: apache-2.0
datasets:
  - mimic-iv-bhc
base_model: google/t5-small
model-index:
  - name: Patient-Friendly Clinical Discharge Summarizer (T5-small)
    results:
      - task:
          type: summarization
          name: Abstractive Summarization
        dataset:
          name: MIMIC-IV-BHC (Brief Hospital Course)
          type: physionet
          split: test
        metrics:
          - type: rouge1
            value: 0.2126
          - type: rouge2
            value: 0.0958
          - type: rougeL
            value: 0.1547
          - type: bleu
            value: 0.0042
          - type: bertscore_f1
            value: 0.8339
---

# Model Card for Patient-Friendly Clinical Discharge Summarizer (T5-small)

## Model Details

### Model Description

This model fine-tunes google/t5-small to rewrite brief hospital course (BHC) sections of MIMIC-IV discharge notes into patient-friendly summaries. It is part of the Patient-Friendly Summarization of Clinical Discharge Notes project, which also explored a retrieval-augmented generation (RAG) variant for added factual grounding.

- **Developed by:** Dhyan Patel & Vidit Gandhi
- **Model type:** Encoder–decoder transformer (seq2seq)
- **Language(s):** English
- **License:** Apache-2.0
- **Finetuned from:** google/t5-small

### Model Sources

- **Repository:**
- **Dataset:** MIMIC-IV-BHC (PhysioNet)

## Uses

### Direct Use

- Generate lay summaries from brief hospital course discharge notes.

### Downstream Use

- Embed in EHR systems or patient portals; pair with retrieval-augmented generation (RAG) to ground definitions and instructions in trusted sources.
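
In a full RAG setup, the definitions would come from a retriever over a trusted source. The sketch below fakes that retrieval step with a hard-coded glossary (`GLOSSARY` and `build_grounded_input` are hypothetical names, not part of this model) just to show how grounded context can be prepended to a note before summarization.

```python
# Minimal grounding sketch: look up clinical terms in a small glossary and
# prepend the matching plain-language definitions to the note, so the
# summarizer sees the context it needs. Illustrative only.
GLOSSARY = {
    "hypertension": "hypertension (high blood pressure)",
    "dyspnea": "dyspnea (shortness of breath)",
    "anticoagulant": "anticoagulant (blood-thinning medicine)",
}

def build_grounded_input(note: str, glossary: dict[str, str]) -> str:
    """Prepend definitions for any glossary terms found in the note."""
    hits = [defn for term, defn in glossary.items() if term in note.lower()]
    context = "Definitions: " + "; ".join(hits) + ". " if hits else ""
    return context + note

note = "Patient admitted with dyspnea; discharged on an anticoagulant."
print(build_grounded_input(note, GLOSSARY))
```

The grounded string can then be passed to the summarizer in place of the raw note; a real retriever would replace the dictionary lookup.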

### Out-of-Scope Use

- Automated diagnosis, prescribing, or any unsupervised clinical decision-making.

## Bias, Risks, and Limitations

- Summaries may omit subtle clinical nuance, and discharge notes can include sensitive content; human review is required before output reaches patients.
- The model can hallucinate details when given incomplete or truncated context.

## How to Get Started

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_id = "your-username/patient-friendly-mimic-iv-bhc-t5-small"
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

text = "Paste a BHC discharge note..."

# T5-small accepts at most 512 input tokens; longer notes are truncated.
inputs = tok(text, return_tensors="pt", truncation=True, max_length=512)

# Beam search tends to produce more fluent summaries than greedy decoding.
summary_ids = model.generate(**inputs, max_length=128, num_beams=4)
print(tok.decode(summary_ids[0], skip_special_tokens=True))
```
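
Because the snippet above truncates input at 512 tokens, long notes lose content. One workaround, sketched here with a hypothetical `chunk_words` helper (word-based splitting as a rough stand-in for proper token counting with the model's tokenizer), is to summarize overlapping chunks and join the partial summaries.

```python
# Split a long note into overlapping word chunks so each piece fits the
# encoder's input window. chunk_words is an illustrative helper; real code
# should count tokens with the model's tokenizer rather than words.
def chunk_words(text: str, size: int = 400, overlap: int = 50) -> list[str]:
    words = text.split()
    step = size - overlap
    return [" ".join(words[i:i + size])
            for i in range(0, max(len(words) - overlap, 1), step)]

# Usage with the model loaded above (summarize each chunk, then join):
#   partial = [summarize(c) for c in chunk_words(long_note)]
#   final = " ".join(partial)
print(len(chunk_words("word " * 1000)))  # prints 3
```

The 50-word overlap keeps sentences that straddle a chunk boundary visible to at least one pass; the concatenated result may still need a final smoothing pass through the model.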