Update README.md
README.md
@@ -27,43 +27,131 @@ model-index:
Previous version of the section (lines 27–69):

````diff
 ---
 
 # ProtoPatient Model for Multi-Label Classification
 
-
 ProtoPatient/
-
 │ ├── proto.py
 │ ├── utils.py
 │ ├── metrics.py
 │ ├── __init__.py
-
-### **1. Install Dependencies**
-Ensure you have `transformers` and `torch` installed:
-```bash
 pip install transformers torch
-
-model = AutoModel.from_pretrained("row56/ProtoPatient")
-print("✅ Model with weights loaded successfully!")
-```
````
New version (lines 27–156):

---

# ProtoPatient Model for Multi-Label Classification

## Paper Reference

van Aken, Betty, Jens-Michalis Papaioannou, Marcel G. Naik, Georgios Eleftheriadis, Wolfgang Nejdl, Felix A. Gers, and Alexander Löser. 2022. "This Patient Looks Like That Patient: Prototypical Networks for Interpretable Diagnosis Prediction from Clinical Text." (arXiv:2210.08500)

ProtoPatient is a transformer-based architecture that uses prototypical networks and label-wise attention for multi-label classification of clinical admission notes. Unlike standard black-box models, ProtoPatient offers inherent interpretability:

- It highlights the most relevant tokens for each possible diagnosis.
- It retrieves prototypical patients from the training set who exhibit similar textual patterns, providing intuitive justifications to clinicians: "This patient looks like that patient."
## Model Overview

### Prototype-Based Classification

#### Prototypical Vectors

- The model learns a prototypical vector `u_c` for each diagnosis `c`.

#### Diagnosis-Specific Representation

- A patient's admission note is mapped (via a PubMedBERT encoder plus a linear compression layer) to a diagnosis-specific representation, generated by a label-wise attention mechanism.

#### Classification Scores

- Classification scores are computed as the negative Euclidean distance between the diagnosis-specific representation and the prototype vector `u_c`, yielding a direct measure of "this note's similarity to the learned prototype."

### Label-Wise Attention

- For each diagnosis, a separate attention vector identifies relevant tokens in the admission note.
- This yields interpretability: the most attended-to tokens are presumably the evidence driving each diagnosis prediction.
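The scoring mechanics above can be sketched in a few lines of PyTorch. This is a minimal illustration with made-up dimensions and random tensors, not the repository's actual implementation (the real code lives in `proto_model/proto.py`):

```python
import torch

# Toy dimensions, chosen only for illustration
num_labels, hidden, seq_len = 4, 8, 10

token_embeddings = torch.randn(seq_len, hidden)      # encoder output for one note
attention_vectors = torch.randn(num_labels, hidden)  # one attention vector per diagnosis
prototypes = torch.randn(num_labels, hidden)         # learned prototype u_c per diagnosis

# Label-wise attention: each diagnosis attends to its own set of tokens
attn_weights = torch.softmax(attention_vectors @ token_embeddings.T, dim=-1)  # (labels, seq)

# Diagnosis-specific representation: attention-weighted average of token embeddings
label_repr = attn_weights @ token_embeddings  # (labels, hidden)

# Classification score: negative Euclidean distance to each label's prototype
scores = -torch.norm(label_repr - prototypes, dim=-1)  # (labels,), always <= 0
```

Higher (less negative) scores mean the note sits closer to that diagnosis's prototype, and the rows of `attn_weights` show which tokens drove each score.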
### Interpretable Output

#### Token Highlights

- The top attended words, which often correlate with symptoms, risk factors, or diagnostic descriptors.

#### Prototypical Patients

- The training examples closest to each prototype, exemplifying typical presentations of a diagnosis.
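As a toy illustration of the token-highlight idea (the tokens and attention weights below are hypothetical, not model output):

```python
import torch

# Hypothetical tokens and per-token attention scores for one diagnosis
tokens = ["worst", "headache", "of", "her", "life", "with", "vomiting"]
attn = torch.softmax(torch.tensor([2.5, 3.0, 0.1, 0.1, 0.2, 0.3, 2.0]), dim=-1)

# Keep the 3 most-attended tokens, in reading order
top = torch.topk(attn, k=3).indices.tolist()
highlighted = [tokens[i] for i in sorted(top)]
print(highlighted)  # ['worst', 'headache', 'vomiting']
```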
## Key Features and Benefits

#### Improved Performance on Rare Diagnoses

- ProtoPatient leverages prototype-based learning, which has shown strong few-shot behavior that is especially beneficial for diagnoses with very few samples.

#### Faithful Interpretations

- A quantitative study (see paper, Section 5) shows that ProtoPatient's attention-based highlights are more faithful to the model's true decision process than post-hoc explainers such as LIME, occlusion, or gradient-based methods.

#### Clinical Utility

- Offers label-wise explanations to help clinicians quickly assess whether the system's reasoning aligns with actual risk factors.
- Points out prototypical patients, allowing doctors to compare and contrast new admissions with typical (or atypical) presentations.
## Performance Metrics

Evaluated on MIMIC-III (48,745 admission notes, 1,266 diagnosis labels):

- Macro ROC AUC: ~87–88%
- Micro ROC AUC: ~97%
- Macro PR AUC: ~18–21%

Performance gains are particularly strong for rare diagnoses (fewer than 50 samples) compared to baselines such as PubMedBERT alone or hierarchical attention RNNs (HAN, HA-GRU).

The model was additionally tested on i2b2 data (1,118 admission notes), showing high transferability across different clinical environments.

(Refer to Tables 1, 2, and 3 in the paper for detailed results and ablation studies.)
## Repository Structure

```
ProtoPatient/
├── proto_model/
│   ├── proto.py
│   ├── utils.py
│   ├── metrics.py
│   └── __init__.py
├── config.json
├── model.safetensors
├── tokenizer.json
├── tokenizer_config.json
├── vocab.txt
├── README.md
└── .gitattributes
```
## How to Use the Model

### Install Dependencies

```bash
pip install transformers torch
```

Optionally, install `safetensors` if you want to load the `.safetensors` weights file.
### Load the Model via Hugging Face

```python
from transformers import AutoTokenizer, AutoModel

repo_id = "row56/ProtoPatient"
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModel.from_pretrained(repo_id)
model.eval()

sample_text = "This patient presents with severe headaches and nausea..."
inputs = tokenizer(sample_text, return_tensors="pt")
outputs = model(**inputs)
print("Output shape:", outputs.last_hidden_state.shape)
```
### Interpreting Outputs

For the full prototypical classification approach, you would generally use the custom modules in `proto_model/` (e.g., `ProtoForMultiLabelClassification`) to check which tokens are most attended per label and which "prototype patients" are most similar.

If you use the standard `AutoModel`, you can still get the raw embeddings, but you will need the custom code for label-wise attention and prototype retrieval.
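Prototype retrieval ("this patient looks like that patient") can be sketched as a nearest-neighbor search over stored training representations. The tensors below are random stand-ins for the diagnosis-specific representations the custom `proto_model/` code would produce:

```python
import torch

# Random stand-ins: diagnosis-specific representations of 100 training patients
train_reprs = torch.randn(100, 8)
new_repr = torch.randn(8)  # representation of the new admission note

# Smallest Euclidean distance = most prototypical match
dists = torch.norm(train_reprs - new_repr, dim=-1)
nearest = torch.topk(-dists, k=3).indices.tolist()  # 3 most similar training patients
```

The notes of the retrieved training patients then serve as the human-readable justification for the prediction.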
### (Optional) Hugging Face Pipelines

You can integrate the model into a pipeline (e.g., feature extraction) to simplify usage:

```python
from transformers import pipeline

repo_id = "row56/ProtoPatient"
extractor = pipeline("feature-extraction", model=repo_id, tokenizer=repo_id)
embeddings = extractor("Severe headaches and vomiting...")
print(len(embeddings), len(embeddings[0]))  # token-level features
```
## Intended Use, Limitations & Ethical Considerations

### Intended Use

- ProtoPatient is intended primarily for research and education in clinical NLP.
- It demonstrates how to leverage prototype-based interpretability for multi-label classification on admission notes.

### Limitations

- The model was trained on public ICU datasets (MIMIC-III, i2b2) and may not generalize to other patient populations.
- The currently released version learns only one prototype per diagnosis; some diagnoses have multiple typical presentations, which is an area for future research.
- It does not explicitly model inter-diagnosis relationships (e.g., conflicts or comorbidities).

### Ethical & Regulatory

- This model is not intended for direct clinical use. Always consult healthcare professionals for medical decisions.
- Users must be aware of potential biases in the training data; rare conditions can still be misclassified despite the improvements.
- Patient privacy must be strictly maintained when applying the model to real hospital data.
## Example Interpretability Output

Based on the approach in the paper (Section 5 and Table 5):

- Highlighted tokens: terms that strongly indicate a certain diagnosis (e.g., "worst headache of her life," "vomiting," "fever," "infiltrate").
- Prototypical sample: a snippet from a training patient with very similar text segments (e.g., describing similar symptoms, risk factors, or diagnoses).

This gives clinicians a rationale of the form: "The system thinks your patient has intracerebral hemorrhage because they exhibit text segments similar to a previous patient who had that diagnosis."
## Recommended Citation

If you use ProtoPatient in your research, please cite:

```bibtex
@misc{vanaken2022this,
  title={This Patient Looks Like That Patient: Prototypical Networks for Interpretable Diagnosis Prediction from Clinical Text},
  author={van Aken, Betty and Papaioannou, Jens-Michalis and Naik, Marcel G. and Eleftheriadis, Georgios and Nejdl, Wolfgang and Gers, Felix A. and L{\"o}ser, Alexander},
  year={2022},
  eprint={2210.08500},
  archivePrefix={arXiv},
  primaryClass={cs.CL}
}
```