# ProtoPatient Model for Multi-Label Classification
## Paper Reference
**van Aken, Betty, Jens-Michalis Papaioannou, Marcel G. Naik, Georgios Eleftheriadis, Wolfgang Nejdl, Felix A. Gers, and Alexander Löser. 2022.**
*This Patient Looks Like That Patient: Prototypical Networks for Interpretable Diagnosis Prediction from Clinical Text.*
[arXiv:2210.08500](https://arxiv.org/abs/2210.08500)
ProtoPatient is a transformer-based architecture that uses prototypical networks and label-wise attention for multi-label classification on clinical admission notes. Unlike standard black-box models, ProtoPatient offers inherent interpretability by:

- **Highlighting relevant tokens:** it shows the most important words for each possible diagnosis.
- **Retrieving prototypical patients:** it finds training examples with similar textual patterns, providing intuitive justifications for clinicians: "This patient looks like that patient."
## Model Overview
### Prototype-Based Classification
- The model learns **prototypical vectors** \(u_c\), one for each diagnosis \(c\).
- A patient's admission note is encoded via a PubMedBERT encoder and a linear compression layer into a diagnosis-specific representation \(v_{p,c}\), produced by a label-wise attention mechanism.
- Classification scores are computed as the **negative Euclidean distance** between \(v_{p,c}\) and \(u_c\), a direct measure of the note's similarity to the learned prototype.
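As a minimal sketch of the scoring step above (plain Python, with made-up two-dimensional vectors standing in for the learned representations and prototypes):

```python
import math

def prototype_score(v_pc, u_c):
    # Negative Euclidean distance between a diagnosis-specific note
    # representation v_pc and the learned prototype vector u_c:
    # notes closer to the prototype get a higher (less negative) score.
    return -math.sqrt(sum((a - b) ** 2 for a, b in zip(v_pc, u_c)))

print(prototype_score([1.0, 2.0], [4.0, 6.0]))  # -5.0
```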
### Label-Wise Attention
- For each diagnosis, a separate attention vector identifies relevant tokens in the admission note.
- This mechanism provides interpretability by indicating which tokens are most influential in driving each prediction.
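A toy sketch of one label's attention step (pure Python with invented two-dimensional token vectors; the actual model applies learned query vectors over transformer token embeddings):

```python
import math

def label_attention(token_vecs, label_query):
    # Dot-product score of each token against this diagnosis's query vector.
    scores = [sum(t * q for t, q in zip(tok, label_query)) for tok in token_vecs]
    # Softmax over tokens gives per-token relevance weights.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    weights = [e / sum(exps) for e in exps]
    # Attention-weighted sum = diagnosis-specific note representation.
    dim = len(token_vecs[0])
    rep = [sum(w * tok[d] for w, tok in zip(weights, token_vecs)) for d in range(dim)]
    return weights, rep

tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weights, rep = label_attention(tokens, [2.0, 0.0])
# Tokens aligned with the label's query receive the highest weights.
```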
### Interpretable Output
- **Token Highlights:** the top attended words (often correlating with symptoms, risk factors, or diagnostic descriptors).
- **Prototypical Patients:** examples from the training set that are closest to each prototype, representing typical presentations of a diagnosis.
## Key Features and Benefits
- **Improved Performance on Rare Diagnoses:** Prototype-based learning has strong few-shot capabilities, which is especially beneficial for diagnoses with very few training samples.
- **Faithful Interpretations:** Quantitative evaluations (see Section 5 of the paper) indicate that the attention-based highlights are more faithful to the model's decision process than post-hoc methods such as LIME, occlusion, and gradient-based approaches.
- **Clinical Utility:** Label-wise explanations help clinicians check whether predictions align with actual risk factors, and retrieved prototypical patients allow new cases to be compared with typical (or atypical) presentations.
## Performance Metrics
Evaluated on MIMIC-III (48,745 admission notes, 1,266 diagnosis labels):
Performance (approximate):

- **Macro ROC AUC:** ~87–88%
- **Micro ROC AUC:** ~97%
- **Macro PR AUC:** ~18–21%

The model shows particularly strong gains for rare diagnoses (fewer than 50 training samples) compared with baselines such as PubMedBERT alone or hierarchical attention RNNs (e.g., HAN, HA-GRU).

The model also transfers well across clinical environments, as shown on **i2b2** data (1,118 admission notes).

*Refer to Tables 1, 2, and 3 in the paper for detailed results and ablation studies.*
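To see why micro and macro scores can differ so sharply on a long-tailed label set, here is a toy illustration with invented per-label AUCs and supports. (True micro AUC pools all label-instance decisions rather than averaging per-label scores, but a frequency-weighted mean shows the same effect.)

```python
# Hypothetical per-label ROC AUCs and label frequencies (supports).
aucs = {"common_dx": 0.98, "mid_dx": 0.90, "rare_dx": 0.60}
support = {"common_dx": 10000, "mid_dx": 500, "rare_dx": 20}

# Macro average: every label counts equally, so rare labels drag it down.
macro = sum(aucs.values()) / len(aucs)

# Frequency-weighted average: dominated by common labels, as micro scores are.
weighted = sum(aucs[l] * support[l] for l in aucs) / sum(support.values())

print(round(macro, 3))     # 0.827
print(round(weighted, 3))  # 0.975
```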
## Repository Structure
```
ProtoPatient/
├── proto_model/
│   ├── proto.py
│   ├── utils.py
│   ├── metrics.py
│   └── __init__.py
├── config.json
├── model.safetensors
├── tokenizer.json
├── tokenizer_config.json
├── vocab.txt
├── README.md
└── .gitattributes
```
## How to Use the Model
### Install Dependencies
```bash
pip install transformers torch
```

Optionally, install `safetensors` if you want to load the `.safetensors` weights file.
### Load the Model via Hugging Face
```python
from transformers import AutoTokenizer, AutoModel

repo_id = "row56/ProtoPatient"
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModel.from_pretrained(repo_id)
model.eval()

sample_text = "This patient presents with severe headaches and nausea..."
inputs = tokenizer(sample_text, return_tensors="pt")
outputs = model(**inputs)
print("Output shape:", outputs.last_hidden_state.shape)
```
### Interpreting Outputs
For the full prototypical classification approach, use the custom modules in `proto_model/` (e.g., `ProtoForMultiLabelClassification`) to check which tokens are highly attended per label and which prototypical patients are most similar.

If you use the standard `AutoModel`, you still get the raw embeddings, but you will need the custom code for label-wise attention and prototype retrieval.
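As a sketch of what "highly attended tokens" means in practice, assuming you have already extracted per-token attention weights for one diagnosis (both the token list and the weights below are made up for illustration):

```python
# Hypothetical tokens and label-wise attention weights for one diagnosis.
tokens = ["severe", "headaches", "and", "nausea", "since", "yesterday"]
weights = [0.30, 0.35, 0.02, 0.25, 0.04, 0.04]

# Rank tokens by attention weight and keep the top three as highlights.
top = sorted(zip(tokens, weights), key=lambda tw: tw[1], reverse=True)[:3]
print([t for t, _ in top])  # ['headaches', 'severe', 'nausea']
```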
### (Optional) Hugging Face Pipelines
You can integrate the model into a pipeline (e.g., feature extraction) to simplify usage:

```python
from transformers import pipeline

repo_id = "row56/ProtoPatient"
extractor = pipeline("feature-extraction", model=repo_id, tokenizer=repo_id)
embeddings = extractor("Severe headaches and vomiting...")
print(len(embeddings), len(embeddings[0]))  # token-level features
```
## Intended Use, Limitations & Ethical Considerations
### Intended Use:

- ProtoPatient is primarily for research and education in clinical NLP.
- It demonstrates how to leverage prototype-based interpretability for multi-label classification on admission notes.

### Limitations:

- The model was trained on public ICU datasets (MIMIC-III, i2b2) and may not generalize to other patient populations.
- The currently released version considers only one prototype per diagnosis; some diagnoses have multiple typical presentations, which is an area for future research.
- It does not explicitly model inter-diagnosis relationships (e.g., conflicts or comorbidities).

### Ethical & Regulatory:

- This model is not intended for direct clinical use. Always consult healthcare professionals for medical decisions.
- Users must be aware of potential biases in the training data; rare conditions can still be misclassified despite the improvements.
- Patient privacy must be strictly maintained when applying the model to real hospital data.
## Example Interpretability Output
Based on the approach in the paper (Section 5 and Table 5):

- **Highlighted tokens:** terms that strongly indicate a certain diagnosis (e.g., "worst headache of her life," "vomiting," "fever," "infiltrate").
- **Prototypical sample:** a snippet from a training patient with very similar text segments (e.g., describing similar symptoms, risk factors, or diagnoses).

This provides clinicians with rationales: "The system thinks your patient has intracerebral hemorrhage because they exhibit text segments similar to a previous patient who had that diagnosis."
## Recommended Citation
If you use ProtoPatient in your research, please cite:

```bibtex
@misc{vanaken2022this,
  title={This Patient Looks Like That Patient: Prototypical Networks for Interpretable Diagnosis Prediction from Clinical Text},
  author={van Aken, Betty and Papaioannou, Jens-Michalis and Naik, Marcel G. and Eleftheriadis, Georgios and Nejdl, Wolfgang and Gers, Felix A. and L{\"o}ser, Alexander},
  year={2022},
  eprint={2210.08500},
  archivePrefix={arXiv},
  primaryClass={cs.CL}
}
```