row56/ProtoPatient

@@ -97,6 +97,127 @@ Additionally, the model achieves high transferability on **i2b2** data (1,118 ad
 ## Repository Structure
-ProtoPatient/ ├── proto_model/ │ ├── proto.py │ ├── utils.py │ ├── metrics.py │ └── init.py ├── config.json ├── model.safetensors ├── tokenizer.json ├── tokenizer_config.json ├── vocab.txt ├── README.md └── .gitattributes

+ProtoPatient/
+├── proto_model/
+│   ├── proto.py
+│   ├── utils.py
+│   ├── metrics.py
+│   └── __init__.py
+├── config.json
+├── model.safetensors
+├── tokenizer.json
+├── tokenizer_config.json
+├── vocab.txt
+├── README.md
+└── .gitattributes
+## How to Use the Model
+### 1. Install Dependencies
+```bash
+pip install transformers torch
+```
+### 2. Load the Model via Hugging Face
+```python
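+import torch
+from transformers import AutoTokenizer, AutoModel
+
+repo_id = "row56/ProtoPatient"
+tokenizer = AutoTokenizer.from_pretrained(repo_id)
+model = AutoModel.from_pretrained(repo_id)
+model.eval()  # disable dropout for inference
+
+sample_text = "This patient presents with severe headaches and nausea..."
+# Truncate to the tokenizer's configured maximum length; admission notes are often long.
+inputs = tokenizer(sample_text, return_tensors="pt", truncation=True)
+with torch.no_grad():  # gradients are not needed for inference
+    outputs = model(**inputs)
+# Shape: (batch_size, sequence_length, hidden_size)
+print("Output shape:", outputs.last_hidden_state.shape)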
+```
+### 3. Interpreting Outputs
+For a full prototypical classification workflow, use the custom modules in `proto_model/` (e.g., `ProtoForMultiLabelClassification`) to inspect:
+- Which tokens receive high attention for each diagnosis.
+- Which prototypical patients are retrieved as similar examples.
+Using the standard `AutoModel` returns raw embeddings; the custom code is required for full label-wise attention and prototype retrieval.
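To make the two inspection points above concrete, here is a toy sketch of label-wise attention followed by prototype scoring. It is pure Python on hand-written 2-dimensional vectors and is not the code in `proto_model/`. The function names (`label_attention`, `prototype_score`), the dot-product attention, the negative Euclidean distance score, and all numeric values are assumptions chosen for illustration.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def label_attention(token_vectors, label_vector):
    """Attention-weighted average of token vectors for one diagnosis label."""
    # One score per token: dot product between the token and the label vector.
    scores = [sum(t * w for t, w in zip(vec, label_vector)) for vec in token_vectors]
    weights = softmax(scores)
    dim = len(token_vectors[0])
    pooled = [sum(w * vec[d] for w, vec in zip(weights, token_vectors)) for d in range(dim)]
    return weights, pooled

def prototype_score(pooled, prototype):
    """Negative Euclidean distance: closer to the prototype means a higher score."""
    return -math.dist(pooled, prototype)

# Toy data (hypothetical): three tokens with 2-dimensional embeddings.
tokens = ["severe", "headache", "today"]
token_vectors = [[0.2, 0.1], [1.5, 0.3], [0.0, 0.2]]
label_vector = [2.0, 0.0]  # hypothetical attention vector for one diagnosis
prototype = [1.4, 0.3]     # hypothetical prototype for the same diagnosis

weights, pooled = label_attention(token_vectors, label_vector)
score = prototype_score(pooled, prototype)
top_token = tokens[weights.index(max(weights))]
print(top_token, round(score, 3))
```

The attention weights answer the first question (which tokens matter for a diagnosis), and the distance to the prototype answers the second (how closely the note resembles a typical case). The real model may compute both differently.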
+---
+### 4. (Optional) Hugging Face Pipelines
+Integrate the model into a pipeline for feature extraction:
+```python
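+from transformers import pipeline
+
+repo_id = "row56/ProtoPatient"  # defined here so the snippet runs on its own
+extractor = pipeline("feature-extraction", model=repo_id, tokenizer=repo_id)
+embeddings = extractor("Severe headaches and vomiting...")
+# For a single input string the result is nested as [batch][token][hidden_dim].
+print(len(embeddings[0]), len(embeddings[0][0]))  # number of tokens, hidden size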
+```
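Token-level features often need to be reduced to one vector per note before further use. A minimal sketch of mean pooling follows, assuming the pipeline output is a nested list shaped `[batch][token][hidden_dim]`. The helper `mean_pool` and the stand-in data are hypothetical, and mean pooling is only one common choice, not the label-wise attention ProtoPatient itself relies on.

```python
def mean_pool(token_vectors):
    """Average a list of equal-length token vectors into one vector."""
    if not token_vectors:
        raise ValueError("expected at least one token vector")
    dim = len(token_vectors[0])
    count = len(token_vectors)
    return [sum(vec[d] for vec in token_vectors) / count for d in range(dim)]

# Stand-in for `extractor(text)` output, nested as [batch][token][hidden_dim].
fake_embeddings = [[[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]]
note_vector = mean_pool(fake_embeddings[0])
print(note_vector)
```

With real pipeline output, pass `embeddings[0]` in place of `fake_embeddings[0]`.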
+# Intended Use, Limitations & Ethical Considerations
+## Intended Use
+- **Research & Education:**
+  ProtoPatient is designed primarily for academic research and educational purposes in clinical NLP.
+- **Interpretability Demonstration:**
+  The model demonstrates how prototype-based methods can provide interpretable multi-label classification on clinical admission notes.
+## Limitations
+- **Generalization:**
+  The model was trained on public ICU admission notes from MIMIC-III, with i2b2 used to test transfer, and may not generalize to other patient populations or documentation styles.
+- **Prototype Scope:**
+  The current version uses a single prototype per diagnosis, although some diagnoses have several typical presentations. Supporting multiple prototypes per diagnosis is an area for future improvement.
+- **Inter-diagnosis Relationships:**
+  The model does not explicitly model relationships (e.g., conflicts or comorbidities) between different diagnoses.
+## Ethical & Regulatory Considerations
+- **Not for Direct Clinical Use:**
+  This model is not intended for direct clinical decision-making. Always consult healthcare professionals.
+- **Bias and Fairness:**
+  Users should be aware of potential biases in the training data; rare conditions might still be misclassified.
+- **Patient Privacy:**
+  When applying the model to real clinical data, patient privacy must be strictly maintained.
+---
+# Example Interpretability Output
+Based on the approach described in the paper (see Section 5 and Table 5):
+- **Highlighted Tokens:**
+  Tokens such as “worst headache of her life,” “vomiting,” “fever,” and “infiltrate” strongly indicate specific diagnoses.
+- **Prototypical Sample:**
+  A snippet from a training patient with similar text segments provides a rationale for the prediction.
+*This interpretability output aids clinicians in understanding the model's reasoning – for example: "The system suggests intracerebral hemorrhage because the patient's note closely resembles typical cases with that diagnosis."*
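As a rough illustration of how such an output could be assembled, the sketch below ranks tokens by attention weight and looks up the closest stored training example. The helper names (`top_k_tokens`, `nearest_example`), the example ids, and every number are hypothetical; the repository's own code may produce these outputs differently.

```python
import math

def top_k_tokens(tokens, weights, k=2):
    """Return the k tokens with the highest attention weight, highest first."""
    ranked = sorted(zip(tokens, weights), key=lambda pair: pair[1], reverse=True)
    return [token for token, _ in ranked[:k]]

def nearest_example(query_vector, examples):
    """Return the id of the stored example whose vector is closest to the query."""
    return min(examples, key=lambda ex_id: math.dist(query_vector, examples[ex_id]))

# Hypothetical values for illustration only.
tokens = ["worst", "headache", "of", "her", "life"]
weights = [0.30, 0.45, 0.02, 0.03, 0.20]
training_examples = {"patient_a": [0.9, 0.1], "patient_b": [0.1, 0.8]}

highlighted = top_k_tokens(tokens, weights, k=2)
closest = nearest_example([0.8, 0.2], training_examples)
print(highlighted, closest)
```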
+---
+# Recommended Citation
+If you use ProtoPatient in your research, please cite:
+```bibtex
+@misc{vanaken2022this,
+  title={This Patient Looks Like That Patient: Prototypical Networks for Interpretable Diagnosis Prediction from Clinical Text},
+  author={van Aken, Betty and Papaioannou, Jens-Michalis and Naik, Marcel G. and Eleftheriadis, Georgios and Nejdl, Wolfgang and Gers, Felix A. and L{\"o}ser, Alexander},
+  year={2022},
+  eprint={2210.08500},
+  archivePrefix={arXiv},
+  primaryClass={cs.CL}
+}