---
language: en
license: mit
tags:
- medical
- clinical-notes
- cardiac-arrest
- ohca
- biomedical-nlp
- transformers
- pubmedbert
library_name: transformers
pipeline_tag: text-classification
---

# OHCA Classifier V11: Temporal + Location-Aware Model

## Model Description

A transformer-based deep learning model that identifies Out-of-Hospital Cardiac Arrest (OHCA) cases in clinical notes.

**Key Innovation:** Combines semantic understanding (PubMedBERT) with explicit location and temporal features to distinguish OHCA from in-hospital cardiac arrest (IHCA).

## Training Data

- **Dataset**: MIMIC-III clinical notes
- **Size**: 330 notes (47 OHCA, 283 Non-OHCA)
- **Split**: 70% train / 15% validation / 15% test
- **Average note length**: 13,042 characters
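
The split ratios above can be illustrated with a small sketch. The card does not say whether the split was stratified by label; the version below assumes it was, since only 47 of the 330 notes are positive. The function name `stratified_split` and the seed are hypothetical and are not taken from the model's training code.

```python
import random
from collections import defaultdict

def stratified_split(labels, fractions=(0.70, 0.15, 0.15), seed=0):
    """Return (train, val, test) index lists, split separately per class.

    Assumption: stratification by label; the card only states the ratios.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train, val, test = [], [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_train = round(fractions[0] * len(indices))
        n_val = round(fractions[1] * len(indices))
        train += indices[:n_train]
        val += indices[n_train:n_train + n_val]
        test += indices[n_train + n_val:]  # remainder goes to test
    return train, val, test

# Class counts from the card: 47 OHCA (label 1) and 283 non-OHCA (label 0)
labels = [1] * 47 + [0] * 283
train, val, test = stratified_split(labels)
print(len(train), len(val), len(test))
```

Splitting per class keeps some positive notes in each partition, which matters when the positive class is this small.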

## Performance (C19 Validation - 647 notes)

| Metric | Score |
|--------|-------|
| **Sensitivity** | 92.1% |
| **Specificity** | 89.4% |
| **Precision** | 79.9% |
| **F1-Score** | 0.856 |
| **AUC-ROC** | 0.956 |
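
As a reading aid, the sketch below shows how the threshold-dependent metrics in the table relate to one another. The confusion-matrix counts are toy values chosen for illustration, not the counts from the C19 validation set, and the helper names are hypothetical. The final line checks that the reported F1 agrees with the reported precision and sensitivity.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute the threshold-dependent metrics from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)   # recall on OHCA notes
    specificity = tn / (tn + fp)   # recall on non-OHCA notes
    precision = tp / (tp + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": f1,
    }

def f1_from(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# Toy counts for illustration only (not the real validation counts)
toy = classification_metrics(tp=8, fp=2, tn=18, fn=2)
print(toy)

# Consistency check on the reported values: precision 79.9%, sensitivity 92.1%
reported_f1 = round(f1_from(0.799, 0.921), 3)
print(reported_f1)  # 0.856
```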

## Model Architecture

**Base Model**: `microsoft/BiomedNLP-PubMedBERT-base-uncased-abstract`

**Input Features (775 dimensions):**
- BERT embeddings: 768
- Location features: 2
  - OHCA location indicator count (22 phrases)
  - IHCA location indicator count (25 phrases)
- Temporal features: 5
  - Arrest timing score (when arrest occurred)
  - First location outside hospital (binary)
  - First location inside hospital (binary)
  - Movement outside→inside count
  - Movement inside→inside count

**Classifier**: 3-layer MLP (775 → 512 → 256 → 2)
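
A minimal PyTorch sketch of a classifier head with these dimensions is shown below. Only the layer sizes (775 → 512 → 256 → 2) and the 768 + 2 + 5 feature layout come from this card. The class name `OHCAClassifierHead`, the ReLU activations, the dropout rate, and the order in which features are concatenated are assumptions, so this will not load the published weights as written.

```python
import torch
import torch.nn as nn

class OHCAClassifierHead(nn.Module):
    """Hypothetical head: concatenated features -> 3-layer MLP -> 2 logits."""

    def __init__(self, bert_dim=768, n_location=2, n_temporal=5, dropout=0.1):
        super().__init__()
        in_dim = bert_dim + n_location + n_temporal  # 768 + 2 + 5 = 775
        self.mlp = nn.Sequential(
            nn.Linear(in_dim, 512),
            nn.ReLU(),            # activation is an assumption
            nn.Dropout(dropout),  # dropout is an assumption
            nn.Linear(512, 256),
            nn.ReLU(),
            nn.Dropout(dropout),
            nn.Linear(256, 2),
        )

    def forward(self, bert_embedding, location_features, temporal_features):
        # Concatenate along the feature dimension (order is an assumption)
        x = torch.cat(
            [bert_embedding, location_features, temporal_features], dim=-1
        )
        return self.mlp(x)

head = OHCAClassifierHead().eval()
with torch.no_grad():
    # Batch of 4 notes with random embeddings and zeroed handcrafted features
    logits = head(torch.randn(4, 768), torch.zeros(4, 2), torch.zeros(4, 5))
print(logits.shape)  # torch.Size([4, 2])
```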

## Key Features

### Location Features
**OHCA indicators**: home, EMS, scene, field, bystander, ambulance, paramedics, etc.

**IHCA indicators**: floor, ICU, ward, room, bed, code blue, admitted, telemetry, etc.
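
One plausible way to turn these indicator lists into the two location counts is sketched below. The phrase lists contain only the examples named in this card, not the full 22-phrase and 25-phrase lists, and whole-word case-insensitive matching is an assumption. The function name mirrors the commented placeholder in the Usage section, but the body is a guess rather than the author's implementation.

```python
import re

# Partial lists: only the example phrases given in this card
OHCA_PHRASES = ["home", "ems", "scene", "field", "bystander",
                "ambulance", "paramedics"]
IHCA_PHRASES = ["floor", "icu", "ward", "room", "bed", "code blue",
                "admitted", "telemetry"]

def count_phrases(text, phrases):
    """Count whole-word, case-insensitive occurrences of any phrase."""
    pattern = r"\b(?:" + "|".join(re.escape(p) for p in phrases) + r")\b"
    return len(re.findall(pattern, text, flags=re.IGNORECASE))

def extract_location_features(note):
    """Return [OHCA indicator count, IHCA indicator count]."""
    return [count_phrases(note, OHCA_PHRASES),
            count_phrases(note, IHCA_PHRASES)]

note = ("Patient found unresponsive at home by family. "
        "EMS arrived, ROSC achieved in field. Transported to ED.")
features = extract_location_features(note)
print(features)  # [3, 0]
```

Word-boundary matching keeps short phrases such as "bed" or "ward" from firing inside longer words.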

### Temporal Features
Captures the **story** of what happened:
- **When**: Before arrival vs during hospitalization
- **Where it started**: First location mentioned (inside/outside)
- **How patient moved**: Direction of transitions (outside→inside vs inside→inside)
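
The "where it started" and "how patient moved" ideas can be sketched as follows. This is an illustrative guess, not the released feature extractor. It assumes locations are detected with the same example phrases listed under Location Features and that a transition is any pair of consecutive location mentions. The arrest timing score is left out because this card does not define how it is computed.

```python
import re

# Example phrases from this card; the full lists are not published here
OUTSIDE = ["home", "ems", "scene", "field", "bystander",
           "ambulance", "paramedics"]
INSIDE = ["floor", "icu", "ward", "room", "bed", "code blue",
          "admitted", "telemetry"]

def location_sequence(note):
    """Return 'outside'/'inside' labels in the order they are mentioned."""
    lookup = {p: "outside" for p in OUTSIDE}
    lookup.update({p: "inside" for p in INSIDE})
    pattern = r"\b(?:" + "|".join(re.escape(p) for p in lookup) + r")\b"
    return [lookup[m.group(0).lower()]
            for m in re.finditer(pattern, note, flags=re.IGNORECASE)]

def extract_temporal_features(note):
    """Four of the five temporal features (arrest timing score omitted)."""
    seq = location_sequence(note)
    pairs = list(zip(seq, seq[1:]))  # consecutive location mentions
    return {
        "first_outside": int(bool(seq) and seq[0] == "outside"),
        "first_inside": int(bool(seq) and seq[0] == "inside"),
        "outside_to_inside": pairs.count(("outside", "inside")),
        "inside_to_inside": pairs.count(("inside", "inside")),
    }

note = ("Found down at home. EMS started CPR at the scene. "
        "Admitted to ICU, later moved to the floor.")
temporal = extract_temporal_features(note)
print(temporal)
```

For this note, the first mention is outside the hospital and there is a single outside→inside transition, which is the pattern the card associates with OHCA.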

## Usage

```python
# Note: Requires the custom model class and feature extraction code
# See model files for implementation details

from transformers import AutoTokenizer
import torch

# Load tokenizer
tokenizer = AutoTokenizer.from_pretrained("monajm36/ohca-classifier-v11")

# Example clinical note
note = """
Patient found unresponsive at home by family. 911 called.
EMS arrived, initiated CPR. ROSC achieved in field.
Transported to ED.
"""

# Extract features (requires custom code)
# location_features = extract_location_features(note)
# temporal_features = extract_temporal_features(note)

# Tokenize (text beyond 512 tokens is truncated)
inputs = tokenizer(note, return_tensors="pt", max_length=512, truncation=True)

# Predict (requires loading custom model architecture)
# ...
```

## Threshold Selection

Choose a threshold based on your clinical use case:

| Use Case | Threshold | Sensitivity | Specificity | F1 |
|----------|-----------|-------------|-------------|-----|
| **Screening (High Recall)** | 0.14 | 92.1% | 89.4% | 0.856 |
| **Balanced** | 0.74 | 82.3% | 93.2% | 0.831 |
| **Research (High Precision)** | 0.85 | 75.4% | 95.0% | 0.810 |
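
Applying one of these thresholds might look like the sketch below. The threshold values are copied from the table above. Everything else is an assumption made for illustration: that the model emits two logits ordered `[non_ohca, ohca]`, that the OHCA probability is the softmax of those logits, and the helper names `ohca_probability` and `classify`.

```python
import math

# Thresholds copied from the table above
THRESHOLDS = {"screening": 0.14, "balanced": 0.74, "research": 0.85}

def ohca_probability(logits):
    """Softmax over two logits [non_ohca, ohca]; returns P(OHCA).

    Assumption: index 1 is the OHCA class.
    """
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    return exps[1] / sum(exps)

def classify(logits, use_case="screening"):
    """Flag a note as OHCA if P(OHCA) meets the use-case threshold."""
    return ohca_probability(logits) >= THRESHOLDS[use_case]

logits = [0.4, 0.0]            # hypothetical model output for one note
p = ohca_probability(logits)   # roughly 0.40
print(classify(logits, "screening"), classify(logits, "balanced"))  # True False
```

The same note is flagged under the screening threshold but not under the balanced one, which is the sensitivity versus specificity trade-off the table describes.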

## Limitations

- Trained on data from a single institution (MIMIC-III)
- May not generalize to all clinical documentation styles
- IHCA false positive rate: ~28.5% at optimal threshold
- Requires feature extraction code (not included in model weights)
- Performs best on notes with clear EMS or location context

## Model Versions

This is **Version 11** - the latest and most accurate version.

| Version | Key Features | F1-Score |
|---------|--------------|----------|
| V9 | BERT only | 0.732 |
| V10 | + Location features | 0.814 |
| **V11** | **+ Temporal features** | **0.856** |

## Citation

```bibtex
@misc{moukaddem2025ohca,
  author = {Moukaddem, Mona},
  title = {OHCA Classifier V11: Temporal and Location-Aware Model for Out-of-Hospital Cardiac Arrest Identification},
  year = {2025},
  publisher = {Hugging Face},
  howpublished = {\url{https://huggingface.co/monajm36/ohca-classifier-v11}}
}
```

## Contact

For questions, issues, or collaboration opportunities, please open an issue on the model repository.

## Model Card Authors

Mona Moukaddem

## Acknowledgments

- Training data: MIMIC-III Clinical Database
- Validation data: UChicago C19 dataset
- Base model: Microsoft BiomedNLP-PubMedBERT