---
language: en
license: mit
tags:
- medical
- clinical-notes
- cardiac-arrest
- ohca
- biomedical-nlp
- transformers
- pubmedbert
library_name: transformers
pipeline_tag: text-classification
---
# OHCA Classifier V11: Temporal + Location-Aware Model
## Model Description
A transformer-based deep learning model for automatically identifying Out-of-Hospital Cardiac Arrest (OHCA) cases from clinical notes.
**Key Innovation:** Combines semantic understanding (PubMedBERT) with explicit location and temporal features to distinguish OHCA from in-hospital cardiac arrest (IHCA).
## Training Data
- **Dataset**: MIMIC-III clinical notes
- **Size**: 330 notes (47 OHCA, 283 Non-OHCA)
- **Split**: 70% train / 15% validation / 15% test
- **Average note length**: 13,042 characters
## Performance (C19 Validation - 647 notes)
Scores below correspond to the screening threshold (0.14) listed under Threshold Selection.
| Metric | Score |
|--------|-------|
| **Sensitivity** | 92.1% |
| **Specificity** | 89.4% |
| **Precision** | 79.9% |
| **F1-Score** | 0.856 |
| **AUC-ROC** | 0.956 |
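
For readers who want to recompute these metrics on their own data, here is a minimal sketch of the standard definitions. The function names are illustrative and are not part of this repository; the code assumes OHCA is the positive class and that no denominator is zero.

```python
def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Compute sensitivity, specificity, precision, and F1 from confusion-matrix counts.

    OHCA is treated as the positive class (an assumption of this sketch).
    """
    sensitivity = tp / (tp + fn)  # share of true OHCA notes that were flagged
    specificity = tn / (tn + fp)  # share of non-OHCA notes that were not flagged
    precision = tp / (tp + fp)    # share of flagged notes that are truly OHCA
    f1 = f1_score(precision, sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": f1,
    }


def f1_score(precision: float, sensitivity: float) -> float:
    """Harmonic mean of precision and sensitivity (recall)."""
    return 2 * precision * sensitivity / (precision + sensitivity)
```

As a consistency check, `round(f1_score(0.799, 0.921), 3)` gives `0.856`, which matches the F1-Score in the table.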
## Model Architecture
**Base Model**: `microsoft/BiomedNLP-PubMedBERT-base-uncased-abstract`
**Input Features (775 dimensions):**
- BERT embeddings: 768
- Location features: 2
  - OHCA location indicator count (22 phrases)
  - IHCA location indicator count (25 phrases)
- Temporal features: 5
  - Arrest timing score (when arrest occurred)
  - First location outside hospital (binary)
  - First location inside hospital (binary)
  - Movement outside→inside count
  - Movement inside→inside count
**Classifier**: 3-layer MLP (775 → 512 → 256 → 2)
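
The shapes above can be sketched in PyTorch as follows. This is an illustration under stated assumptions, not the released implementation: the class name `OHCAClassifierHead` is hypothetical, and the ReLU activations are assumed because the card gives only the layer sizes. Dropout and normalization, if the actual model uses them, are not shown.

```python
import torch
from torch import nn


class OHCAClassifierHead(nn.Module):
    """Hypothetical sketch of the 775 -> 512 -> 256 -> 2 classifier head."""

    def __init__(self, embed_dim: int = 768, n_location: int = 2,
                 n_temporal: int = 5, n_classes: int = 2):
        super().__init__()
        in_dim = embed_dim + n_location + n_temporal  # 768 + 2 + 5 = 775
        self.mlp = nn.Sequential(
            nn.Linear(in_dim, 512),
            nn.ReLU(),  # activation is an assumption; the card does not specify it
            nn.Linear(512, 256),
            nn.ReLU(),
            nn.Linear(256, n_classes),
        )

    def forward(self, embedding: torch.Tensor, location: torch.Tensor,
                temporal: torch.Tensor) -> torch.Tensor:
        # Concatenate the BERT embedding with the handcrafted features
        # along the feature axis, then classify.
        features = torch.cat([embedding, location, temporal], dim=-1)
        return self.mlp(features)  # raw logits, shape (batch, 2)
```

Calling the head with tensors of shape `(batch, 768)`, `(batch, 2)`, and `(batch, 5)` returns logits of shape `(batch, 2)`.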
## Key Features
### Location Features
**OHCA indicators**: home, EMS, scene, field, bystander, ambulance, paramedics, etc.
**IHCA indicators**: floor, ICU, ward, room, bed, code blue, admitted, telemetry, etc.
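
The two location counts could be computed roughly as below. This is a sketch, not the released feature extractor: the phrase lists contain only the examples named above (the full 22 and 25 phrase lists are not reproduced in this card), and case-insensitive whole-word matching is an assumption.

```python
import re

# Only the example phrases listed in this card; the full lists are longer.
OHCA_PHRASES = ["home", "EMS", "scene", "field", "bystander", "ambulance", "paramedics"]
IHCA_PHRASES = ["floor", "ICU", "ward", "room", "bed", "code blue", "admitted", "telemetry"]


def count_phrases(note: str, phrases: list) -> int:
    """Count case-insensitive, whole-word occurrences of each phrase in the note."""
    total = 0
    for phrase in phrases:
        pattern = r"\b" + re.escape(phrase) + r"\b"
        total += len(re.findall(pattern, note, flags=re.IGNORECASE))
    return total


def extract_location_features(note: str) -> tuple:
    """Return (OHCA indicator count, IHCA indicator count) for one note."""
    return count_phrases(note, OHCA_PHRASES), count_phrases(note, IHCA_PHRASES)
```

For a note such as "Patient found unresponsive at home by family. EMS arrived, initiated CPR. ROSC achieved in field.", this sketch counts three OHCA indicators (home, EMS, field) and no IHCA indicators.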
### Temporal Features
Captures the **story** of what happened:
- **When**: Before arrival vs during hospitalization
- **Where it started**: First location mentioned (inside/outside)
- **How patient moved**: Direction of transitions (outside→inside vs inside→inside)
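
One way to derive the "where it started" and "how the patient moved" signals is to scan the note for location phrases in order of appearance and count transitions between consecutive mentions. The sketch below is an assumption about the approach, not the released code: it reuses only the example phrases from this card, and it omits the arrest timing score because the card does not describe how that score is computed.

```python
import re

# Example phrases from this card only; the real lists are longer.
OUTSIDE = {"home", "ems", "scene", "field", "bystander", "ambulance", "paramedics"}
INSIDE = {"floor", "icu", "ward", "room", "bed", "code blue", "admitted", "telemetry"}

# Longest phrases first so multi-word phrases such as "code blue" match as a unit.
_PHRASES = sorted(OUTSIDE | INSIDE, key=len, reverse=True)
_PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(p) for p in _PHRASES) + r")\b",
    flags=re.IGNORECASE,
)


def location_sequence(note: str) -> list:
    """Label each location mention, in order of appearance, as 'out' or 'in'."""
    return ["out" if m.group(1).lower() in OUTSIDE else "in"
            for m in _PATTERN.finditer(note)]


def extract_temporal_features(note: str) -> dict:
    """Return four of the five temporal features (timing score not included)."""
    seq = location_sequence(note)
    transitions = list(zip(seq, seq[1:]))  # consecutive pairs of mentions
    return {
        "first_outside": int(bool(seq) and seq[0] == "out"),
        "first_inside": int(bool(seq) and seq[0] == "in"),
        "outside_to_inside": transitions.count(("out", "in")),
        "inside_to_inside": transitions.count(("in", "in")),
    }
```

For "Found down at home. EMS transported to ICU. Later moved to floor bed.", the mention sequence is out, out, in, in, in, so the first location is outside, with one outside→inside transition and two inside→inside transitions.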
## Usage
```python
# Note: Requires custom model class and feature extraction
# See model files for implementation details
from transformers import AutoTokenizer
import torch
# Load tokenizer
tokenizer = AutoTokenizer.from_pretrained("monajm36/ohca-classifier-v11")
# Example clinical note
note = """
Patient found unresponsive at home by family. 911 called.
EMS arrived, initiated CPR. ROSC achieved in field.
Transported to ED.
"""
# Extract features (requires custom code)
# location_features = extract_location_features(note)
# temporal_features = extract_temporal_features(note)
# Tokenize
inputs = tokenizer(note, return_tensors="pt", max_length=512, truncation=True)
# Predict (requires loading custom model architecture)
# ...
```
## Threshold Selection
Choose a threshold based on your clinical use case:
| Use Case | Threshold | Sensitivity | Specificity | F1 |
|----------|-----------|-------------|-------------|-----|
| **Screening (High Recall)** | 0.14 | 92.1% | 89.4% | 0.856 |
| **Balanced** | 0.74 | 82.3% | 93.2% | 0.831 |
| **Research (High Precision)** | 0.85 | 75.4% | 95.0% | 0.810 |
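
Applying one of these thresholds might look like the sketch below. It assumes the threshold is compared against the model's predicted OHCA probability and that a probability equal to the threshold counts as positive; neither detail is stated in this card, and the names `THRESHOLDS` and `is_ohca` are illustrative.

```python
# Thresholds from the table above, keyed by use case.
THRESHOLDS = {
    "screening": 0.14,  # high recall
    "balanced": 0.74,
    "research": 0.85,   # high precision
}


def is_ohca(p_ohca: float, use_case: str = "balanced") -> bool:
    """Flag a note as OHCA when its predicted OHCA probability meets the threshold."""
    return p_ohca >= THRESHOLDS[use_case]
```

With this sketch, a note scored at 0.50 is flagged under the screening threshold but not under the balanced or research thresholds.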
## Limitations
- Trained on data from a single institution (MIMIC-III)
- May not generalize to all clinical documentation styles
- IHCA false positive rate: ~28.5% at optimal threshold
- Requires feature extraction code (not included in model weights)
- Best performance on notes with clear EMS or location context
## Model Versions
This is **Version 11**, the latest version, with the highest F1-score of the versions listed below.
| Version | Key Features | F1-Score |
|---------|--------------|----------|
| V9 | BERT only | 0.732 |
| V10 | + Location features | 0.814 |
| **V11** | **+ Temporal features** | **0.856** |
## Citation
```bibtex
@misc{moukaddem2025ohca,
author = {Moukaddem, Mona},
title = {OHCA Classifier V11: Temporal and Location-Aware Model for Out-of-Hospital Cardiac Arrest Identification},
year = {2025},
publisher = {Hugging Face},
howpublished = {\url{https://huggingface.co/monajm36/ohca-classifier-v11}}
}
```
## Contact
For questions, issues, or collaboration opportunities, please open an issue on the model repository.
## Model Card Authors
Mona Moukaddem
## Acknowledgments
- Training data: MIMIC-III Clinical Database
- Validation data: UChicago C19 dataset
- Base model: Microsoft BiomedNLP-PubMedBERT