	---
	language: en
	license: mit
	tags:
	- medical
	- clinical-notes
	- cardiac-arrest
	- ohca
	- biomedical-nlp
	- transformers
	- pubmedbert
	library_name: transformers
	pipeline_tag: text-classification
	---

	# OHCA Classifier V11: Temporal + Location-Aware Model

	## Model Description

	A transformer-based deep learning model for automatically identifying Out-of-Hospital Cardiac Arrest (OHCA) cases from clinical notes.

	Key Innovation: Combines semantic understanding (PubMedBERT) with explicit location and temporal features to distinguish OHCA from in-hospital cardiac arrest (IHCA).

	## Training Data

	- Dataset: MIMIC-III clinical notes
	- Size: 330 notes (47 OHCA, 283 Non-OHCA)
	- Split: 70% train / 15% validation / 15% test
	- Average note length: 13,042 characters

	## Performance (C19 Validation - 647 notes)

	| Metric | Score |
	|--------|-------|
	| Sensitivity | 92.1% |
	| Specificity | 89.4% |
	| Precision | 79.9% |
	| F1-Score | 0.856 |
	| AUC-ROC | 0.956 |
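As a quick consistency check on the table above, the F1-score can be recomputed from the reported precision and sensitivity (recall) as their harmonic mean. The helper name `f1_score` is just an illustrative choice, not part of the model code.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (sensitivity)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and sensitivity taken from the performance table
reported_f1 = f1_score(precision=0.799, recall=0.921)
print(f"{reported_f1:.3f}")  # 0.856
```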

	## Model Architecture

	Base Model: `microsoft/BiomedNLP-PubMedBERT-base-uncased-abstract`

	Input Features (775 dimensions):
	- BERT embeddings: 768
	- Location features: 2
	  - OHCA location indicator count (22 phrases)
	  - IHCA location indicator count (25 phrases)
	- Temporal features: 5
	  - Arrest timing score (when arrest occurred)
	  - First location outside hospital (binary)
	  - First location inside hospital (binary)
	  - Movement outside→inside count
	  - Movement inside→inside count

	Classifier: 3-layer MLP (775 → 512 → 256 → 2)
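A minimal sketch of what a classifier head with these dimensions might look like in PyTorch. Only the layer sizes (775 → 512 → 256 → 2) and the 768 + 2 + 5 feature layout come from this card; the class name `OHCAHead`, the ReLU activations, the dropout rate, and the order of concatenation are assumptions, and the released model class may differ. A random tensor stands in for the PubMedBERT embedding.

```python
import torch
import torch.nn as nn


class OHCAHead(nn.Module):
    """Hypothetical classifier head: 775 -> 512 -> 256 -> 2."""

    def __init__(self, bert_dim=768, n_location=2, n_temporal=5, dropout=0.1):
        super().__init__()
        in_dim = bert_dim + n_location + n_temporal  # 768 + 2 + 5 = 775
        # Activation and dropout choices are assumptions, not from the card
        self.mlp = nn.Sequential(
            nn.Linear(in_dim, 512), nn.ReLU(), nn.Dropout(dropout),
            nn.Linear(512, 256), nn.ReLU(), nn.Dropout(dropout),
            nn.Linear(256, 2),
        )

    def forward(self, bert_embedding, location_feats, temporal_feats):
        # Concatenate the text embedding with the handcrafted features
        x = torch.cat([bert_embedding, location_feats, temporal_feats], dim=-1)
        return self.mlp(x)


head = OHCAHead().eval()
with torch.no_grad():
    # Batch of 4 notes: stand-in embeddings plus zeroed handcrafted features
    logits = head(torch.randn(4, 768), torch.zeros(4, 2), torch.zeros(4, 5))
print(logits.shape)  # torch.Size([4, 2])
```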

	## Key Features

	### Location Features
	OHCA indicators: home, EMS, scene, field, bystander, ambulance, paramedics, etc.

	IHCA indicators: floor, ICU, ward, room, bed, code blue, admitted, telemetry, etc.
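The two location features could be computed roughly as follows. This is a sketch under stated assumptions: the phrase lists contain only the examples named above (the full 22- and 25-phrase lists are not reproduced in this card), and whole-word, case-insensitive matching is an assumed strategy. The actual `extract_location_features` implementation may work differently.

```python
import re

# Partial lists copied from the examples above; the full lists are not given here
OHCA_PHRASES = ["home", "ems", "scene", "field", "bystander", "ambulance", "paramedics"]
IHCA_PHRASES = ["floor", "icu", "ward", "room", "bed", "code blue", "admitted", "telemetry"]


def count_phrases(text, phrases):
    """Count whole-word, case-insensitive occurrences of any phrase."""
    pattern = r"\b(?:" + "|".join(re.escape(p) for p in phrases) + r")\b"
    return len(re.findall(pattern, text, flags=re.IGNORECASE))


def extract_location_features(note):
    """Return [OHCA indicator count, IHCA indicator count]."""
    return [count_phrases(note, OHCA_PHRASES), count_phrases(note, IHCA_PHRASES)]


note = (
    "Patient found unresponsive at home by family. "
    "EMS arrived, initiated CPR. ROSC achieved in field."
)
print(extract_location_features(note))  # [3, 0]
```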

	### Temporal Features
	These features capture the sequence of events described in the note:
	- When: Before arrival vs during hospitalization
	- Where it started: First location mentioned (inside/outside)
	- How patient moved: Direction of transitions (outside→inside vs inside→inside)
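One way the "where it started" and "how the patient moved" signals might be derived is to label each location phrase as outside or inside, order the mentions by position, and count transitions between consecutive mentions. Everything here is hypothetical: the function names, the partial phrase lists, and the transition logic are illustrative only. The arrest timing score is omitted because this card does not define how it is computed.

```python
import re

# Partial phrase lists reused from the location examples (assumption)
OUTSIDE = ["home", "ems", "scene", "field", "bystander", "ambulance", "paramedics"]
INSIDE = ["floor", "icu", "ward", "room", "bed", "code blue", "admitted", "telemetry"]


def location_sequence(note):
    """Return 'out'/'in' labels for each location phrase, in order of appearance."""
    labelled = [(p, "out") for p in OUTSIDE] + [(p, "in") for p in INSIDE]
    hits = []
    for phrase, label in labelled:
        pattern = r"\b" + re.escape(phrase) + r"\b"
        for match in re.finditer(pattern, note, flags=re.IGNORECASE):
            hits.append((match.start(), label))
    return [label for _, label in sorted(hits)]


def extract_transition_features(note):
    """Compute four of the five temporal features (timing score omitted)."""
    seq = location_sequence(note)
    pairs = list(zip(seq, seq[1:]))  # consecutive location mentions
    return {
        "first_outside": int(bool(seq) and seq[0] == "out"),
        "first_inside": int(bool(seq) and seq[0] == "in"),
        "outside_to_inside": pairs.count(("out", "in")),
        "inside_to_inside": pairs.count(("in", "in")),
    }


note = "Found down at home. EMS started CPR. Admitted to ICU, later moved to the floor."
print(extract_transition_features(note))
```

For this note the mentions run home, EMS, admitted, ICU, floor, so the sketch reports a first location outside the hospital, one outside→inside transition, and two inside→inside transitions.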

	## Usage
	```python
	# Note: Requires custom model class and feature extraction
	# See model files for implementation details

	from transformers import AutoTokenizer
	import torch

	# Load tokenizer
	tokenizer = AutoTokenizer.from_pretrained("monajm36/ohca-classifier-v11")

	# Example clinical note
	note = """
	Patient found unresponsive at home by family. 911 called.
	EMS arrived, initiated CPR. ROSC achieved in field.
	Transported to ED.
	"""

	# Extract features (requires custom code)
	# location_features = extract_location_features(note)
	# temporal_features = extract_temporal_features(note)

	# Tokenize
	inputs = tokenizer(note, return_tensors="pt", max_length=512, truncation=True)

	# Predict (requires loading custom model architecture)
	# ...
	```

	## Threshold Selection

	Choose threshold based on your clinical use case:

	| Use Case | Threshold | Sensitivity | Specificity | F1 |
	|----------|-----------|-------------|-------------|-----|
	| Screening (High Recall) | 0.14 | 92.1% | 89.4% | 0.856 |
	| Balanced | 0.74 | 82.3% | 93.2% | 0.831 |
	| Research (High Precision) | 0.85 | 75.4% | 95.0% | 0.810 |
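Applying one of these thresholds could look like the sketch below. The threshold values come from the table above; the helper names, the assumption that class index 1 is OHCA, and the use of `>=` for the comparison are illustrative choices that should be checked against the actual model code.

```python
import math

# Thresholds copied from the table above
THRESHOLDS = {"screening": 0.14, "balanced": 0.74, "research": 0.85}


def ohca_probability(logits):
    """Softmax probability of the OHCA class, assuming logits = [non-OHCA, OHCA]."""
    non_ohca, ohca = logits
    return 1.0 / (1.0 + math.exp(non_ohca - ohca))


def classify(probability, use_case="balanced"):
    """Return True when the OHCA probability meets the use-case threshold."""
    return probability >= THRESHOLDS[use_case]


prob = 0.60
print({name: classify(prob, name) for name in THRESHOLDS})
# {'screening': True, 'balanced': False, 'research': False}
```

The same note can therefore be flagged for screening but excluded from a high-precision research cohort, which is the trade-off the table describes.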

	## Limitations

	- Trained on notes from a single institution (MIMIC-III)
	- May not generalize to all clinical documentation styles
	- IHCA false positive rate: ~28.5% at optimal threshold
	- Requires feature extraction code (not included in model weights)
	- Best performance on notes with clear EMS or location context

	## Model Versions

	This is Version 11, the latest version and the one with the highest F1-score in the comparison below.

	| Version | Key Features | F1-Score |
	|---------|--------------|----------|
	| V9 | BERT only | 0.732 |
	| V10 | + Location features | 0.814 |
	| V11 | + Temporal features | 0.856 |

	## Citation
	```bibtex
	@misc{moukaddem2025ohca,
	author = {Moukaddem, Mona},
	title = {OHCA Classifier V11: Temporal and Location-Aware Model for Out-of-Hospital Cardiac Arrest Identification},
	year = {2025},
	publisher = {Hugging Face},
	howpublished = {\url{https://huggingface.co/monajm36/ohca-classifier-v11}}
	}
	```

	## Contact

	For questions, issues, or collaboration opportunities, please open an issue on the model repository.

	## Model Card Authors

	Mona Moukaddem

	## Acknowledgments

	- Training data: MIMIC-III Clinical Database
	- Validation data: UChicago C19 dataset
	- Base model: Microsoft BiomedNLP-PubMedBERT