---
library_name: keras
tags:
- healthcare
- icd-10
- mortality-prediction
- readmission-prediction
- deepset
- tabular
---
# NRD ICD-10 outcome models
DeepSet + (optional) Transformer models trained on the Nationwide Readmissions
Database (NRD, 2016–2020) to predict two outcomes from up to 40 ICD-10
diagnosis codes plus demographics (AGE, FEMALE, PAY1, ZIPINC_QRTL).
| Subfolder | Outcome | Description |
|----------------------|----------|--------------------------------------|
| `mortality_30day/` | `MOR30` | 30-day post-discharge mortality |
| `readmission_30day/` | `REA30` | 30-day all-cause readmission |
`encoders/` holds the fitted `LabelEncoder` (ICD-10 → integer IDs) and
`MinMaxScaler` (AGE) used at training time. Inputs must be encoded with the
**same** artifacts at inference, or predictions will be meaningless.
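To make the warning concrete, here is a minimal plain-Python sketch of why refitting breaks inference. It only mirrors the arithmetic of scikit-learn's `LabelEncoder` (index into the sorted unique labels) and `MinMaxScaler` (`(x - min) / (max - min)`); the helper names, the ICD-10 codes, and the age ranges are all hypothetical, and the real artifacts in `encoders/` should be loaded rather than re-created.

```python
from bisect import bisect_left


def fit_classes(labels):
    """Mimic LabelEncoder.fit: keep the sorted unique labels."""
    return sorted(set(labels))


def encode(classes, labels):
    """Mimic LabelEncoder.transform: each label becomes its index in `classes`."""
    ids = []
    for label in labels:
        i = bisect_left(classes, label)
        if i == len(classes) or classes[i] != label:
            raise ValueError(f"unseen label: {label!r}")
        ids.append(i)
    return ids


def scale(value, lo, hi):
    """Mimic MinMaxScaler.transform for one feature: (x - min) / (max - min)."""
    return (value - lo) / (hi - lo)


# Hypothetical vocabularies: one fitted at training time, one refit on other data.
train_classes = fit_classes(["I10", "E119", "N179", "J189", "I509"])
refit_classes = fit_classes(["I10", "I509", "Z951"])

patient_codes = ["I10", "I509"]
ids_train = encode(train_classes, patient_codes)
ids_refit = encode(refit_classes, patient_codes)
print(ids_train, ids_refit)  # [1, 2] [0, 1]

# The same age maps to different inputs under different fitted ranges.
age_train = scale(45, lo=0, hi=90)
age_refit = scale(45, lo=40, hi=60)
print(age_train, age_refit)  # 0.5 0.25
```

The same patient produces different integer IDs and a different scaled age under the refit artifacts, so the model's embedding lookups and numeric inputs would no longer match what it saw during training.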
## Loading a model
The `.keras` files contain three custom serializable components
(`DeepSet`, `TransformerBlock`, `F2Score`) that must be importable (and
registered via `@tf.keras.utils.register_keras_serializable(package="Custom")`)
before `load_model`:
```python
import tensorflow as tf
from huggingface_hub import hf_hub_download
# Register your custom classes (see src/train/ in the source repo)
from custom_layers import DeepSet, TransformerBlock, F2Score # noqa: F401
path = hf_hub_download(
repo_id="<user-or-org>/<repo-name>",
filename="mortality_30day/mort_hypertrial_auc.keras",
)
model = tf.keras.models.load_model(path)
```
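Of the three custom components, `F2Score` is the only metric. Its implementation lives in the source repo and is not shown here; the sketch below assumes the name refers to the standard F-beta score with β = 2, which weights recall four times as heavily as precision. The function name `fbeta_from_counts` and the counts are hypothetical.

```python
def fbeta_from_counts(tp, fp, fn, beta=2.0):
    """F-beta from confusion counts: (1+b^2)*TP / ((1+b^2)*TP + b^2*FN + FP)."""
    b2 = beta * beta
    denom = (1 + b2) * tp + b2 * fn + fp
    # With no positives predicted or present, define the score as 0.
    return (1 + b2) * tp / denom if denom else 0.0


# Hypothetical counts: 8 true positives, 4 false positives, 2 false negatives.
f2 = fbeta_from_counts(tp=8, fp=4, fn=2)
f1 = fbeta_from_counts(tp=8, fp=4, fn=2, beta=1.0)
print(round(f2, 4), round(f1, 4))  # 0.7692 0.7273
```

Here recall (0.8) exceeds precision (about 0.67), so the recall-weighted F2 comes out higher than F1.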
## Variants
Within each outcome subfolder, file suffixes denote the architecture:
- `_hypertrial_auc`: best model from the Keras-Tuner search (recommended)
- `_icd_only`: ICD codes only, no demographics (ablation)
- `_no_deepset`: flattened ICD input, no DeepSet aggregation (ablation)
- `_with_transformers` / `_transformer`: DeepSet + TransformerBlocks
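Since the suffix identifies the variant, a file can be picked by matching on it. The helper below is a small sketch: `pick_variant` is a hypothetical name, and apart from `mortality_30day/mort_hypertrial_auc.keras` the file names in the list are made up for illustration. In practice the list could come from `huggingface_hub.list_repo_files(repo_id)`.

```python
def pick_variant(files, subfolder, suffix):
    """Return the .keras files in `subfolder` whose stem ends with `suffix`."""
    prefix = subfolder.rstrip("/") + "/"
    return [
        name
        for name in files
        if name.startswith(prefix) and name.endswith(suffix + ".keras")
    ]


# Only the first entry appears in this card; the others are hypothetical.
files = [
    "mortality_30day/mort_hypertrial_auc.keras",
    "mortality_30day/mort_icd_only.keras",
    "mortality_30day/mort_no_deepset.keras",
    "encoders/README.txt",
]

chosen = pick_variant(files, "mortality_30day/", "_hypertrial_auc")
print(chosen)  # ['mortality_30day/mort_hypertrial_auc.keras']
```

The returned path can be passed as `filename` to `hf_hub_download`.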
## Data restrictions
The NRD is an HCUP product distributed under a Data Use Agreement. These weights
do not contain individual records, but downstream users should be aware of
the source.