vlbthambawita committed on
Commit e78f179 · verified · 1 Parent(s): a6371cf

Update README.md

Files changed (1)
  1. README.md +121 -3
README.md CHANGED
@@ -1,3 +1,121 @@
- ---
- license: cc-by-4.0
- ---
+ ---
+ license: cc-by-4.0
+ ---
+ ## mesomorphicECG
+
+ ### Model overview
+
+ The **mesomorphicECG** repository hosts a family of binary ECG classification models trained on 12‑lead ECG signals at two sampling rates (100 Hz and 500 Hz). Each model distinguishes ECG segments from normal controls (`norm`) from segments belonging to one target diagnostic category. The four tasks are:
+
+ - **norm_vs_cd**: Normal vs coronary artery disease (CD)
+ - **norm_vs_hyp**: Normal vs hypertensive heart disease (HYP)
+ - **norm_vs_mi**: Normal vs myocardial infarction (MI)
+ - **norm_vs_sttc**: Normal vs ST‑T abnormalities (STTC)
+
+ Two architectural variants are provided:
+
+ - **Categorical IMN**: A multi‑layer “IMN transition net” that predicts a binary label.
+ - **Single‑Linear IMN**: A simplified variant with a single linear decision head (useful for interpretability analyses and probing feature representations).
+
+ Models are provided for both **100 Hz** and **500 Hz** sampling rates, yielding **4 (tasks) × 2 (architectures) × 2 (sampling rates) = 16 model configurations**.
+
+ ### Available checkpoints and structure
+
+ Checkpoints are organized in the repository as:
+
+ - **Categorical IMN 100 Hz**: `categorical_imn_100hz/<task>/`
+ - **Categorical IMN 500 Hz**: `categorical_imn_500hz/<task>/`
+ - **Single‑Linear IMN 100 Hz**: `single_linear_imn_100hz/<task>/`
+ - **Single‑Linear IMN 500 Hz**: `single_linear_imn_500hz/<task>/`
+
+ Where `<task>` is one of:
+
+ - `norm_vs_cd`
+ - `norm_vs_hyp`
+ - `norm_vs_mi`
+ - `norm_vs_sttc`
+
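The directory layout above can be enumerated programmatically. The following is a minimal sketch that builds all 16 directory prefixes from the architecture, sampling-rate, and task names listed in this README; the constant and function names are illustrative and not part of the repository.

```python
from itertools import product

# Names taken from the directory layout described in this README.
ARCHITECTURES = ("categorical_imn", "single_linear_imn")
SAMPLING_RATES_HZ = (100, 500)
TASKS = ("norm_vs_cd", "norm_vs_hyp", "norm_vs_mi", "norm_vs_sttc")


def checkpoint_dirs():
    """Return every `<architecture>_<rate>hz/<task>/` directory prefix."""
    return [
        f"{arch}_{rate}hz/{task}/"
        for arch, rate, task in product(ARCHITECTURES, SAMPLING_RATES_HZ, TASKS)
    ]


dirs = checkpoint_dirs()
print(len(dirs))  # number of model configurations
for d in dirs:
    print(d)
```

The count follows directly from the product 2 (architectures) × 2 (sampling rates) × 4 (tasks).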
+ Each task directory typically contains:
+
+ - **`best-imn-epoch=E-val_auc=A.ckpt`**: Best validation checkpoint (by AUC), where `E` is the epoch number and `A` is the validation AUC.
+ - **`args.yaml`**: Training configuration and hyperparameters.
+ - **`metrics.csv`**: Summary metrics (e.g. accuracy, balanced accuracy, precision, recall, F1, MCC, AUROC) for the best model.
+
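Because the epoch and validation AUC are encoded in the checkpoint filename, they can be recovered without opening the file. This is a small sketch that assumes the `best-imn-epoch=E-val_auc=A.ckpt` pattern holds exactly; the helper name `parse_checkpoint_name` is hypothetical.

```python
import re

# Pattern for names such as "best-imn-epoch=18-val_auc=0.9555.ckpt".
CKPT_PATTERN = re.compile(
    r"best-imn-epoch=(?P<epoch>\d+)-val_auc=(?P<val_auc>\d+(?:\.\d+)?)\.ckpt$"
)


def parse_checkpoint_name(filename):
    """Extract (epoch, val_auc) from a checkpoint filename or repo path."""
    match = CKPT_PATTERN.search(filename)
    if match is None:
        raise ValueError(f"Unrecognized checkpoint name: {filename!r}")
    return int(match["epoch"]), float(match["val_auc"])


epoch, val_auc = parse_checkpoint_name(
    "categorical_imn_500hz/norm_vs_mi/best-imn-epoch=18-val_auc=0.9555.ckpt"
)
print(epoch, val_auc)
```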
+ ### Intended use
+
+ - **Primary use**: Research on ECG‑based risk stratification, disease detection, and model interpretability.
+ - **Tasks**:
+   - Binary classification of individual ECG windows or segments.
+   - Comparison of model behavior across sampling rates (100 Hz vs 500 Hz).
+   - Comparison of full categorical vs single‑linear decision heads.
+ - **Users**:
+   - ML and signal processing researchers working on cardiovascular AI.
+   - Clinician‑scientists exploring interpretable ECG models.
+   - Developers building proof‑of‑concept ECG classification systems.
+
+ These models are **not** intended for direct clinical decision making without further validation and regulatory clearance.
+
+ ### Out‑of‑scope uses
+
+ - **Do not use for**:
+   - Real‑time clinical diagnosis or triage without extensive external validation.
+   - Deployment in medical devices or hospital systems without regulatory approval.
+   - Populations, devices, or acquisition protocols that differ significantly from the training data, unless carefully re‑evaluated.
+
+ ### Data
+
+ - **Input**:
+   - Multi‑lead ECG time series (e.g. 12 leads).
+   - Sampling rate: **100 Hz** or **500 Hz**, depending on the model.
+   - Input windows are fixed‑length ECG segments (exact window length and stride are defined in `args.yaml` for each checkpoint).
+ - **Labels**:
+   - Binary labels: `0` for normal, `1` for the target diagnostic group (CD / HYP / MI / STTC), per task.
+
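To make the relationship between sampling rate and input size concrete, the sketch below computes the shape of one input window. Two things here are assumptions rather than facts from this repository: the 10-second window length (the real value is in each checkpoint's `args.yaml`) and the `(leads, samples)` array layout.

```python
# Label convention stated in this README: 0 = normal, 1 = target diagnostic group.
LABELS = {0: "norm", 1: "target"}

# ASSUMPTION: a 10-second window, used only for illustration.
# The actual window length is defined per checkpoint in args.yaml.
ASSUMED_WINDOW_SECONDS = 10.0


def window_shape(sampling_rate_hz, window_seconds, n_leads=12):
    """Return (n_leads, n_samples) for one fixed-length ECG window.

    ASSUMPTION: leads-first layout; the models may expect a different order.
    """
    n_samples = round(sampling_rate_hz * window_seconds)
    return (n_leads, n_samples)


for rate in (100, 500):
    print(rate, "Hz ->", window_shape(rate, ASSUMED_WINDOW_SECONDS))
```

For the same window duration, a 500 Hz model sees five times as many samples per lead as a 100 Hz model, so inputs must be resampled to the rate the chosen checkpoint was trained on.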
+ The data used for training and validation consists of de‑identified ECG records. For details on cohort selection, preprocessing, and labeling, please refer to the associated project documentation or publication (if available) or contact the authors.
+
+ ### Training and architecture
+
+ - **Base architecture**:
+   - IMN‑based “transition net” encoder for ECG time series.
+   - Convolutional / temporal feature extraction followed by fully‑connected layers.
+ - **Variants**:
+   - **Categorical IMN**:
+     - Standard deep classifier with non‑linear layers before the output.
+   - **Single‑Linear IMN**:
+     - Same encoder, but with a single linear output layer (no hidden layers after the encoder) to support interpretability and linear probing.
+ - **Optimization**:
+   - Binary classification objective (e.g. cross‑entropy).
+   - Model selection based on **validation AUROC**; the script `upload_best_checkpoints_to_hf.py` selects the checkpoint with the highest `val_auc` across runs (see `metrics.csv`).
+
+ Exact hyperparameters (learning rate, batch size, input window length, etc.) are stored per‑run in the accompanying `args.yaml` files.
+
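The selection rule described above (keep the checkpoint with the highest `val_auc` across runs) can be illustrated as follows. This is not the code of `upload_best_checkpoints_to_hf.py`, which is not shown in this README; it is a sketch that reads `val_auc` from filenames, and the run directory names in the example are hypothetical.

```python
import re

# Matches the "val_auc=<number>" fragment embedded in checkpoint filenames.
VAL_AUC = re.compile(r"val_auc=(\d+(?:\.\d+)?)")


def best_checkpoint(filenames):
    """Return the filename whose embedded val_auc is highest."""
    scored = []
    for name in filenames:
        match = VAL_AUC.search(name)
        if match:  # skip files that carry no val_auc, e.g. args.yaml
            scored.append((float(match.group(1)), name))
    if not scored:
        raise ValueError("No checkpoint with a val_auc in its name was found")
    return max(scored, key=lambda pair: pair[0])[1]


# Hypothetical run directories and scores, for illustration only.
candidates = [
    "run_a/best-imn-epoch=12-val_auc=0.9410.ckpt",
    "run_b/best-imn-epoch=18-val_auc=0.9555.ckpt",
    "run_b/args.yaml",
    "run_c/best-imn-epoch=07-val_auc=0.9302.ckpt",
]
print(best_checkpoint(candidates))
```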
+ ### Evaluation
+
+ - **Metrics**:
+   - `accuracy`
+   - `balanced_accuracy`
+   - `precision`
+   - `recall`
+   - `f1_score`
+   - `mcc`
+   - `auroc` (used for model selection)
+ - **Performance**:
+   - The best checkpoints typically achieve **high AUROC (≈0.90–0.97)** on validation data, with task‑dependent variation.
+   - Per‑task, per‑model metrics are available in the corresponding `metrics.csv` files.
+
+ These metrics reflect performance on the internal validation splits and **may not generalize** to other datasets, institutions, or devices.
+
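For readers re-evaluating a checkpoint on their own data, the listed metrics can be computed from binary labels and model scores using their standard definitions. The sketch below is a dependency-free illustration; the function names, the 0.5 decision threshold, and the toy data are assumptions, and the authors' evaluation code may differ in details such as thresholding.

```python
import math


def binary_metrics(y_true, y_pred):
    """Compute threshold-based metrics from 0/1 labels and 0/1 predictions."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0  # sensitivity
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0

    return {
        "accuracy": (tp + tn) / len(y_true),
        "balanced_accuracy": (recall + specificity) / 2,
        "precision": precision,
        "recall": recall,
        "f1_score": f1,
        "mcc": mcc,
    }


def auroc(y_true, scores):
    """AUROC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half. Quadratic in the number of samples, so this is
    suitable only for small illustrative inputs.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data, for illustration only.
y_true = [0, 0, 0, 1, 1, 1, 1, 0]
scores = [0.1, 0.4, 0.35, 0.8, 0.7, 0.45, 0.3, 0.6]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

metrics = binary_metrics(y_true, y_pred)
metrics["auroc"] = auroc(y_true, scores)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Note that AUROC is computed from the continuous scores, while the other metrics depend on the chosen decision threshold.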
+ ### How to use
+
+ #### 1. Download a checkpoint
+
+ from huggingface_hub import hf_hub_download
+
+ # Example: Categorical IMN 500 Hz, norm_vs_mi
+ repo_id = "SEARCH-IHI/mesomorphicECG"
+ ckpt_path = hf_hub_download(
+     repo_id=repo_id,
+     filename="categorical_imn_500hz/norm_vs_mi/best-imn-epoch=18-val_auc=0.9555.ckpt",  # adjust to match the actual filename in the repo
+ )
+ print(ckpt_path)  # local cache path of the downloaded checkpoint
+
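Since the checkpoint filename embeds the epoch and AUC, it differs per task and has to be looked up. One possible way to avoid hard-coding it is to list the repository files first and filter by task directory. The sketch below uses `list_repo_files` and `hf_hub_download` from `huggingface_hub`; the helper names `find_checkpoints` and `download_checkpoint` are hypothetical, and taking the first match assumes each task directory holds a single `.ckpt` file.

```python
def find_checkpoints(repo_files, task_dir):
    """Return the sorted .ckpt paths located under `task_dir`."""
    prefix = task_dir.rstrip("/") + "/"
    return sorted(
        path for path in repo_files
        if path.startswith(prefix) and path.endswith(".ckpt")
    )


def download_checkpoint(repo_id, task_dir):
    """Look up and download the checkpoint for one task directory.

    Requires network access and the `huggingface_hub` package.
    ASSUMPTION: one checkpoint per task directory, so the first match is used.
    """
    from huggingface_hub import hf_hub_download, list_repo_files

    candidates = find_checkpoints(list_repo_files(repo_id), task_dir)
    if not candidates:
        raise FileNotFoundError(f"No .ckpt file found under {task_dir!r}")
    return hf_hub_download(repo_id=repo_id, filename=candidates[0])


# Offline demonstration of the filtering step with a made-up file listing.
example_files = [
    "README.md",
    "categorical_imn_500hz/norm_vs_mi/args.yaml",
    "categorical_imn_500hz/norm_vs_mi/best-imn-epoch=18-val_auc=0.9555.ckpt",
    "categorical_imn_500hz/norm_vs_mi/metrics.csv",
]
print(find_checkpoints(example_files, "categorical_imn_500hz/norm_vs_mi"))

# Online usage (uncomment to run):
# ckpt_path = download_checkpoint(
#     "SEARCH-IHI/mesomorphicECG", "categorical_imn_500hz/norm_vs_mi"
# )
```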