| # AF label source: decision |
| *PhysioJEPA, Oz Labs, 2026-04-14* |
| *Referenced from `EXPERIMENT_TRACKING.md` E2 "AF label source: decide before running E2"* |
|
|
| --- |
|
|
| ## Decision |
|
|
| **AF_LABEL_SOURCE = PTB-XL** (Option 2 in the experiment matrix). |
|
|
| **Framing in the paper**: AF detection is a *transfer* evaluation from ICU PPG+ECG pretraining to outpatient 12-lead ECG. This is the honest claim given the label choice and actually makes a cleaner sample-efficiency story. |
|
|
| ## Reasoning |
|
|
| 1. **MIMIC-IV-ECG-based labels fail the sample-size gate.** ~381 patients × ~10–15% AF prevalence ≈ 38–57 AF-positive patients. The experiment matrix requires ≥100 AF-positive + ≥100 AF-negative for the linear probe to be meaningful. MIMIC-IV-ECG cannot clear that bar on this cohort. |
| 2. **PhysioNet credentialing is not set up.** Even if we wanted the in-distribution labels, getting MIMIC-IV-ECG requires credentialed access (CITI + DUA) that is not currently provisioned. Any "unauthenticated" HF mirror of MIMIC-IV-ECG would be a DUA violation. |
| 3. **PTB-XL has the numbers.** ~1.5k `AFIB` records out of 21.8k make a 100/100 split easy. It is open access, already on HuggingFace, and used by Weimann & Conrad, which enables *direct numeric comparison* to Baseline A's published 0.945 AUROC. |
| 4. **Sample efficiency is the strong story anyway.** The paper's E5b claim is "JEPA transfers better from few labels than InfoNCE". Transferring from MIMIC ICU pretraining to PTB-XL outpatient 10-s strips is a stronger sample-efficiency claim than the in-distribution one, and it maps directly to the Weimann comparison. |
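The sample-size gate in item 1 is plain arithmetic, sketched here in Python. The cohort size, prevalence range, and 100-per-class threshold are the approximate figures quoted above; the function name `af_positive_range` is illustrative only.

```python
import math

N_PATIENTS = 381                 # approximate cohort size quoted above
PREVALENCE_RANGE = (0.10, 0.15)  # approximate AF prevalence range quoted above
MIN_PER_CLASS = 100              # gate from the experiment matrix


def af_positive_range(n_patients, prevalence_range):
    """Return (low, high) expected AF-positive patient counts, rounded down."""
    low, high = prevalence_range
    return math.floor(n_patients * low), math.floor(n_patients * high)


low, high = af_positive_range(N_PATIENTS, PREVALENCE_RANGE)
# Even the optimistic end of the range must reach the gate to pass.
passes_gate = high >= MIN_PER_CLASS
print(f"expected AF-positive patients: {low}-{high}; passes gate: {passes_gate}")
```

Even at the upper prevalence estimate, the expected positive count stays well under the 100-patient threshold.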
|
|
| ## Consequences |
|
|
| - **E2/E3/E5 pipeline**: AF probe runs on PTB-XL. All three baselines (A, B, C) and PhysioJEPA share the same PTB-XL eval split. We replicate Weimann's split for Baseline A. |
| - **Added data prep step**: load PTB-XL (single-lead II @ 500 Hz → resample to 250 Hz to match pretraining), extract `AFIB` vs. others as the binary label. ~1 day of work for Zack. |
| - **Lead II compatibility**: PTB-XL has all 12 leads at 500 Hz. We pick lead II and resample to 250 Hz; the single-lead input shape is identical to pretraining. |
| - **HR regression probe (E5c)**: also run on PTB-XL (RR-interval labels derivable from raw ECG). Keeps all probes on one eval dataset. |
| - **PTT regression probe (E5a)**: uses MIMIC-BP (UCI, Kachuee et al.) as originally specified, because PTB-XL has no BP/PTT labels. Note the population overlap: MIMIC-BP is MIMIC-III derived, our pretraining is MIMIC-IV derived. |
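The lead-II selection and 500 Hz to 250 Hz resampling step could look roughly like the sketch below. It assumes each record is already loaded as a `(samples, 12)` NumPy array and that the channels follow the order in `PTBXL_LEADS`. That constant and the function name `lead_ii_at_250hz` are hypothetical, and the channel order should be checked against the record headers. Reading the WFDB files is omitted, and a synthetic sine wave stands in for a real record.

```python
from math import gcd

import numpy as np
from scipy.signal import resample_poly

# Assumed PTB-XL channel order (hypothetical constant; verify against the
# signal names in each record header before relying on it).
PTBXL_LEADS = ["I", "II", "III", "AVR", "AVL", "AVF",
               "V1", "V2", "V3", "V4", "V5", "V6"]


def lead_ii_at_250hz(record, fs_in=500, fs_out=250, lead="II"):
    """Select one lead from a (samples, 12) array and resample it.

    resample_poly applies an anti-aliasing FIR filter before decimating,
    so this is not the same as keeping every other sample.
    """
    signal = record[:, PTBXL_LEADS.index(lead)]
    g = gcd(fs_out, fs_in)
    return resample_poly(signal, up=fs_out // g, down=fs_in // g)


# Synthetic stand-in for one 10-s, 12-lead, 500 Hz record.
t = np.arange(5000) / 500.0
fake_record = np.tile(np.sin(2 * np.pi * 1.2 * t)[:, None], (1, 12))
x = lead_ii_at_250hz(fake_record)
print(x.shape)  # (2500,)
```

Polyphase resampling is one reasonable choice here. Whatever method pretraining used for its 250 Hz signals should take precedence, so that probe inputs match the pretraining distribution.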
|
|
| ## Log |
|
|
| ``` |
| AF_LABEL_SOURCE = PTB-XL |
| DECISION_DATE = 2026-04-14 |
| DECISION_BY = Claude (autonomous per project lead instruction) |
| N_AF_POSITIVE = ~1,514 PTB-XL records with AFIB scp_code (to be verified on download) |
| N_AF_NEGATIVE = ~20,300 PTB-XL records without AFIB/AFLT (abundant) |
| ``` |
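The counts in the log still need verifying on download. A minimal sketch of the label rule follows, assuming each `scp_codes` cell in the PTB-XL metadata CSV is the string form of a dict. The function name `af_label` and the toy rows are illustrative. Treating records that carry both `AFIB` and `AFLT` as positive is an assumption, not something the log specifies.

```python
import ast
from collections import Counter


def af_label(scp_codes):
    """Map one `scp_codes` cell to 1 (AF), 0 (negative) or None (excluded).

    Assumption: the cell looks like "{'AFIB': 0.0, 'SR': 0.0}".
    Records with AFLT but no AFIB are excluded, matching the
    "without AFIB/AFLT" definition of negatives in the log.
    """
    codes = ast.literal_eval(scp_codes) if isinstance(scp_codes, str) else scp_codes
    if "AFIB" in codes:
        return 1
    if "AFLT" in codes:
        return None
    return 0


# Toy rows standing in for the `scp_codes` column of the metadata CSV.
rows = [
    "{'NORM': 100.0, 'SR': 0.0}",
    "{'AFIB': 0.0}",
    "{'AFLT': 0.0, 'IMI': 50.0}",
    "{'AFIB': 0.0, 'AFLT': 0.0}",
]
counts = Counter(af_label(r) for r in rows)
print(f"N_AF_POSITIVE={counts[1]} N_AF_NEGATIVE={counts[0]} excluded={counts[None]}")
```

Running the same function over the full metadata column would give the verified `N_AF_POSITIVE` and `N_AF_NEGATIVE` values to write back into the log.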
|
|
| ## Fallback chain (unchanged) |
|
|
| - If PTB-XL AFIB count drops below 100 after quality filtering → **PhysioNet AFDB** (25 patients, AUROC only, no sample-efficiency curves). |
| - If PTB-XL download is blocked for any reason → hosted copy on HuggingFace (e.g. `PULSE-ECG/PTB-XL`) is open-access, no credentialing required. |
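The fallback chain can be written as a small selection function, sketched below. The function name `choose_af_eval`, its parameters, and the returned dictionary keys are hypothetical. Only the 100-record threshold and the two fallback targets come from the list above.

```python
MIN_AF_POSITIVE = 100  # threshold from the fallback chain above


def choose_af_eval(n_afib_after_qc, primary_download_ok=True):
    """Pick the AF evaluation dataset following the fallback chain."""
    if n_afib_after_qc < MIN_AF_POSITIVE:
        # Too few positives after quality filtering: AFDB, AUROC only.
        return {"dataset": "PhysioNet AFDB",
                "sample_efficiency_curves": False}
    # Enough positives: stay on PTB-XL, via the mirror if the primary
    # download is blocked.
    source = "PTB-XL" if primary_download_ok else "PTB-XL (HuggingFace mirror)"
    return {"dataset": source, "sample_efficiency_curves": True}


print(choose_af_eval(1514))
print(choose_af_eval(1514, primary_download_ok=False))
print(choose_af_eval(80))
```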
|
|