AF label source — decision

PhysioJEPA — Oz Labs — 2026-04-14 Referenced from EXPERIMENT_TRACKING.md E2 "AF label source — decide before running E2"

Decision

AF_LABEL_SOURCE = PTB-XL (Option 2 in the experiment matrix).

Framing in the paper: AF detection is a transfer evaluation from ICU PPG+ECG pretraining to outpatient ECG (lead II taken from PTB-XL's 12-lead recordings). This is the honest claim given the label choice, and it makes for a cleaner sample-efficiency story than an in-distribution evaluation would.

Reasoning

MIMIC-IV-ECG-based labels fail the sample-size gate. ~381 patients × ~10–15% AF prevalence ≈ 38–57 AF-positive patients. The experiment matrix requires ≥100 AF-positive + ≥100 AF-negative for the linear probe to be meaningful. MIMIC-IV-ECG cannot clear that bar on this cohort.
PhysioNet credentialing is not set up. Even if we wanted the in-distribution labels, getting MIMIC-IV-ECG requires credentialed access (CITI + DUA) that is not currently provisioned. Any "unauthenticated" HF mirror of MIMIC-IV-ECG would be a DUA violation.
PTB-XL has the numbers. ~1.5k AFIB records out of 21.8k → easy 100/100 split. Open access, already on HuggingFace, used by Weimann & Conrad — enables direct numeric comparison to Baseline A's published 0.945 AUROC.
Sample efficiency is the strong story anyway. The paper's E5b claim is "JEPA transfers better from few labels than InfoNCE" — transferring from MIMIC ICU pretraining to PTB-XL outpatient 10-s strips is a stronger sample-efficiency claim than in-distribution, and it maps directly to the Weimann comparison.
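
The sample-size arithmetic above can be checked with a short sketch. The counts are the approximate figures quoted in this note, not verified values; the function names and the MIN_PER_CLASS constant are illustrative, not part of the project code.

```python
MIN_PER_CLASS = 100  # gate from the experiment matrix: >=100 per class


def expected_af_positive(n_patients: int, prevalence: float) -> float:
    """Expected number of AF-positive patients at a given prevalence."""
    return n_patients * prevalence


def clears_gate(n_positive: float, n_negative: float,
                minimum: int = MIN_PER_CLASS) -> bool:
    """True only if both classes reach the minimum count."""
    return n_positive >= minimum and n_negative >= minimum


# MIMIC-IV-ECG cohort: ~381 patients at 10-15% AF prevalence.
mimic_low = expected_af_positive(381, 0.10)
mimic_high = expected_af_positive(381, 0.15)
# Test the optimistic end of the range; if it fails, the whole range fails.
mimic_ok = clears_gate(mimic_high, 381 - mimic_high)

# PTB-XL: approximate record counts, to be verified on download.
ptbxl_ok = clears_gate(1514, 20300)

print(f"MIMIC-IV-ECG: {mimic_low:.0f}-{mimic_high:.0f} AF-positive, "
      f"clears gate: {mimic_ok}")
print(f"PTB-XL: clears gate: {ptbxl_ok}")
```

Even at the upper prevalence estimate, the MIMIC-IV-ECG cohort falls well short of 100 positives, while PTB-XL clears the gate with a wide margin for quality filtering.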

Consequences

E2/E3/E5 pipeline: AF probe runs on PTB-XL. All three baselines (A, B, C) and PhysioJEPA share the same PTB-XL eval split. We replicate Weimann's split for Baseline A.
Added data prep step: load PTB-XL (single-lead II @ 500 Hz → resample to 250 Hz to match pretraining), extract AFIB vs others as binary label. ~1 day of work for Zack.
Lead II compatibility: PTB-XL has all 12 leads at 500 Hz. We pick lead II and resample to 250 Hz; the single-lead input shape is identical to pretraining.
HR regression probe (E5c): also run on PTB-XL (RR-interval labels derivable from raw ECG). Keeps all probes on one eval dataset.
PTT regression probe (E5a): uses MIMIC-BP (UCI, Kachuee et al.) as originally specified, because PTB-XL has no BP/PTT labels. Note the possible population overlap: MIMIC-BP is derived from MIMIC-III and our pretraining data from MIMIC-IV.
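
A minimal sketch of the PTB-XL prep and label derivation described above, assuming NumPy and SciPy are available. All function names are hypothetical. The lead order in PTBXL_LEADS is an assumption to confirm against the signal names in the record headers. Whether AFLT records count as negatives or are dropped is not settled in this note ("AFIB vs others" here, "without AFIB/AFLT" in the log), so the sketch exposes it as a parameter and defaults to dropping them. R-peak detection for the HR labels is out of scope; the HR helper takes peak indices as given.

```python
import ast

import numpy as np
from scipy.signal import resample_poly

# Assumed PTB-XL lead order; confirm against the record header on download.
PTBXL_LEADS = ["I", "II", "III", "AVR", "AVL", "AVF",
               "V1", "V2", "V3", "V4", "V5", "V6"]


def lead_ii_250hz(signals_500hz, lead_names=PTBXL_LEADS):
    """Extract lead II from a (n_samples, n_leads) 500 Hz array and
    resample it to 250 Hz to match the pretraining input."""
    x = np.asarray(signals_500hz, dtype=np.float64)
    lead_ii = x[:, lead_names.index("II")]
    # Polyphase resampling by 1/2; resample_poly low-pass filters first,
    # which avoids the aliasing that plain lead_ii[::2] would introduce.
    return resample_poly(lead_ii, up=1, down=2)


def af_label(scp_codes, exclude=("AFLT",)):
    """Return 1 for AFIB, 0 for other records, None for excluded records.

    scp_codes may be a dict or its string form, e.g. "{'AFIB': 100.0}".
    """
    codes = ast.literal_eval(scp_codes) if isinstance(scp_codes, str) else scp_codes
    if "AFIB" in codes:
        return 1
    if any(code in codes for code in exclude):
        return None  # neither positive nor a clean negative
    return 0


def mean_hr_bpm(r_peak_samples, fs=250.0):
    """Mean heart rate in bpm from R-peak sample indices (E5c label)."""
    rr_seconds = np.diff(np.asarray(r_peak_samples, dtype=np.float64)) / fs
    return 60.0 / rr_seconds.mean()
```

A 10-s PTB-XL strip of 5000 samples at 500 Hz comes out as 2500 samples at 250 Hz. Passing `exclude=()` to `af_label` gives the plain "AFIB vs others" labeling instead.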

Log

AF_LABEL_SOURCE = PTB-XL
DECISION_DATE   = 2026-04-14
DECISION_BY     = Claude (autonomous per project lead instruction)
N_AF_POSITIVE   = ~1,514 PTB-XL records with AFIB scp_code  (to be verified on download)
N_AF_NEGATIVE   = ~20,300 PTB-XL records without AFIB/AFLT  (abundant)

Fallback chain (unchanged)

If PTB-XL AFIB count drops below 100 after quality filtering → PhysioNet AFDB (25 patients, AUROC only, no sample-efficiency curves).
If PTB-XL download is blocked for any reason → use the hosted copy on HuggingFace (e.g. PULSE-ECG/PTB-XL), which is open-access and requires no credentialing.
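
The fallback chain can be written down as a small decision function. This is a sketch of the two rules above, not project code: AfEvalPlan, plan_af_eval and the origin strings are hypothetical names, and the threshold of 100 is the gate from the experiment matrix.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AfEvalPlan:
    dataset: str                     # "PTB-XL" or "AFDB"
    origin: str                      # where the data is fetched from
    sample_efficiency_curves: bool   # False means AUROC only


def plan_af_eval(primary_download_ok: bool,
                 n_afib_after_qc: int,
                 min_positive: int = 100) -> AfEvalPlan:
    """Pick the AF evaluation dataset following the fallback chain."""
    # Rule 1: too few AFIB records after quality filtering -> AFDB.
    # With 25 patients, AFDB supports AUROC only.
    if n_afib_after_qc < min_positive:
        return AfEvalPlan("AFDB", "physionet", False)
    # Rule 2: a blocked download changes the source, not the dataset.
    origin = "primary" if primary_download_ok else "huggingface-mirror"
    return AfEvalPlan("PTB-XL", origin, True)


# Expected case: download works and ~1,514 AFIB records survive filtering.
print(plan_af_eval(primary_download_ok=True, n_afib_after_qc=1514))
```

Keeping the count check independent of the download path means the same 100-record gate applies whether PTB-XL comes from the primary source or the hosted copy.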