
AF label source: decision

PhysioJEPA, Oz Labs, 2026-04-14. Referenced from EXPERIMENT_TRACKING.md E2, "AF label source: decide before running E2".


Decision

AF_LABEL_SOURCE = PTB-XL (Option 2 in the experiment matrix).

Framing in the paper: AF detection is a transfer evaluation from ICU PPG+ECG pretraining to outpatient 12-lead ECG. This is the honest claim given the label choice, and it makes for a cleaner sample-efficiency story.

Reasoning

  1. MIMIC-IV-ECG-based labels fail the sample-size gate. ~381 patients × ~10–15% AF prevalence ≈ 38–57 AF-positive patients. The experiment matrix requires ≥100 AF-positive + ≥100 AF-negative for the linear probe to be meaningful. MIMIC-IV-ECG cannot clear that bar on this cohort.
  2. PhysioNet credentialing is not set up. Even if we wanted the in-distribution labels, getting MIMIC-IV-ECG requires credentialed access (CITI + DUA) that is not currently provisioned. Any "unauthenticated" HF mirror of MIMIC-IV-ECG would be a DUA violation.
  3. PTB-XL has the numbers. ~1.5k AFIB records out of 21.8k → easy 100/100 split. It is open access, already on HuggingFace, and used by Weimann & Conrad, which enables direct numeric comparison to Baseline A's published 0.945 AUROC.
  4. Sample efficiency is the strong story anyway. The paper's E5b claim is "JEPA transfers better from few labels than InfoNCE". Transferring from MIMIC ICU pretraining to PTB-XL outpatient 10-s strips is a stronger sample-efficiency claim than an in-distribution evaluation, and it maps directly to the Weimann comparison.
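
The sample-size arithmetic in points 1 and 3 can be checked with a few lines of Python. This is a minimal sketch: the function names are hypothetical, the ≥100-per-class threshold comes from the experiment matrix, and the PTB-XL counts are the approximate figures from the decision log, still to be verified on download.

```python
# Sketch of the sample-size gate. Function names are illustrative, not project code.

MIN_PER_CLASS = 100  # experiment matrix: >=100 AF-positive and >=100 AF-negative


def expected_positives(n_patients: int, prevalence: float) -> int:
    """Expected number of AF-positive patients at a given prevalence."""
    return round(n_patients * prevalence)


def clears_gate(n_positive: int, n_negative: int, minimum: int = MIN_PER_CLASS) -> bool:
    """True only if both classes reach the minimum count."""
    return n_positive >= minimum and n_negative >= minimum


# MIMIC-IV-ECG cohort: ~381 patients at ~10-15% AF prevalence.
mimic_low = expected_positives(381, 0.10)
mimic_high = expected_positives(381, 0.15)
mimic_ok = clears_gate(mimic_high, 381 - mimic_high)  # best case for MIMIC

# PTB-XL: approximate counts from the decision log (unverified until download).
ptbxl_ok = clears_gate(1514, 20300)

print(mimic_low, mimic_high, mimic_ok, ptbxl_ok)  # 38 57 False True
```

Even at the optimistic end of the prevalence range, the MIMIC-IV-ECG cohort stays well under the gate, while PTB-XL clears it with a wide margin.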

Consequences

  • E2/E3/E5 pipeline: AF probe runs on PTB-XL. All three baselines (A, B, C) and PhysioJEPA share the same PTB-XL eval split. We replicate Weimann's split for Baseline A.
  • Added data prep step: load PTB-XL (lead II only, 500 Hz → resampled to 250 Hz to match pretraining), extract AFIB vs. others as a binary label. ~1 day of work for Zack.
  • Lead II compatibility: PTB-XL has all 12 leads at 500 Hz. We pick lead II and resample to 250 Hz; the single-lead input shape is identical to pretraining.
  • HR regression probe (E5c): also run on PTB-XL (RR-interval labels derivable from raw ECG). Keeps all probes on one eval dataset.
  • PTT regression probe (E5a): uses MIMIC-BP (UCI, Kachuee et al.) as originally specified, since PTB-XL has no BP/PTT labels. Note the population overlap: MIMIC-BP is MIMIC-III derived, while our pretraining data is MIMIC-IV derived.
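
The binary-label step of the data prep could look roughly like the sketch below. It assumes the labels are read from the `scp_codes` column of PTB-XL's metadata CSV, where each entry is a stringified dict of SCP statement codes. The `af_label` helper is hypothetical, and excluding flutter-only (AFLT) records from the negatives is an assumption taken from the `N_AF_NEGATIVE` line of the decision log. The example rows are made up for illustration and are not real PTB-XL records.

```python
import ast
from typing import Optional


def af_label(scp_codes_field: str) -> Optional[int]:
    """Map one scp_codes entry to a binary AF label.

    Returns 1 if AFIB is present, 0 if neither AFIB nor AFLT is present,
    and None for flutter-only records (assumption: dropped from the negatives).
    """
    codes = ast.literal_eval(scp_codes_field)  # safely parses "{'AFIB': 0.0, ...}"
    # Test key presence rather than the likelihood value: rhythm statements
    # may carry a likelihood of 0.0 even when the statement applies.
    if "AFIB" in codes:
        return 1
    if "AFLT" in codes:
        return None
    return 0


# Illustrative rows only (not real PTB-XL records).
rows = [
    "{'NORM': 100.0, 'SR': 0.0}",
    "{'AFIB': 0.0, 'LVH': 50.0}",
    "{'AFLT': 0.0}",
]
labels = [af_label(r) for r in rows]
kept = [y for y in labels if y is not None]
print(labels, kept)  # [0, 1, None] [0, 1]
```

`ast.literal_eval` is used instead of `eval` so that only Python literals are parsed from the CSV field.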

Log

AF_LABEL_SOURCE = PTB-XL
DECISION_DATE   = 2026-04-14
DECISION_BY     = Claude (autonomous per project lead instruction)
N_AF_POSITIVE   = ~1,514 PTB-XL records with AFIB scp_code  (to be verified on download)
N_AF_NEGATIVE   = ~20,300 PTB-XL records without AFIB/AFLT  (abundant)

Fallback chain (unchanged)

  • If the PTB-XL AFIB count drops below 100 after quality filtering → fall back to PhysioNet AFDB (25 patients, AUROC only, no sample-efficiency curves).
  • If the PTB-XL download is blocked for any reason → use a hosted copy on HuggingFace (e.g. PULSE-ECG/PTB-XL), which is open-access and requires no credentialing.
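
The fallback chain can be written down as a small selection function, which makes the order of the checks explicit. This is a sketch under assumptions: `AFSource` and `choose_af_source` are hypothetical names, and "primary" stands for whichever PTB-XL download location the pipeline tries first.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AFSource:
    dataset: str                    # "PTB-XL" or "AFDB"
    origin: str                     # where the data is fetched from
    sample_efficiency_curves: bool  # False means AUROC only


def choose_af_source(primary_download_ok: bool,
                     afib_count_after_filtering: int,
                     min_positive: int = 100) -> AFSource:
    """Apply the fallback chain: count gate first, then download location."""
    if afib_count_after_filtering < min_positive:
        # Too few AF-positive records survive filtering: AFDB, AUROC only.
        return AFSource("AFDB", "PhysioNet", False)
    origin = "primary" if primary_download_ok else "HuggingFace mirror"
    return AFSource("PTB-XL", origin, True)


print(choose_af_source(True, 1514))
print(choose_af_source(False, 1514))
print(choose_af_source(True, 80))
```

A blocked download only changes where PTB-XL is fetched from, whereas a failed count gate changes the dataset and drops the sample-efficiency curves.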