Model Overview

Physio Transformer HRV is a self‑supervised transformer encoder trained on long‑duration ECG‑derived heart rate (HR) and heart rate variability (HRV) signals from the MIT‑BIH Atrial Fibrillation Database (AFDB). The model learns general‑purpose physiological representations using a masked heart‑rate reconstruction objective, similar to BERT‑style masked modeling for time series.

This encoder is designed as a foundation model for wearable‑style biometrics, capturing long‑range temporal patterns in HR/HRV dynamics such as circadian rhythms, autonomic balance, recovery, and arrhythmia‑like irregularity.

Intended Use

This model is intended for feature extraction and downstream fine‑tuning on physiological or wearable‑related tasks, including:

  • Stress detection
  • Sleep staging
  • Activity recognition
  • HR forecasting
  • Health anomaly detection
  • Arrhythmia or irregularity detection
  • Personalized biometrics modeling

The encoder outputs a sequence of 64‑dimensional embeddings for each timestep.
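For downstream tasks, the per‑timestep embeddings can be pooled into one feature vector per window. A minimal NumPy sketch (mean pooling is an illustrative choice, not prescribed by the model card):

```python
import numpy as np

def pool_embeddings(embeddings):
    """Mean-pool per-timestep encoder embeddings into one vector per window."""
    # embeddings: (batch, time, d_model) as produced by the encoder
    return embeddings.mean(axis=1)

emb = np.zeros((8, 128, 64))      # e.g. 8 windows of 128 timesteps, d_model = 64
features = pool_embeddings(emb)   # -> (8, 64), ready for a downstream head
```

Other pooling strategies (max, last timestep, attention pooling) are equally valid depending on the task.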

Training Objective

The model is trained using masked HR reconstruction:

  • 15% of HR values are randomly masked
  • HRV, activity, and the unmasked HR values are provided as input
  • The model predicts the masked HR values from the surrounding context
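The masking step can be sketched in NumPy as follows (the mask value of 0.0 and the seeded RNG are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)

def mask_hr(hr, mask_ratio=0.15, mask_value=0.0):
    """Randomly hide a fraction of HR timesteps for reconstruction training."""
    hr = np.asarray(hr, dtype=float)
    mask = rng.random(hr.shape) < mask_ratio  # True where HR is hidden
    masked = hr.copy()
    masked[mask] = mask_value                 # model must reconstruct these
    return masked, mask

hr = rng.normal(70.0, 5.0, size=128)          # one window of HR values (BPM)
masked_hr, mask = mask_hr(hr)
```

During training, the reconstruction loss is computed only at the masked positions.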

This forces the encoder to learn:

  • HR + HRV relationships
  • Temporal dynamics
  • Autonomic patterns
  • Long‑range dependencies
  • Physiological variability

This is analogous to BERT pretraining, but for biometrics.

Architecture

Transformer encoder (Pre‑LayerNorm)

  • 3 layers, 4 heads, 64‑dimensional model
  • Learned positional embeddings

Input channels:

  • HR (1)
  • HRV (1)
  • Activity (3 placeholder channels)

Output: (batch, time, d_model) embeddings

The model uses dropout, residual connections, and a per‑head attention dimension of key_dim = d_model // num_heads (16 dimensions per head), so the concatenated heads match d_model.
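A minimal NumPy sketch of one pre‑LayerNorm encoder block with this head layout (the ReLU feed‑forward, weight shapes, and initialization are illustrative assumptions; the actual model also applies dropout and learned positional embeddings):

```python
import numpy as np

d_model, num_heads = 64, 4
key_dim = d_model // num_heads  # 16 dims per head; heads concatenate to d_model
rng = np.random.default_rng(0)

def layer_norm(x, eps=1e-6):
    mu, sd = x.mean(-1, keepdims=True), x.std(-1, keepdims=True)
    return (x - mu) / (sd + eps)

def softmax(x):
    e = np.exp(x - x.max(-1, keepdims=True))
    return e / e.sum(-1, keepdims=True)

def multi_head_self_attention(x, Wq, Wk, Wv, Wo):
    B, T, _ = x.shape
    def split(h):  # (B, T, d_model) -> (B, heads, T, key_dim)
        return h.reshape(B, T, num_heads, key_dim).transpose(0, 2, 1, 3)
    q, k, v = split(x @ Wq), split(x @ Wk), split(x @ Wv)
    att = softmax(q @ k.transpose(0, 1, 3, 2) / np.sqrt(key_dim))
    out = (att @ v).transpose(0, 2, 1, 3).reshape(B, T, d_model)
    return out @ Wo

def pre_ln_block(x, weights):
    Wq, Wk, Wv, Wo, W1, W2 = weights
    x = x + multi_head_self_attention(layer_norm(x), Wq, Wk, Wv, Wo)  # residual 1
    h = np.maximum(layer_norm(x) @ W1, 0.0)                           # FFN (ReLU)
    return x + h @ W2                                                 # residual 2

shapes = [(d_model, d_model)] * 4 + [(d_model, 4 * d_model), (4 * d_model, d_model)]
weights = [rng.normal(0.0, 0.02, s) for s in shapes]
x = rng.normal(size=(2, 128, d_model))  # (batch, time, d_model)
y = pre_ln_block(x, weights)            # same shape: (2, 128, 64)
```

Stacking three such blocks over the projected HR/HRV/activity input reproduces the overall layout described above.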

Data

Training data is derived from the MIT‑BIH Atrial Fibrillation Database (AFDB):

  • 25 long‑term (~10‑hour) ECG recordings (23 with signal data)
  • R‑peaks extracted using a fast Pan‑Tompkins‑style detector
  • HR + HRV computed in sliding windows
  • Time‑series sliced into fixed‑length windows (T=128, step=32)
  • NaNs imputed per‑window

This produces thousands of training samples from a small number of long recordings.
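The windowing and imputation steps can be sketched as follows (window‑mean imputation is an assumed strategy; the card does not specify the method):

```python
import numpy as np

def make_windows(series, T=128, step=32):
    """Slice a long HR/HRV series into overlapping fixed-length windows."""
    series = np.asarray(series, dtype=float)
    starts = range(0, len(series) - T + 1, step)
    return np.stack([series[i:i + T] for i in starts])

def impute_nans(window):
    """Replace NaNs in a single window with the window mean."""
    w = window.copy()
    missing = np.isnan(w)
    if missing.any():
        w[missing] = np.nanmean(w)
    return w

hr = np.random.default_rng(0).normal(70.0, 5.0, size=1000)
hr[10] = np.nan                                    # simulate a dropped beat
windows = make_windows(hr)                         # shape (28, 128)
clean = np.stack([impute_nans(w) for w in windows])
```

With T=128 and step=32, a 1000‑sample series yields 28 overlapping windows, which is how a few long recordings expand into thousands of training samples.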

Performance

The model is evaluated using masked MAE on held‑out validation windows.

Typical results:

  • Masked MAE: ~5 BPM
  • Low median absolute error
  • Stable reconstruction across a wide HR range

These metrics indicate that the encoder successfully learns physiological structure.
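Masked MAE scores only the timesteps that were hidden from the model. A sketch (the example values are illustrative):

```python
import numpy as np

def masked_mae(true_hr, pred_hr, mask):
    """Mean absolute error over the masked (hidden) timesteps only."""
    return float(np.abs(true_hr[mask] - pred_hr[mask]).mean())

true_hr = np.array([70.0, 72.0, 75.0, 80.0])
pred_hr = np.array([70.0, 68.0, 75.0, 85.0])
mask = np.array([False, True, False, True])   # only these timesteps were masked
err = masked_mae(true_hr, pred_hr, mask)      # (|72-68| + |80-85|) / 2 = 4.5
```

Unmasked timesteps are excluded because the model sees them directly, so reconstructing them is trivial.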
