Model Overview
Physio Transformer HRV is a self‑supervised transformer encoder trained on long‑duration ECG‑derived heart rate (HR) and heart rate variability (HRV) signals from the MIT‑BIH Atrial Fibrillation Database (AFDB). The model learns general‑purpose physiological representations using a masked heart‑rate reconstruction objective, similar to BERT‑style masked modeling for time series.
This encoder is designed as a foundation model for wearable‑style biometrics, capturing long‑range temporal patterns in HR/HRV dynamics such as circadian rhythms, autonomic balance, recovery, and arrhythmia‑like irregularity.
Intended Use
This model is intended for feature extraction and downstream fine‑tuning on physiological or wearable‑related tasks, including:
- Stress detection
- Sleep staging
- Activity recognition
- HR forecasting
- Health anomaly detection
- Arrhythmia or irregularity detection
- Personalized biometrics modeling
The encoder outputs a sequence of 64‑dimensional embeddings for each timestep.
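To make the input/output contract concrete, here is a minimal shape sketch. The real encoder is the 3-layer transformer described below; the stub here is just a random linear projection standing in for it, so only the shapes (batch, time, channels) → (batch, time, d_model) are meaningful.

```python
import numpy as np

# Shapes from the model card: window length T=128,
# 5 input channels (HR, HRV, 3 activity placeholders), d_model=64.
T, N_CHANNELS, D_MODEL = 128, 5, 64

rng = np.random.default_rng(0)

# Stand-in for the trained encoder: a single random linear projection.
# The actual model is a 3-layer transformer; this stub only illustrates
# the input/output contract.
W = rng.normal(size=(N_CHANNELS, D_MODEL))

def encode_stub(x: np.ndarray) -> np.ndarray:
    """Map (batch, T, channels) windows to (batch, T, d_model) embeddings."""
    return x @ W

windows = rng.normal(size=(4, T, N_CHANNELS))  # 4 example windows
embeddings = encode_stub(windows)
print(embeddings.shape)  # (4, 128, 64)
```

Downstream tasks can pool these per-timestep embeddings (e.g. mean over time) before attaching a task head.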
Training Objective
The model is trained using masked HR reconstruction:
- 15% of HR values are randomly masked
- HRV, activity, and the unmasked HR values are provided as input
- The model predicts the missing HR values
This forces the encoder to learn:
- HR + HRV relationships
- Temporal dynamics
- Autonomic patterns
- Long‑range dependencies
- Physiological variability
This is analogous to BERT pretraining, but for biometrics.
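The masking procedure above can be sketched as follows. Note the card only specifies that 15% of HR values are hidden; the zero-fill masking value and the explicit mask-indicator channel are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(42)

T = 128            # window length from the model card
MASK_FRAC = 0.15   # 15% of HR values are randomly masked

hr = 60 + 10 * rng.standard_normal(T)  # synthetic HR trace (BPM)

# Choose 15% of timesteps to mask.
n_mask = int(MASK_FRAC * T)
mask = np.zeros(T, dtype=bool)
mask[rng.choice(T, size=n_mask, replace=False)] = True

# Model input: HR with masked positions zeroed, plus a mask-indicator
# channel so the encoder knows which positions to reconstruct.
# (Masking token/value is an assumption, not specified by the card.)
hr_input = np.where(mask, 0.0, hr)
inputs = np.stack([hr_input, mask.astype(float)], axis=-1)

# Training target: the original HR at the masked positions only.
targets = hr[mask]
print(inputs.shape, targets.shape)  # (128, 2) (19,)
```

In the full model, the HRV and activity channels are stacked alongside these two, and the reconstruction loss is computed only at the masked positions.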
Architecture
Transformer encoder (Pre‑LayerNorm)
- 3 layers, 4 heads, 64‑dimensional model
- Learned positional embeddings
Input channels:
- HR (1)
- HRV (1)
- Activity (3 placeholder channels)
Output: (batch, time, d_model) embeddings
The model uses dropout, residual connections, and a per‑head key dimension of key_dim = d_model // num_heads (16 with the settings above).
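A single Pre‑LayerNorm encoder block with these dimensions can be sketched in plain numpy. The weights here are random stand-ins for learned parameters, and details the card does not specify (FFN width of 4×d_model, ReLU activation, dropout omitted) are assumptions; the point is the Pre‑LN ordering (normalize, sublayer, residual add) and the per-head split.

```python
import numpy as np

D_MODEL, NUM_HEADS = 64, 4
KEY_DIM = D_MODEL // NUM_HEADS  # = 16, the per-head key dimension

rng = np.random.default_rng(0)

def layer_norm(x, eps=1e-5):
    return (x - x.mean(-1, keepdims=True)) / np.sqrt(x.var(-1, keepdims=True) + eps)

def softmax(x):
    e = np.exp(x - x.max(-1, keepdims=True))
    return e / e.sum(-1, keepdims=True)

# Random projections stand in for learned parameters.
Wq, Wk, Wv, Wo = (rng.normal(scale=0.02, size=(D_MODEL, D_MODEL)) for _ in range(4))
W1 = rng.normal(scale=0.02, size=(D_MODEL, 4 * D_MODEL))  # FFN width assumed
W2 = rng.normal(scale=0.02, size=(4 * D_MODEL, D_MODEL))

def pre_ln_block(x):
    """One Pre-LayerNorm encoder block; x has shape (T, d_model)."""
    # Self-attention sublayer: LayerNorm first, residual add after.
    h = layer_norm(x)
    T = h.shape[0]
    q = (h @ Wq).reshape(T, NUM_HEADS, KEY_DIM).transpose(1, 0, 2)
    k = (h @ Wk).reshape(T, NUM_HEADS, KEY_DIM).transpose(1, 0, 2)
    v = (h @ Wv).reshape(T, NUM_HEADS, KEY_DIM).transpose(1, 0, 2)
    attn = softmax(q @ k.transpose(0, 2, 1) / np.sqrt(KEY_DIM)) @ v
    x = x + attn.transpose(1, 0, 2).reshape(T, D_MODEL) @ Wo
    # Position-wise feed-forward sublayer, same Pre-LN pattern.
    h = layer_norm(x)
    x = x + np.maximum(h @ W1, 0.0) @ W2  # ReLU FFN (activation assumed)
    return x

out = pre_ln_block(rng.normal(size=(128, D_MODEL)))
print(out.shape)  # (128, 64)
```

The full model stacks three such blocks and adds learned positional embeddings to the projected input channels before the first block.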
Data
Training data is derived from the MIT‑BIH Atrial Fibrillation Database (AFDB):
- 23 full‑length, roughly 10‑hour ECG recordings (the AFDB records with available signals)
- R‑peaks extracted using a fast Pan‑Tompkins‑style detector
- HR + HRV computed in sliding windows
- Time‑series sliced into fixed‑length windows (T=128, step=32)
- NaNs imputed per‑window
This produces thousands of training samples from a small number of long recordings.
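The pipeline above can be sketched end to end on synthetic RR intervals. The card does not name the HRV metric, its window size, or the imputation rule, so the RMSSD-style statistic, 30-beat window, and median fill below are assumptions; the T=128 / step=32 slicing matches the card.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic RR intervals (seconds) standing in for AFDB R-peak output.
rr = 0.8 + 0.05 * rng.standard_normal(5000)

hr = 60.0 / rr  # instantaneous HR in BPM

# RMSSD-style HRV over a 30-beat sliding window (metric and window assumed).
W = 30
diffs_sq = np.diff(rr) ** 2
hrv = np.array([np.sqrt(diffs_sq[i:i + W].mean())
                for i in range(len(diffs_sq) - W + 1)])

series = hr[:len(hrv)]  # align the HR series with the HRV series

# Slice into fixed-length training windows: T=128, step=32.
T_WIN, STEP = 128, 32
starts = range(0, len(series) - T_WIN + 1, STEP)
windows = np.stack([series[s:s + T_WIN] for s in starts])

# Per-window NaN imputation with the window median (rule assumed).
windows[0, 5] = np.nan  # simulate a missing value
med = np.nanmedian(windows, axis=1, keepdims=True)
windows = np.where(np.isnan(windows), med, windows)

print(windows.shape)  # (152, 128)
```

The overlap between consecutive windows (step 32 vs. length 128) is what multiplies a handful of long recordings into thousands of training samples.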
Performance
The model is evaluated using masked MAE on held‑out validation windows.
Typical results:
- Masked MAE: ~5 BPM
- Median absolute error: low (most masked positions reconstruct well)
- Stable reconstruction across a wide HR range
These metrics indicate that the encoder successfully learns physiological structure.
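For clarity, masked MAE means the mean absolute error computed only at the masked positions, which is how the headline ~5 BPM figure is defined. A minimal sketch on synthetic data (the toy predictions here are not model outputs):

```python
import numpy as np

rng = np.random.default_rng(7)

hr_true = 60 + 10 * rng.standard_normal(128)
mask = rng.random(128) < 0.15                        # positions hidden from the model
hr_pred = hr_true + rng.normal(scale=5.0, size=128)  # toy predictions, for illustration

abs_err = np.abs(hr_pred[mask] - hr_true[mask])

# Masked MAE: errors are averaged only where HR was masked,
# so visible (unmasked) timesteps do not dilute the metric.
masked_mae = abs_err.mean()
median_err = np.median(abs_err)
print(round(float(masked_mae), 2), round(float(median_err), 2))
```

Restricting the metric to masked positions matters: including the visible timesteps, which the model can trivially copy, would understate the reconstruction error.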