---
license: mit
tags:
- sleep
- eeg
- polysomnography
- foundation-model
- self-supervised
- vit
- biosignals
pipeline_tag: feature-extraction
library_name: pytorch
language:
- en
---
# OSF: On Pre-training and Scaling of Sleep Foundation Models
## 🔥 News
- [2026-2-24] Our codebase and checkpoints are released.
- [2026-2-22] Our paper is out.
## 📖 Introduction
Polysomnography (PSG) provides the gold standard for sleep assessment but suffers from substantial heterogeneity across recording devices and cohorts.
There have been growing efforts to build general-purpose foundation models (FMs) for sleep physiology, but the field still lacks an in-depth understanding of the pre-training process and scaling patterns that lead to more generalizable sleep FMs.
To fill this gap, we curate a massive corpus of 166,500 hours of sleep recordings from nine public sources and establish SleepBench, a comprehensive, fully open-source benchmark.
Leveraging SleepBench, we systematically evaluate four families of self-supervised pre-training objectives and uncover three critical findings:
(1) existing FMs fail to generalize to missing channels at inference;
(2) channel-invariant feature learning is essential for pre-training;
and (3) scaling sample size, model capacity, and multi-source data mixture consistently improves downstream performance.
With an enhanced pre-training and scaling recipe, we introduce OSF, a family of sleep FMs that achieves state-of-the-art performance across nine datasets on diverse sleep and disease prediction tasks.
Further analysis of OSF also reveals intriguing properties in sample efficiency, hierarchical aggregation, and cross-dataset scaling.
## 📋 Table of Contents
1. [Installation](#-installation)
2. [Quick Start](#-quick-start)
3. [Pretrained Weights](#-pretrained-weights)
4. [Usage](#-usage)
5. [Benchmark Evaluations](#-benchmark-evaluations)
6. [Supported Datasets](#-supported-datasets)
7. [Project Structure](#-project-structure)
8. [Citation](#-citation)
## 💿 Installation
```bash
git clone https://huggingface.co/yang-ai-lab/OSF-Base
cd OSF-Base
conda env create -f environment.yml
conda activate myenv
```
### Dependencies
- Python >= 3.10
- PyTorch >= 2.9.0
- PyTorch Lightning >= 2.5.5
## 🚀 Quick Start
We provide a demo notebook (`demo.ipynb`) demonstrating how to extract embeddings from PSG signals using the pretrained model.
```python
import torch

from osf.backbone.vit1d_cls import vit_base

# Load pretrained weights (included in this repo)
payload = torch.load("osf_backbone.pth", map_location="cpu", weights_only=False)
meta = payload["metadata"]

# Initialize model
backbone = vit_base(
    num_leads=meta["num_leads"],        # 12 channels
    seq_len=meta["seq_len"],            # 1920 (64 Hz × 30 s)
    patch_size=meta["patch_size_time"],
    lead_wise=meta["lead_wise"],
    patch_size_ch=meta["patch_size_ch"],
)
backbone.load_state_dict(payload["state_dict"])
backbone.eval()

# Extract embeddings
# x: [B, 12, 1920] - 12-channel PSG, 64 Hz × 30 seconds
x = torch.randn(1, 12, 1920)  # replace with your own preprocessed PSG epochs
with torch.no_grad():
    cls_embs, patch_embs = backbone.forward_encoding(x, return_sequence=False)
# cls_embs:   [B, 768]     - global epoch-level representation
# patch_embs: [B, 90, 768] - local patch representations
```
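If you need a single epoch-level vector derived from the local patch embeddings, one simple option (an illustrative choice, not necessarily the aggregation scheme used in the paper) is mean pooling over the patch axis:

```python
import torch

# Stand-in for the backbone's patch output; in practice, use the
# patch_embs tensor returned by backbone.forward_encoding above.
patch_embs = torch.randn(2, 90, 768)

# Average over the 90 patch positions to get one vector per epoch.
pooled = patch_embs.mean(dim=1)  # [B, 768]
print(pooled.shape)  # torch.Size([2, 768])
```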
## 📦 Pretrained Weights
| Model | Backbone | Channels |
|-------|----------|----------|
| OSF | ViT-Base | 12-ch |
The pretrained weights are included in this repository. You can download them via the Hugging Face Hub:
```python
from huggingface_hub import hf_hub_download
checkpoint_path = hf_hub_download(repo_id="yang-ai-lab/OSF-Base", filename="osf_backbone.pth")
```
Or via the CLI:
```bash
huggingface-cli download yang-ai-lab/OSF-Base osf_backbone.pth
```
## 👩‍💻 Usage
### Input Format
Expected input format:
- **12 PSG Channels**: ECG, EMG_Chin, EMG_LLeg, EMG_RLeg, ABD, THX, NP, SN, EOG_E1_A2, EOG_E2_A1, EEG_C3_A2, EEG_C4_A1
- **Sample Rate**: 64 Hz
- **Epoch Length**: 30 seconds
- **Input Shape**: `[B, 12, 1920]`
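To produce tensors in this shape from a continuous recording, segment the signal into non-overlapping 30-second epochs. A minimal sketch, assuming the recording is already resampled to 64 Hz and its 12 channels are ordered as listed above (`segment_epochs` is a hypothetical helper, not part of this repo):

```python
import numpy as np

FS = 64          # model sample rate (Hz)
EPOCH_SEC = 30   # epoch length (s)
N_CH = 12        # PSG channels, ordered as listed above

def segment_epochs(signal: np.ndarray) -> np.ndarray:
    """Split a continuous [12, T] recording (already at 64 Hz) into
    non-overlapping 30-s epochs of shape [B, 12, 1920]."""
    assert signal.shape[0] == N_CH
    samples_per_epoch = FS * EPOCH_SEC                 # 1920
    n_epochs = signal.shape[1] // samples_per_epoch
    trimmed = signal[:, : n_epochs * samples_per_epoch]
    # [12, B, 1920] -> [B, 12, 1920]
    return trimmed.reshape(N_CH, n_epochs, samples_per_epoch).transpose(1, 0, 2)

# Example: one hour of recording -> 120 epochs
x = segment_epochs(np.random.randn(N_CH, FS * 3600).astype(np.float32))
print(x.shape)  # (120, 12, 1920)
```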
### Pretraining
We support multiple self-supervised pretraining methods. For example, to launch pre-training with our OSF method, run:
```bash
python main_pretrain.py \
    --model_name "dino_ours" \
    --psg_encoder_name "vit_base" \
    --batch_size 256 \
    --lr 5e-5 \
    --max_epochs 30 \
    --num_devices 4 \
    --patch_size_time 64 \
    --patch_size_ch 4 \
    --precision "bf16-mixed"
```
See `main_pipelines/main_pretrain.py` for more detailed settings.
### Fine-tuning
Fine-tune the pretrained model on downstream tasks:
```bash
python main_finetune.py \
    --model_name "dino_ours" \
    --ckpt_path "/path/to/pretrained/checkpoint.ckpt" \
    --downstream_dataset_name "shhs" \
    --eval_label "Stage" \
    --train_data_pct 1.0 \
    --max_steps 500 \
    --lr 0.1 \
    --num_devices 4
```
## 📊 Benchmark Evaluations
### Benchmarked SSL Methods
| Method | Type | Description |
|--------|------|-------------|
| SleepFM | Contrastive | Leave-one-out multi-modal contrastive learning |
| SimCLR | Contrastive | Simple contrastive learning of representations |
| DINO | Self-distillation | DINOv2 (Oquab et al., 2023) |
| VQ-VAE | Reconstruction | Vector-quantized variational autoencoder |
| MAE | Reconstruction | Masked autoencoding |
| AR | Autoregressive | Autoregressive next-token prediction |
| OSF | Self-distillation | Ours |
### Downstream Tasks
**Epoch-level Classification Tasks:**
| Task | Classes | Description |
|------|---------|-------------|
| Sleep Stage | 4 | Awake, Light Sleep, Deep Sleep, REM classification |
| Arousal | 2 | Arousal event detection |
| Hypopnea | 2 | Hypopnea event detection |
| Oxygen Desaturation | 2 | Oxygen desaturation detection |
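The 4-class stage labels are typically derived by merging AASM stages. A sketch assuming N1/N2 map to Light Sleep and N3 to Deep Sleep (the exact grouping used in the benchmark may differ):

```python
# Assumed mapping from AASM 5-stage labels to the 4-class scheme above;
# verify against the benchmark's own label definitions before use.
STAGE_4CLASS = {
    "W": 0,    # Awake
    "N1": 1,   # Light Sleep
    "N2": 1,   # Light Sleep
    "N3": 2,   # Deep Sleep
    "REM": 3,  # REM
}

labels = ["W", "N2", "N3", "REM", "N1"]
y = [STAGE_4CLASS[s] for s in labels]
print(y)  # [0, 1, 2, 3, 1]
```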
### Evaluation Settings
| Setting | Description |
|---------|-------------|
| Linear Probing | Freeze backbone, train linear classifier |
| Full Fine-tuning | Fine-tune entire model end-to-end |
| Few-shot (k-shot) | Train with limited labeled samples |
For example scripts, see `main_pipelines` and `bash_scripts` folders.
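The linear probing setting above can be sketched in a few lines: freeze the backbone, extract embeddings without gradients, and train only a linear head. This uses a stand-in frozen encoder and dummy data for illustration; in practice the encoder is the pretrained OSF backbone producing 768-d CLS embeddings (see Quick Start):

```python
import torch
import torch.nn as nn

# Stand-in frozen encoder (in practice: the pretrained OSF backbone).
encoder = nn.Sequential(nn.Flatten(), nn.Linear(12 * 1920, 768))
for p in encoder.parameters():
    p.requires_grad = False          # freeze the backbone
encoder.eval()

head = nn.Linear(768, 4)             # linear classifier for 4 sleep stages
opt = torch.optim.SGD(head.parameters(), lr=0.1)
loss_fn = nn.CrossEntropyLoss()

x = torch.randn(8, 12, 1920)         # dummy PSG epochs
y = torch.randint(0, 4, (8,))        # dummy stage labels

with torch.no_grad():                # no gradients through the backbone
    emb = encoder(x)
logits = head(emb)
loss = loss_fn(logits, y)
loss.backward()
opt.step()
print(logits.shape)  # torch.Size([8, 4])
```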
## 📚 Supported Datasets
We aggregate the following large-scale datasets from the National Sleep Research Resource (NSRR) platform.
| Dataset | Full Name | Source |
|---------|-----------|--------|
| SHHS | Sleep Heart Health Study | NSRR |
| CHAT | Childhood Adenotonsillectomy Trial | NSRR |
| MROS | MrOS Sleep Study | NSRR |
| CCSHS | Cleveland Children's Sleep and Health Study | NSRR |
| CFS | Cleveland Family Study | NSRR |
| MESA | Multi-Ethnic Study of Atherosclerosis | NSRR |
| SOF | Study of Osteoporotic Fractures | NSRR |
| WSC | Wisconsin Sleep Cohort | NSRR |
| STAGES | Stanford Technology Analytics and Genomics in Sleep | NSRR |
| NCHSDB | NCH Sleep DataBank | NSRR |
New users should apply for an account and request access to each of these datasets by following the instructions at [NSRR Registration](https://sleepdata.org/join).
## 📁 Project Structure
```
OSF-Open-Sleep-Foundation-Model/
├── osf/
│   ├── backbone/            # ViT backbone implementations
│   │   └── vit1d_cls.py
│   ├── models/              # SSL model implementations
│   │   └── dino_model_cls.py
│   ├── datasets/            # Data loading utilities
│   └── utils/               # Helper functions
├── main_pipelines/          # Training scripts
│   ├── main_pretrain.py
│   └── ...
├── bash_scripts/            # Example bash scripts
├── osf_backbone.pth         # Pretrained model weights
├── demo.ipynb               # Quick start demo
├── config.py                # Dataset and channel configurations
└── train_config.py          # Training configurations
```
## 📝 Citation
If you use this code or models in your research, please cite our paper:
```bibtex
@article{shuai2026osf,
title={OSF: On Pre-training and Scaling of Sleep Foundation Models},
author={Shuai, Zitao and Xu, Zongzhe and Yang, David and Wang, Wei and Yang, Yuzhe},
journal={arXiv preprint arXiv:2603.00190},
year={2026}
}
```