Hypnos
Hypnos is an autoregressive RQ-Transformer trained via next-token prediction on tokenized streams of physiological sensor data. It produces general-purpose 1 Hz embeddings of sleep physiology for downstream tasks (sleep staging, arousal/event detection, etc.).
This repo holds the released model as a single weight-only safetensors bundle: the
RQ-Transformer and all 5 tokenizers. The config (model and tokenizer construction
kwargs, modality layout) is stored in the file metadata, so loading is fully self-contained.
Usage
Use the hypnos inference library:
from hypnos.embedding import embed_edf
emb = embed_edf("recording.edf") # defaults to this repo
# emb: dict {modality_name: np.ndarray [n_seconds, embed_dim] float16}
# e.g. emb["eeg_c3"], emb["ecg"], ... — one vector per second, per present modality
Embeddings are returned per modality at the model's native 1 Hz cadence (one vector
per second). Only modalities present in the recording appear in the dict. For recordings
made on 60 Hz mains (e.g. in the US), pass notch_freq=60.0; the default is 50 Hz.
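Because only present modalities appear in the dict, downstream code that expects a fixed layout has to handle gaps. The following is a minimal sketch, not part of the hypnos library: `stack_modalities`, the `MODALITIES` ordering, and the zero-fill policy are illustrative choices, and it assumes every present modality has the same number of seconds (which this README does not state). A synthetic dict stands in for the output of `embed_edf`.

```python
import numpy as np

# Modality names as listed in this README; the fixed ordering is our own choice.
MODALITIES = ["eeg_c3", "eeg_c4", "eog_e1", "eog_e2",
              "emg_chin", "ecg", "resp_abd", "resp_thx"]

def stack_modalities(emb, order=MODALITIES):
    """Stack a {name: [T, D]} dict into [T, M, D] plus a presence mask [M].

    Missing modalities are zero-filled and flagged False in the mask.
    """
    if not emb:
        raise ValueError("no modalities present")
    first = next(iter(emb.values()))
    n_seconds, dim = first.shape
    stacked = np.zeros((n_seconds, len(order), dim), dtype=first.dtype)
    present = np.zeros(len(order), dtype=bool)
    for i, name in enumerate(order):
        if name in emb:
            stacked[:, i, :] = emb[name]  # raises if lengths differ
            present[i] = True
    return stacked, present

# Synthetic stand-in for embed_edf output: two modalities, 120 s, D = 768.
rng = np.random.default_rng(0)
fake = {name: rng.standard_normal((120, 768)).astype(np.float16)
        for name in ("eeg_c3", "ecg")}
stacked, present = stack_modalities(fake)
print(stacked.shape, int(present.sum()))
```

The mask lets a downstream model ignore zero-filled slots instead of treating them as real physiology.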
Pooling
The 1 Hz per-modality output is returned unpooled; pool it however your task needs. For example, to get a single embedding per 30-second sleep epoch, average over modalities and then over time:
import numpy as np
from hypnos.embedding import embed_edf
emb = embed_edf("recording.edf")
fused = np.mean(list(emb.values()), axis=0) # over modalities -> [T, D]
n = fused.shape[0] // 30
epochs = fused[: n * 30].reshape(n, 30, -1).mean(axis=1) # over 30-s epochs -> [T//30, D]
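If a task needs modalities kept separate rather than fused, the same windowing can be applied per modality. This is a sketch under our own assumptions: `pool_epochs` is a hypothetical helper, not a hypnos function, and the input dict is synthetic.

```python
import numpy as np

def pool_epochs(emb, epoch_s=30):
    """Mean-pool each modality's [T, D] array into [T // epoch_s, D]."""
    pooled = {}
    for name, x in emb.items():
        n = x.shape[0] // epoch_s  # whole epochs only; trailing seconds are dropped
        windows = x[: n * epoch_s].reshape(n, epoch_s, x.shape[1])
        pooled[name] = windows.mean(axis=1)
    return pooled

# Synthetic stand-in for embed_edf output: 95 s of one modality.
rng = np.random.default_rng(0)
fake = {"eeg_c3": rng.standard_normal((95, 768)).astype(np.float16)}
pooled = pool_epochs(fake)
print(pooled["eeg_c3"].shape)  # (3, 768): 95 s holds 3 whole 30-s epochs
```

Passing the embedding dimension explicitly to `reshape` keeps the helper well defined even for recordings shorter than one epoch, where it returns an empty `[0, D]` array.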
Modalities
The 8 modalities share 5 tokenizers. K is the number of RVQ (residual vector quantization) levels; every codebook has 2048 entries, and each tokenizer emits 1 token per second:
| modality | channel | signal | tokenizer | K | sample rate |
|---|---|---|---|---|---|
| eeg_c3 | C3 | EEG | eeg-q8 | 8 | 128 Hz |
| eeg_c4 | C4 | EEG | eeg-q8 | 8 | 128 Hz |
| eog_e1 | E1 | EOG | eog-q8 | 8 | 128 Hz |
| eog_e2 | E2 | EOG | eog-q8 | 8 | 128 Hz |
| emg_chin | Chin | EMG | emg-q8 | 8 | 128 Hz |
| ecg | ECG | ECG | ecg-q4 | 4 | 128 Hz |
| resp_abd | ABD | respiratory | resp-q4 | 4 | 32 Hz |
| resp_thx | THX | respiratory | resp-q4 | 4 | 32 Hz |
EEG/EOG channels are contralaterally referenced (e.g. C3-M2); Chin EMG is a bipolar derivation; ECG/respiratory are used directly. Embedding dimension is 768.
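The layout above implies a small per-second code budget, which the arithmetic below works through. It assumes that "1 token/sec" means one K-level RVQ token per modality per second, so each modality contributes K codebook indices per second; that reading is ours, and the `LAYOUT` list simply copies the table.

```python
import math

CODEBOOK_SIZE = 2048  # from this README: all codebooks have 2048 entries

# (modality, RVQ levels K, sample rate in Hz), copied from the table above
LAYOUT = [
    ("eeg_c3", 8, 128), ("eeg_c4", 8, 128),
    ("eog_e1", 8, 128), ("eog_e2", 8, 128),
    ("emg_chin", 8, 128), ("ecg", 4, 128),
    ("resp_abd", 4, 32), ("resp_thx", 4, 32),
]

bits_per_code = math.log2(CODEBOOK_SIZE)  # 11 bits index one of 2048 entries

for name, k, rate in LAYOUT:
    # One token per second covers `rate` samples and carries K codebook indices.
    print(f"{name:9s} {rate:4d} samples/s -> {k} codes/s "
          f"({k * bits_per_code:.0f} bits/s)")

codes_per_second = sum(k for _, k, _ in LAYOUT)
print(codes_per_second)  # 5 * 8 + 3 * 4 = 52 when all 8 modalities are present
```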
Devices
CUDA and CPU are supported. Apple Silicon (MPS) is not, because PyTorch's flex_attention has no
Metal kernel; on a Mac use device="cpu" (a 2-minute clip embeds in a few seconds, and a full
night takes about 1 minute).
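A device can be picked programmatically so the same script runs on a CUDA machine and on a Mac. This is a minimal sketch: `pick_device` is a hypothetical helper, and passing the result as `device=` to `embed_edf` is assumed from the `device="cpu"` hint above rather than taken from documented API.

```python
def pick_device():
    """Return "cuda" when PyTorch sees a CUDA GPU, otherwise "cpu" (never "mps")."""
    try:
        import torch
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"

device = pick_device()
print(device)
# Assumed usage, following the device="cpu" hint above:
# emb = embed_edf("recording.edf", device=device)
```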
Format
hypnos.safetensors stores the weights under namespaced keys (model/…, tok/<name>/…),
with the config as a JSON string in the file metadata. It is loaded with safetensors, so no
arbitrary-code unpickling is involved.
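The metadata and key names can be inspected without loading any weights, because a safetensors file begins with an 8-byte little-endian header length followed by a JSON header whose `__metadata__` entry is a string-to-string map. The sketch below uses only the standard library. Both helper names are hypothetical, and since this README does not name the metadata key that holds the config, the second helper simply tries to parse every value as JSON.

```python
import json
import struct

def read_safetensors_header(path):
    """Return (metadata, tensor_names) from a safetensors file without loading weights."""
    with open(path, "rb") as f:
        (header_len,) = struct.unpack("<Q", f.read(8))  # little-endian u64 header size
        header = json.loads(f.read(header_len).decode("utf-8"))
    metadata = header.pop("__metadata__", {})  # string -> string map
    return metadata, sorted(header)

def decode_json_values(metadata):
    """Parse metadata values that hold JSON; leave the rest as plain strings."""
    out = {}
    for key, value in metadata.items():
        try:
            out[key] = json.loads(value)
        except json.JSONDecodeError:
            out[key] = value
    return out

# Assumed usage against the released file:
# metadata, names = read_safetensors_header("hypnos.safetensors")
# config = decode_json_values(metadata)
# tokenizer_keys = [n for n in names if n.startswith("tok/")]
```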
License
Released under the MIT License.
Citation
@online{carterNextTokenPredictionLearns2026,
title = {Next-Token Prediction Learns Generalisable Representations of Sleep Physiology},
author = {Carter, Jonathan F. and Tarassenko, Lionel},
date = {2026-06-08},
eprint = {2606.09605},
eprinttype = {arXiv},
eprintclass = {cs.AI},
doi = {10.48550/arXiv.2606.09605},
url = {http://arxiv.org/abs/2606.09605},
}