Hypnos

Hypnos is an autoregressive RQ-Transformer trained via next-token prediction on tokenized streams of physiological sensor data. It produces general-purpose 1 Hz embeddings of sleep physiology for downstream tasks (sleep staging, arousal/event detection, etc.).

This repo holds the released model as a single weight-only safetensors bundle containing the RQ-Transformer and all 5 tokenizers. The config (model and tokenizer construction kwargs, modality layout) is stored in the file metadata, so loading is fully self-contained.

Usage

Use the hypnos inference library:

from hypnos.embedding import embed_edf

emb = embed_edf("recording.edf")   # defaults to this repo
# emb: dict {modality_name: np.ndarray [n_seconds, embed_dim] float16}
#   e.g. emb["eeg_c3"], emb["ecg"], ... — one vector per second, per present modality

Embeddings are returned per modality at the model's native 1 Hz cadence (one vector per second). Only modalities present in the recording appear in the dict. The notch filter defaults to 50 Hz; for recordings made on 60 Hz mains (e.g. in the US), pass notch_freq=60.0.
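Because absent modalities are simply missing from the dict, downstream code may want to check what it received before indexing. The following is a minimal sketch, assuming the dict layout described above. `describe_embeddings` and `missing_modalities` are hypothetical helpers, not part of the hypnos library, and a synthetic dict stands in for a real `embed_edf` call so the snippet runs without an EDF file.

```python
import numpy as np

# Modality names as listed in this README's Modalities table.
ALL_MODALITIES = ("eeg_c3", "eeg_c4", "eog_e1", "eog_e2",
                  "emg_chin", "ecg", "resp_abd", "resp_thx")


def describe_embeddings(emb):
    """Return {modality: (n_seconds, embed_dim)} for an embed_edf-style dict."""
    return {name: tuple(arr.shape) for name, arr in emb.items()}


def missing_modalities(emb, required=ALL_MODALITIES):
    """List the required modalities that are absent from the dict."""
    return [name for name in required if name not in emb]


# Stand-in for `emb = embed_edf("recording.edf", notch_freq=60.0)`:
# a 2-minute recording in which only three channels were present.
emb = {name: np.zeros((120, 768), dtype=np.float16)
       for name in ("eeg_c3", "ecg", "resp_abd")}

shapes = describe_embeddings(emb)
missing = missing_modalities(emb, required=("eeg_c3", "eeg_c4", "ecg"))
print(shapes)
print(missing)
```

A task that needs a fixed feature layout can use the `missing` list to skip the recording or to fill the gap with a placeholder of its own choosing.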

Pooling

The 1 Hz per-modality output is unpooled; pool it however your task needs. For example, to get a single embedding per 30-second sleep epoch, average over modalities and then over time:

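import numpy as np
from hypnos.embedding import embed_edf

emb = embed_edf("recording.edf")
# Stack to [M, T, D] and average over modalities in float32 (embeddings arrive as float16).
fused = np.stack(list(emb.values())).mean(axis=0, dtype=np.float32)  # -> [T, D]
n = fused.shape[0] // 30                                   # whole 30-s epochs; the remainder is dropped
epochs = fused[: n * 30].reshape(n, 30, -1).mean(axis=1)   # -> [T // 30, D]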
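Averaging over modalities is only one option. An alternative sketch, shown here on synthetic data, pools each modality separately and concatenates the results along the feature axis so that a downstream classifier can weight modalities differently. `pool_epochs` and `pool_and_concat` are hypothetical helper names, not part of the hypnos library.

```python
import numpy as np


def pool_epochs(x, epoch_s=30):
    """Mean-pool a [T, D] array into [T // epoch_s, D], dropping the trailing partial epoch."""
    n = x.shape[0] // epoch_s
    # Accumulate in float32: the embeddings arrive as float16.
    x = x[: n * epoch_s].astype(np.float32)
    return x.reshape(n, epoch_s, x.shape[1]).mean(axis=1)


def pool_and_concat(emb, modalities, epoch_s=30):
    """Pool each listed modality separately, then concatenate along the feature axis."""
    return np.concatenate([pool_epochs(emb[m], epoch_s) for m in modalities], axis=1)


# Synthetic stand-in for embed_edf output: 95 seconds, two modalities.
rng = np.random.default_rng(0)
emb = {m: rng.standard_normal((95, 768)).astype(np.float16) for m in ("eeg_c3", "ecg")}

features = pool_and_concat(emb, ["eeg_c3", "ecg"])
print(features.shape)  # (3, 1536): 95 // 30 = 3 epochs, 2 * 768 features
```

Passing an explicit modality list fixes the column order, which a plain iteration over the dict would leave dependent on which channels each recording happens to contain.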

Modalities

Eight modalities share five tokenizers (K = number of RVQ levels; every codebook has 2048 entries; 1 token per second):

| modality | channel | signal | tokenizer | K | sample rate |
|---|---|---|---|---|---|
| eeg_c3 | C3 | EEG | eeg-q8 | 8 | 128 Hz |
| eeg_c4 | C4 | EEG | eeg-q8 | 8 | 128 Hz |
| eog_e1 | E1 | EOG | eog-q8 | 8 | 128 Hz |
| eog_e2 | E2 | EOG | eog-q8 | 8 | 128 Hz |
| emg_chin | Chin | EMG | emg-q8 | 8 | 128 Hz |
| ecg | ECG | ECG | ecg-q4 | 4 | 128 Hz |
| resp_abd | ABD | respiratory | resp-q4 | 4 | 32 Hz |
| resp_thx | THX | respiratory | resp-q4 | 4 | 32 Hz |

EEG/EOG channels are contralaterally referenced (e.g. C3-M2); Chin EMG is a bipolar derivation; ECG/respiratory are used directly. Embedding dimension is 768.
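The modality layout above implies some simple arithmetic about stream size. The sketch below encodes the table as plain data and derives the totals, assuming that each 1-second token is made up of K codebook indices, one per RVQ level. That is how residual vector quantisation works in general, but the README does not spell it out for this model. The `Modality` dataclass is illustrative and not part of the hypnos library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Modality:
    name: str
    tokenizer: str
    rvq_levels: int       # K in the table
    sample_rate_hz: int


MODALITIES = [
    Modality("eeg_c3", "eeg-q8", 8, 128),
    Modality("eeg_c4", "eeg-q8", 8, 128),
    Modality("eog_e1", "eog-q8", 8, 128),
    Modality("eog_e2", "eog-q8", 8, 128),
    Modality("emg_chin", "emg-q8", 8, 128),
    Modality("ecg", "ecg-q4", 4, 128),
    Modality("resp_abd", "resp-q4", 4, 32),
    Modality("resp_thx", "resp-q4", 4, 32),
]

# Distinct tokenizers shared across the modalities.
tokenizers = sorted({m.tokenizer for m in MODALITIES})

# Codebook indices emitted per second when every modality is present.
codes_per_second = sum(m.rvq_levels for m in MODALITIES)

# Input samples summarised by each 1-second token.
samples_per_token = {m.name: m.sample_rate_hz for m in MODALITIES}

# Codes for an 8-hour recording with all modalities present.
codes_per_night = codes_per_second * 8 * 3600

print(len(tokenizers), codes_per_second, codes_per_night)  # 5 52 1497600
```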

Devices

CUDA and CPU are supported. Apple Silicon (MPS) is not — PyTorch's flex_attention has no Metal kernel, so on a Mac use device="cpu" (a 2-minute clip embeds in a few seconds; a full night takes ~1 minute).
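A small device picker can encode this rule so that the same script runs on a GPU server and on a Mac. This is a sketch: `pick_device` is a hypothetical helper, and passing the result as `device=` to `embed_edf` is assumed from the wording above rather than confirmed against the library's signature.

```python
def pick_device():
    """Return "cuda" when a CUDA GPU is usable, otherwise "cpu" (never "mps")."""
    try:
        import torch
    except ImportError:
        # Without PyTorch installed there is nothing to probe; report CPU.
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


device = pick_device()
print(device)

# Hypothetical call, assuming embed_edf accepts the device keyword mentioned above:
# emb = embed_edf("recording.edf", device=device)
```

The helper deliberately never probes `torch.backends.mps`, so an Apple Silicon machine falls through to the CPU path.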

Format

hypnos.safetensors — weights under namespaced keys (model/…, tok/<name>/…) with the config as a JSON string in the file metadata. Loaded with safetensors (no arbitrary-code unpickling).
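The config can be inspected without loading any weights, because a safetensors file begins with an 8-byte little-endian header length followed by a JSON header whose optional `__metadata__` entry holds a string-to-string map. The sketch below reads that map using only the standard library. The name of the metadata key that holds the config is not stated here, so the code decodes every value that parses as JSON rather than guessing a key; both function names are hypothetical.

```python
import json
import struct


def read_safetensors_metadata(path):
    """Return the string-to-string metadata map from a safetensors file header."""
    with open(path, "rb") as f:
        (header_len,) = struct.unpack("<Q", f.read(8))  # little-endian u64
        header = json.loads(f.read(header_len).decode("utf-8"))
    return header.get("__metadata__", {})


def decode_json_values(metadata):
    """Parse every metadata value that is valid JSON; leave the rest as strings."""
    out = {}
    for key, value in metadata.items():
        try:
            out[key] = json.loads(value)
        except (TypeError, ValueError):
            out[key] = value
    return out


# Usage, with a local copy of the bundle:
# config = decode_json_values(read_safetensors_metadata("hypnos.safetensors"))
# print(list(config))
```

The safetensors library exposes the same map through `safe_open(...).metadata()`; the manual reader is only useful when that dependency is unavailable.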

License

Released under the MIT License.

Citation

@online{carterNextTokenPredictionLearns2026,
  title       = {Next-Token Prediction Learns Generalisable Representations of Sleep Physiology},
  author      = {Carter, Jonathan F. and Tarassenko, Lionel},
  date        = {2026-06-08},
  eprint      = {2606.09605},
  eprinttype  = {arXiv},
  eprintclass = {cs.AI},
  doi         = {10.48550/arXiv.2606.09605},
  url         = {http://arxiv.org/abs/2606.09605},
}