---
license: mit
tags:
- sleep
- eeg
- ecg
- eog
- emg
- respiratory
- physiological-signals
- foundation-model
- time-series
pipeline_tag: feature-extraction
---

# Hypnos

Hypnos is an autoregressive RQ-Transformer trained via next-token prediction on tokenized streams of physiological sensor data. It produces general-purpose **1 Hz embeddings** of sleep physiology for downstream tasks (sleep staging, arousal/event detection, etc.).

This repo holds the released model as a single weight-only `safetensors` bundle: the RQ-Transformer **and** all 5 tokenizers. The config (model and tokenizer construction kwargs, modality layout) is stored in the file metadata, so loading is fully self-contained.

## Usage

Use the `hypnos` inference library:

```python
from hypnos.embedding import embed_edf

emb = embed_edf("recording.edf")  # defaults to this repo
# emb: dict {modality_name: np.ndarray [n_seconds, embed_dim] float16}
# e.g. emb["eeg_c3"], emb["ecg"], ... (one vector per second, per present modality)
```

Embeddings are returned **per modality** at the model's native **1 Hz** cadence (one vector per second). Only modalities present in the recording appear in the dict. For US recordings, pass `notch_freq=60.0` (the default is 50 Hz).

### Pooling

The 1 Hz per-modality embeddings are the model's raw output; pool them however your task needs. For example, to get a single embedding per 30-second sleep epoch, averaged over modalities and time:

```python
import numpy as np
from hypnos.embedding import embed_edf

emb = embed_edf("recording.edf")
# Stacking requires every modality array to have the same shape [T, D].
fused = np.mean(list(emb.values()), axis=0)  # over modalities -> [T, D]
n = fused.shape[0] // 30  # number of complete 30-s epochs
epochs = fused[: n * 30].reshape(n, 30, -1).mean(axis=1)  # over 30-s epochs -> [T//30, D]
```

## Modalities

The model covers 8 modalities that share 5 tokenizers (K = number of RVQ levels; all codebooks have size 2048; 1 token/sec):

| modality | channel | signal | tokenizer | K | sample rate |
|---|---|---|---|---|---|
| `eeg_c3` | C3 | EEG | eeg-q8 | 8 | 128 Hz |
| `eeg_c4` | C4 | EEG | eeg-q8 | 8 | 128 Hz |
| `eog_e1` | E1 | EOG | eog-q8 | 8 | 128 Hz |
| `eog_e2` | E2 | EOG | eog-q8 | 8 | 128 Hz |
| `emg_chin` | Chin | EMG | emg-q8 | 8 | 128 Hz |
| `ecg` | ECG | ECG | ecg-q4 | 4 | 128 Hz |
| `resp_abd` | ABD | respiratory | resp-q4 | 4 | 32 Hz |
| `resp_thx` | THX | respiratory | resp-q4 | 4 | 32 Hz |

EEG/EOG channels are contralaterally referenced (e.g. C3-M2); Chin EMG is a bipolar derivation; ECG and respiratory channels are used directly. The embedding dimension is 768.

## Devices

CUDA and CPU are supported. **Apple Silicon (MPS) is not**: PyTorch's `flex_attention` has no Metal kernel, so on a Mac use `device="cpu"` (a 2-minute clip embeds in a few seconds; a full night takes ~1 minute).

## Format

`hypnos.safetensors` stores the weights under namespaced keys (`model/…`, `tok//…`), with the config as a JSON string in the file metadata. It is loaded with `safetensors` (no arbitrary-code unpickling).

## License

Released under the [MIT License](LICENSE).

## Citation

```bibtex
@online{carterNextTokenPredictionLearns2026,
  title = {Next-Token Prediction Learns Generalisable Representations of Sleep Physiology},
  author = {Carter, Jonathan F. and Tarassenko, Lionel},
  date = {2026-06-08},
  eprint = {2606.09605},
  eprinttype = {arXiv},
  eprintclass = {cs.AI},
  doi = {10.48550/arXiv.2606.09605},
  url = {http://arxiv.org/abs/2606.09605},
}
```
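
## Appendix: pooling sketch

The pooling recipe in the Pooling section can be wrapped in a small helper. The following is a minimal sketch, not part of the `hypnos` library: `pool_epochs` and its `epoch_s` parameter are hypothetical names, and the input is a synthetic stand-in for the dict that `embed_edf` returns. It makes two assumptions beyond the original recipe: arrays are truncated to the shortest modality in case lengths differ, and averaging is done in float32 rather than float16 to limit rounding error.

```python
import numpy as np


def pool_epochs(emb, epoch_s=30):
    """Average 1 Hz embeddings over modalities, then over fixed-length epochs.

    emb: dict {modality_name: array [T, D]} (hypothetical stand-in for embed_edf output).
    Returns a float32 array of shape [T // epoch_s, D].
    """
    if not emb:
        raise ValueError("no modalities present")
    # Assumption: if lengths differ, truncate every modality to the shortest one.
    t = min(a.shape[0] for a in emb.values())
    # Cast to float32 before averaging; the embeddings themselves are float16.
    stacked = np.stack([a[:t].astype(np.float32) for a in emb.values()])  # [M, T, D]
    fused = stacked.mean(axis=0)  # over modalities -> [T, D]
    n = t // epoch_s  # number of complete epochs; the trailing remainder is dropped
    d = fused.shape[1]
    return fused[: n * epoch_s].reshape(n, epoch_s, d).mean(axis=1)  # -> [n, D]


# Synthetic data: 95 seconds of 768-dimensional embeddings for two modalities.
rng = np.random.default_rng(0)
emb = {
    "eeg_c3": rng.standard_normal((95, 768)).astype(np.float16),
    "ecg": rng.standard_normal((95, 768)).astype(np.float16),
}
epochs = pool_epochs(emb)
print(epochs.shape)  # (3, 768): 95 s contains three complete 30-s epochs
```

Averaging over modalities weights every present channel equally; a task that depends mostly on one signal (for example ECG) may do better by pooling that modality alone.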