---
license: mit
tags:
- sleep
- eeg
- ecg
- eog
- emg
- respiratory
- physiological-signals
- foundation-model
- time-series
pipeline_tag: feature-extraction
---
# Hypnos
Hypnos is an autoregressive RQ-Transformer trained via next-token prediction on tokenized
streams of physiological sensor data. It produces general-purpose **1 Hz embeddings** of sleep
physiology for downstream tasks (sleep staging, arousal/event detection, etc.).
This repo holds the released model as a single weights-only `safetensors` bundle containing
the RQ-Transformer **and** all 5 tokenizers. The config (model and tokenizer construction
kwargs, modality layout) is stored in the file metadata, so loading is fully self-contained.
## Usage
Use the `hypnos` inference library:
```python
from hypnos.embedding import embed_edf
emb = embed_edf("recording.edf") # defaults to this repo
# emb: dict {modality_name: np.ndarray [n_seconds, embed_dim] float16}
# e.g. emb["eeg_c3"], emb["ecg"], ... — one vector per second, per present modality
```
Embeddings are returned **per modality** at the model's native **1 Hz** cadence (one vector
per second). Only modalities present in the recording appear in the dict. For recordings made
on 60 Hz mains (e.g. in the US), pass `notch_freq=60.0`; the default is 50 Hz.
### Pooling
The 1 Hz per-modality output is returned unpooled; pool it however your task needs. For
example, to get a single embedding per 30-second sleep epoch, average over modalities and time:
```python
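import numpy as np
from hypnos.embedding import embed_edf

emb = embed_edf("recording.edf")
# Average over modalities -> [T, D]; accumulate in float32 because the embeddings are float16.
fused = np.mean(list(emb.values()), axis=0, dtype=np.float32)
n = fused.shape[0] // 30  # number of complete 30-s epochs (a trailing partial epoch is dropped)
epochs = fused[: n * 30].reshape(n, 30, -1).mean(axis=1)  # over 30-s epochs -> [T//30, D]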
```
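The snippet above averages the modalities together. If a downstream task benefits from keeping them separate, one alternative is to pool each modality over time and concatenate the results. The following is a minimal sketch on synthetic data shaped like the documented output; `pool_epochs` is a hypothetical helper, not part of the `hypnos` library, and the float32 cast is a precaution against float16 rounding during averaging.

```python
import numpy as np


def pool_epochs(emb, epoch_s=30):
    """Mean-pool each modality over fixed-length epochs, then concatenate.

    emb: dict {modality_name: array [n_seconds, embed_dim]}.
    Returns (names, pooled), where pooled has shape
    [n_epochs, n_modalities * embed_dim].
    """
    names = sorted(emb)  # fixed modality order keeps feature columns stable
    n_seconds = min(emb[name].shape[0] for name in names)
    n_epochs = n_seconds // epoch_s  # a trailing partial epoch is dropped
    pooled = []
    for name in names:
        x = emb[name][: n_epochs * epoch_s].astype(np.float32)  # average in float32
        pooled.append(x.reshape(n_epochs, epoch_s, -1).mean(axis=1))
    return names, np.concatenate(pooled, axis=1)


# Synthetic stand-in for the embed_edf output: 95 seconds, 2 modalities, 768 dims.
rng = np.random.default_rng(0)
fake_emb = {
    "eeg_c3": rng.standard_normal((95, 768)).astype(np.float16),
    "ecg": rng.standard_normal((95, 768)).astype(np.float16),
}
names, epochs = pool_epochs(fake_emb)
print(names, epochs.shape)
```

Concatenation makes the feature width depend on which modalities are present, so it suits datasets with a consistent montage; averaging keeps the width fixed at the embedding dimension.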
## Modalities
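The model covers 8 modalities sharing 5 tokenizers. K is the number of residual vector quantization (RVQ) levels; every codebook has 2048 entries, and each tokenizer emits 1 token per second: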
| modality | channel | signal | tokenizer | K | sample rate |
|---|---|---|---|---|---|
| `eeg_c3` | C3 | EEG | eeg-q8 | 8 | 128 Hz |
| `eeg_c4` | C4 | EEG | eeg-q8 | 8 | 128 Hz |
| `eog_e1` | E1 | EOG | eog-q8 | 8 | 128 Hz |
| `eog_e2` | E2 | EOG | eog-q8 | 8 | 128 Hz |
| `emg_chin` | Chin | EMG | emg-q8 | 8 | 128 Hz |
| `ecg` | ECG | ECG | ecg-q4 | 4 | 128 Hz |
| `resp_abd` | ABD | respiratory | resp-q4 | 4 | 32 Hz |
| `resp_thx` | THX | respiratory | resp-q4 | 4 | 32 Hz |
EEG/EOG channels are contralaterally referenced (e.g. C3-M2); Chin EMG is a bipolar
derivation; ECG/respiratory are used directly. Embedding dimension is 768.
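The table's bookkeeping can be checked with a few lines of arithmetic. This sketch assumes that each per-second token is a stack of K codebook indices, one per RVQ level, which is the usual reading of RVQ but is not spelled out here; the `MODALITIES` dict is an illustrative transcription of the table, not an object exposed by the library.

```python
import math

CODEBOOK_SIZE = 2048  # every codebook has 2048 entries

# modality -> (tokenizer, K RVQ levels, sample rate in Hz), copied from the table
MODALITIES = {
    "eeg_c3": ("eeg-q8", 8, 128),
    "eeg_c4": ("eeg-q8", 8, 128),
    "eog_e1": ("eog-q8", 8, 128),
    "eog_e2": ("eog-q8", 8, 128),
    "emg_chin": ("emg-q8", 8, 128),
    "ecg": ("ecg-q4", 4, 128),
    "resp_abd": ("resp-q4", 4, 32),
    "resp_thx": ("resp-q4", 4, 32),
}

tokenizers = {tok for tok, _, _ in MODALITIES.values()}
bits_per_code = math.log2(CODEBOOK_SIZE)  # one index into a 2048-entry codebook
codes_per_second = sum(k for _, k, _ in MODALITIES.values())

for name, (tok, k, rate) in MODALITIES.items():
    # One second of signal (rate samples) is summarized by K code indices.
    print(f"{name}: {rate} samples -> {k} codes ({k * bits_per_code:.0f} bits) per second")
print(len(MODALITIES), "modalities,", len(tokenizers), "tokenizers,", codes_per_second, "codes/s")
```

Under that assumption, a recording with all 8 modalities present yields 52 code indices per second, each carrying 11 bits.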
## Devices
CUDA and CPU are supported. **Apple Silicon (MPS) is not** — PyTorch's `flex_attention` has no
Metal kernel, so on a Mac use `device="cpu"` (a 2-minute clip embeds in a few seconds; a full
night takes ~1 minute).
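A device choice consistent with the note above might look like the following sketch. `pick_device` is a hypothetical helper, not part of the `hypnos` library; it deliberately never returns `"mps"` and falls back to CPU when PyTorch is not importable.

```python
def pick_device():
    """Return "cuda" when PyTorch reports a usable CUDA device, otherwise "cpu"."""
    try:
        import torch
    except ImportError:
        return "cpu"  # no PyTorch in this environment; CPU is the safe default
    # MPS is intentionally skipped: the text above says it is unsupported.
    return "cuda" if torch.cuda.is_available() else "cpu"


device = pick_device()
print(device)
```

The resulting string would then be supplied as the `device` argument mentioned above.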
## Format
`hypnos.safetensors` — weights under namespaced keys (`model/…`, `tok/<name>/…`)
with the config as a JSON string in the file metadata. Loaded with `safetensors` (no
arbitrary-code unpickling).
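Because the config sits in the file header, it can be inspected without loading any weights. A safetensors file begins with an 8-byte little-endian header length followed by a JSON header, whose optional `__metadata__` entry is a string-to-string map. The sketch below reads that header using only the standard library and runs against a tiny stand-in file; the metadata key `config` and the field `embed_dim` are assumptions made for illustration, since the actual key names in `hypnos.safetensors` are not stated here.

```python
import json
import os
import struct
import tempfile
from collections import Counter


def read_safetensors_header(path):
    """Return (metadata, tensor_index) from a safetensors file without loading weights."""
    with open(path, "rb") as f:
        (header_len,) = struct.unpack("<Q", f.read(8))  # little-endian uint64
        header = json.loads(f.read(header_len))
    metadata = header.pop("__metadata__", {})  # string -> string map
    return metadata, header


# Build a tiny stand-in file so the sketch runs without the real weights.
# Tensor names mimic the namespacing above; "config" and "embed_dim" are assumed names.
payload = bytes(8)  # room for two float32 values
header = {
    "__metadata__": {"config": json.dumps({"embed_dim": 768})},
    "model/w": {"dtype": "F32", "shape": [1], "data_offsets": [0, 4]},
    "tok/eeg-q8/w": {"dtype": "F32", "shape": [1], "data_offsets": [4, 8]},
}
header_bytes = json.dumps(header).encode("utf-8")
with tempfile.NamedTemporaryFile(suffix=".safetensors", delete=False) as tmp:
    tmp.write(struct.pack("<Q", len(header_bytes)) + header_bytes + payload)

metadata, tensors = read_safetensors_header(tmp.name)
os.remove(tmp.name)

config = json.loads(metadata["config"])  # the config is stored as a JSON string
prefixes = Counter(name.split("/")[0] for name in tensors)
print(config, dict(prefixes))
```

Pointing `read_safetensors_header` at the real file and printing `metadata.keys()` would show which key actually holds the config.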
## License
Released under the [MIT License](LICENSE).
## Citation
```bibtex
@online{carterNextTokenPredictionLearns2026,
title = {Next-Token Prediction Learns Generalisable Representations of Sleep Physiology},
author = {Carter, Jonathan F. and Tarassenko, Lionel},
date = {2026-06-08},
eprint = {2606.09605},
eprinttype = {arXiv},
eprintclass = {cs.AI},
doi = {10.48550/arXiv.2606.09605},
url = {http://arxiv.org/abs/2606.09605},
}
```