| --- |
| license: mit |
| tags: |
| - sleep |
| - eeg |
| - ecg |
| - eog |
| - emg |
| - respiratory |
| - physiological-signals |
| - foundation-model |
| - time-series |
| pipeline_tag: feature-extraction |
| --- |
| |
| # Hypnos |
|
|
| Hypnos is an autoregressive RQ-Transformer trained via next-token prediction on tokenized |
| streams of physiological sensor data. It produces general-purpose **1 Hz embeddings** of sleep |
| physiology for downstream tasks (sleep staging, arousal/event detection, etc.). |
|
|
| This repo holds the released model as a single weight-only `safetensors` bundle: the |
| RQ-Transformer **and** all 5 tokenizers, plus the config (model + tokenizer construction |
| kwargs, modality layout) in the file metadata, so loading is fully self-contained. |
|
|
| ## Usage |
|
|
| Use the `hypnos` inference library: |
|
|
| ```python |
| from hypnos.embedding import embed_edf |
| |
| emb = embed_edf("recording.edf") # defaults to this repo |
| # emb: dict {modality_name: np.ndarray [n_seconds, embed_dim] float16} |
| # e.g. emb["eeg_c3"], emb["ecg"], ... (one vector per second, per present modality) |
| ``` |
|
|
| Embeddings are returned **per modality** at the model's native **1 Hz** cadence (one vector |
| per second). Only modalities present in the recording appear in the dict. For recordings |
| made on 60 Hz mains power (e.g. in the US), pass `notch_freq=60.0`; the default is 50 Hz. |
|
|
| ### Pooling |
|
|
| The 1 Hz per-modality output is unpooled; pool it however your task needs. For example, to get |
| a single embedding per 30-second sleep epoch, average over modalities and then over time: |
|
|
| ```python |
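| import numpy as np |
| from hypnos.embedding import embed_edf |
| |
| emb = embed_edf("recording.edf") |
| # Assumes every present modality has the same [T, D] shape |
| fused = np.mean(list(emb.values()), axis=0) # over modalities -> [T, D] |
| n = fused.shape[0] // 30 # complete 30-s epochs; a trailing partial epoch is dropped |
| epochs = fused[: n * 30].reshape(n, 30, -1).mean(axis=1) # over 30-s epochs -> [T//30, D] |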
| ``` |
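
The same pooling can be wrapped in a small helper that averages in float32 and makes the handling of a trailing partial epoch explicit. This is a minimal sketch, not part of the `hypnos` library: `pool_epochs` and its `epoch_s` parameter are hypothetical names, and the input is a synthetic stand-in for the dict returned by `embed_edf`. It assumes every modality has the same `[T, D]` shape.

```python
import numpy as np

def pool_epochs(emb, epoch_s=30):
    """Average 1 Hz embeddings over modalities, then over fixed-length epochs.

    emb: dict {modality_name: array [T, D]}, all with the same shape (assumed).
    Returns a float32 array [T // epoch_s, D]; a trailing partial epoch is dropped.
    """
    # Stack to [M, T, D] in float32 so float16 inputs are averaged at higher precision
    stacked = np.stack([v.astype(np.float32) for v in emb.values()])
    fused = stacked.mean(axis=0)                   # over modalities -> [T, D]
    n = fused.shape[0] // epoch_s                  # number of complete epochs
    return fused[: n * epoch_s].reshape(n, epoch_s, -1).mean(axis=1)

# Synthetic stand-in for embed_edf output: 2 modalities, 95 seconds, 768 dims
rng = np.random.default_rng(0)
fake = {name: rng.standard_normal((95, 768)).astype(np.float16)
        for name in ("eeg_c3", "ecg")}
epochs = pool_epochs(fake)
print(epochs.shape)  # (3, 768)
```

Averaging over modalities first and time second gives every modality equal weight; concatenating modalities along the feature axis instead would preserve per-modality information at the cost of a larger vector.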
|
|
| ## Modalities |
|
|
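| The model covers 8 modalities that share 5 tokenizers. K is the number of residual vector quantization (RVQ) levels; every codebook has 2048 entries, and each tokenizer emits 1 token per second: |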
|
|
| | modality | channel | signal | tokenizer | K | sample rate | |
| |---|---|---|---|---|---| |
| | `eeg_c3` | C3 | EEG | eeg-q8 | 8 | 128 Hz | |
| | `eeg_c4` | C4 | EEG | eeg-q8 | 8 | 128 Hz | |
| | `eog_e1` | E1 | EOG | eog-q8 | 8 | 128 Hz | |
| | `eog_e2` | E2 | EOG | eog-q8 | 8 | 128 Hz | |
| | `emg_chin` | Chin | EMG | emg-q8 | 8 | 128 Hz | |
| | `ecg` | ECG | ECG | ecg-q4 | 4 | 128 Hz | |
| | `resp_abd` | ABD | respiratory | resp-q4 | 4 | 32 Hz | |
| | `resp_thx` | THX | respiratory | resp-q4 | 4 | 32 Hz | |
|
|
| EEG/EOG channels are contralaterally referenced (e.g. C3-M2); Chin EMG is a bipolar |
| derivation; ECG and respiratory signals are used directly. The embedding dimension is 768. |
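
As a quick sanity check on the table, the arithmetic below works out the per-second token budget. It is a sketch under two assumptions that go beyond the text: that each 1-second token consists of K codebook indices (one per RVQ level), and that each tokenizer consumes exactly one second of samples per token. The `MODALITIES` dict simply transcribes the table.

```python
import math

# (K RVQ levels, sample rate in Hz), transcribed from the table above
MODALITIES = {
    "eeg_c3": (8, 128), "eeg_c4": (8, 128),
    "eog_e1": (8, 128), "eog_e2": (8, 128),
    "emg_chin": (8, 128), "ecg": (4, 128),
    "resp_abd": (4, 32), "resp_thx": (4, 32),
}
CODEBOOK_SIZE = 2048

# Bits needed to index one codebook entry
bits_per_code = math.log2(CODEBOOK_SIZE)                        # 11.0
# Codebook indices per second with all 8 modalities present (assumes K per token)
codes_per_second = sum(k for k, _ in MODALITIES.values())       # 5*8 + 3*4 = 52
# Raw input samples consumed per second across all modalities
samples_per_second = sum(sr for _, sr in MODALITIES.values())   # 6*128 + 2*32 = 832
# Codebook indices for an 8-hour recording
codes_per_night = codes_per_second * 8 * 3600                   # 1,497,600

print(bits_per_code, codes_per_second, samples_per_second, codes_per_night)
```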
|
|
| ## Devices |
|
|
| CUDA and CPU are supported. **Apple Silicon (MPS) is not**: PyTorch's `flex_attention` has no |
| Metal kernel, so on a Mac use `device="cpu"` (a 2-minute clip embeds in a few seconds; a full |
| night takes ~1 minute). |
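
A device choice consistent with the above might be sketched as follows. `pick_device` is a hypothetical helper, not part of `hypnos`; it relies only on `torch.cuda.is_available()` and deliberately never returns `"mps"`. That `embed_edf` accepts a `device` keyword is an assumption based on the `device="cpu"` hint above.

```python
def pick_device():
    """Return "cuda" when a CUDA GPU is usable, otherwise "cpu" (never "mps")."""
    try:
        import torch
    except ImportError:
        # Without PyTorch there is nothing to probe; report the CPU fallback
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"

device = pick_device()
print(device)
# Assumed usage: emb = embed_edf("recording.edf", device=device)
```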
|
|
| ## Format |
|
|
| `hypnos.safetensors`: weights under namespaced keys (`model/…`, `tok/<name>/…`), |
| with the config as a JSON string in the file metadata. It is loaded with `safetensors`, |
| which involves no arbitrary-code unpickling. |
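
To see what storing the config in the file metadata means in practice, the header can be inspected with the standard library alone. A safetensors file starts with an 8-byte little-endian header length, followed by a JSON header whose optional `__metadata__` entry maps strings to strings. The sketch below writes a tiny stand-in file and reads it back; the metadata key `config`, its contents, and the tensor names are hypothetical, since the actual key names inside `hypnos.safetensors` are not listed here.

```python
import json
import os
import struct
import tempfile

def read_safetensors_header(path):
    """Return (metadata dict, sorted tensor names) without loading any weights."""
    with open(path, "rb") as f:
        (n,) = struct.unpack("<Q", f.read(8))   # header length in bytes
        header = json.loads(f.read(n))
    metadata = header.pop("__metadata__", {})
    return metadata, sorted(header)

# Build a minimal stand-in file: one float32 value per namespaced key
header = {
    "__metadata__": {"config": json.dumps({"embed_dim": 768})},  # hypothetical key
    "model/w": {"dtype": "F32", "shape": [1], "data_offsets": [0, 4]},
    "tok/eeg-q8/w": {"dtype": "F32", "shape": [1], "data_offsets": [4, 8]},
}
blob = json.dumps(header).encode("utf-8")
path = os.path.join(tempfile.mkdtemp(), "demo.safetensors")
with open(path, "wb") as f:
    f.write(struct.pack("<Q", len(blob)) + blob + struct.pack("<2f", 0.0, 1.0))

metadata, names = read_safetensors_header(path)
config = json.loads(metadata["config"])   # the config is stored as a JSON string
print(names, config)
```

With the `safetensors` package installed, opening the file with `safe_open` and calling `.metadata()` on the handle returns the same metadata dict.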
|
|
| ## License |
|
|
| Released under the [MIT License](LICENSE). |
|
|
| ## Citation |
|
|
| ```bibtex |
| @online{carterNextTokenPredictionLearns2026, |
| title = {Next-Token Prediction Learns Generalisable Representations of Sleep Physiology}, |
| author = {Carter, Jonathan F. and Tarassenko, Lionel}, |
| date = {2026-06-08}, |
| eprint = {2606.09605}, |
| eprinttype = {arXiv}, |
| eprintclass = {cs.AI}, |
| doi = {10.48550/arXiv.2606.09605}, |
| url = {http://arxiv.org/abs/2606.09605}, |
| } |
| ``` |
|
|