	---
	license: mit
	tags:
	- sleep
	- eeg
	- ecg
	- eog
	- emg
	- respiratory
	- physiological-signals
	- foundation-model
	- time-series
	pipeline_tag: feature-extraction
	---

	# Hypnos

	Hypnos is an autoregressive RQ-Transformer trained via next-token prediction on tokenized
	streams of physiological sensor data. It produces general-purpose 1 Hz embeddings of sleep
	physiology for downstream tasks (sleep staging, arousal/event detection, etc.).

	This repo holds the released model as a single weight-only `safetensors` bundle: the
	RQ-Transformer and all 5 tokenizers, plus the config (model + tokenizer construction
	kwargs, modality layout) in the file metadata — so loading is fully self-contained.

	## Usage

	Use the `hypnos` inference library:

	```python
	from hypnos.embedding import embed_edf

	emb = embed_edf("recording.edf") # defaults to this repo
	# emb: dict {modality_name: np.ndarray [n_seconds, embed_dim] float16}
	# e.g. emb["eeg_c3"], emb["ecg"], ... — one vector per second, per present modality
	```

	Embeddings are returned per modality at the model's native 1 Hz cadence (one vector
	per second). Only modalities present in the recording appear in the dict. The notch filter
	defaults to 50 Hz; for recordings made on 60 Hz mains (e.g. in the US), pass `notch_freq=60.0`.

	### Pooling

	The 1 Hz per-modality embeddings are the model's raw output; pool them however your task
	needs. For example, to get a single embedding per 30-second sleep epoch, average over
	modalities and then over time:

	```python
	import numpy as np
	from hypnos.embedding import embed_edf

	emb = embed_edf("recording.edf")
	fused = np.mean(list(emb.values()), axis=0) # over modalities -> [T, D]
	n = fused.shape[0] // 30 # complete 30-s epochs; a trailing partial epoch is dropped
	epochs = fused[: n * 30].reshape(n, 30, -1).mean(axis=1) # over 30-s epochs -> [T//30, D]
	```
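Averaging over modalities discards which channel a feature came from. As an alternative, the sketch below pools each modality per epoch and concatenates the results in a fixed order. It runs on a synthetic stand-in for the `embed_edf` output; the `concat_epochs` helper and the zero-fill for missing modalities are illustrative assumptions, not part of the `hypnos` library.

```python
import numpy as np

# Modality names and embedding size as listed in this card.
MODALITIES = ["eeg_c3", "eeg_c4", "eog_e1", "eog_e2",
              "emg_chin", "ecg", "resp_abd", "resp_thx"]
EMBED_DIM = 768


def concat_epochs(emb, epoch_s=30):
    """Mean-pool each modality over epochs, then concatenate in a fixed order.

    Missing modalities are zero-filled (an illustrative choice, not hypnos behaviour).
    Returns float32 of shape [n_epochs, len(MODALITIES) * EMBED_DIM].
    """
    n_seconds = min(v.shape[0] for v in emb.values())
    n = n_seconds // epoch_s  # complete epochs only
    blocks = []
    for name in MODALITIES:
        if name in emb:
            x = emb[name][: n * epoch_s].astype(np.float32)
            blocks.append(x.reshape(n, epoch_s, -1).mean(axis=1))
        else:
            blocks.append(np.zeros((n, EMBED_DIM), dtype=np.float32))
    return np.concatenate(blocks, axis=1)


# Synthetic stand-in for embed_edf output: 95 s, only two modalities present.
rng = np.random.default_rng(0)
fake = {m: rng.standard_normal((95, EMBED_DIM)).astype(np.float16)
        for m in ("eeg_c3", "ecg")}
features = concat_epochs(fake)
print(features.shape)  # (3, 6144)
```

The fixed order keeps feature columns aligned across recordings with different channel sets, at the cost of a wider feature vector than the averaged version.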

	## Modalities

	8 modalities share 5 tokenizers. K is the number of residual vector quantization (RVQ)
	levels; every codebook has 2048 entries, and each tokenizer emits 1 token per second:

	| modality | channel | signal | tokenizer | K | sample rate |
	|---|---|---|---|---|---|
	| `eeg_c3` | C3 | EEG | eeg-q8 | 8 | 128 Hz |
	| `eeg_c4` | C4 | EEG | eeg-q8 | 8 | 128 Hz |
	| `eog_e1` | E1 | EOG | eog-q8 | 8 | 128 Hz |
	| `eog_e2` | E2 | EOG | eog-q8 | 8 | 128 Hz |
	| `emg_chin` | Chin | EMG | emg-q8 | 8 | 128 Hz |
	| `ecg` | ECG | ECG | ecg-q4 | 4 | 128 Hz |
	| `resp_abd` | ABD | respiratory | resp-q4 | 4 | 32 Hz |
	| `resp_thx` | THX | respiratory | resp-q4 | 4 | 32 Hz |

	EEG/EOG channels are contralaterally referenced (e.g. C3-M2); Chin EMG is a bipolar
	derivation; ECG/respiratory are used directly. Embedding dimension is 768.
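The arithmetic implied by the table can be checked with a short sketch. It assumes that each 1-second token is a stack of K codebook indices (one per RVQ level); the `LAYOUT` list is simply the table transcribed by hand, not an object exposed by the `hypnos` library.

```python
CODEBOOK_SIZE = 2048

# (modality, tokenizer, RVQ levels K, sample rate in Hz), copied from the table.
LAYOUT = [
    ("eeg_c3", "eeg-q8", 8, 128),
    ("eeg_c4", "eeg-q8", 8, 128),
    ("eog_e1", "eog-q8", 8, 128),
    ("eog_e2", "eog-q8", 8, 128),
    ("emg_chin", "emg-q8", 8, 128),
    ("ecg", "ecg-q4", 4, 128),
    ("resp_abd", "resp-q4", 4, 32),
    ("resp_thx", "resp-q4", 4, 32),
]

tokenizers = sorted({tok for _, tok, _, _ in LAYOUT})
bits_per_code = (CODEBOOK_SIZE - 1).bit_length()      # 2048 entries -> 11 bits
codes_per_second = sum(k for _, _, k, _ in LAYOUT)    # all modalities present
samples_per_second = sum(sr for _, _, _, sr in LAYOUT)

print(len(LAYOUT), "modalities,", len(tokenizers), "tokenizers")
print(codes_per_second, "codes/s,", codes_per_second * bits_per_code, "bits/s")
print(samples_per_second, "input samples/s")
```

Under that assumption, a recording with all 8 modalities is described by 52 code indices per second, compared with 832 raw samples per second at the listed sample rates.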

	## Devices

	CUDA and CPU are supported. Apple Silicon (MPS) is not — PyTorch's `flex_attention` has no
	Metal kernel, so on a Mac use `device="cpu"` (a 2-minute clip embeds in a few seconds; a full
	night takes ~1 minute).
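A minimal sketch of choosing a device along these lines: prefer CUDA when PyTorch reports it, otherwise fall back to CPU, and never select MPS. The `pick_device` helper is hypothetical, and the commented call assumes `embed_edf` accepts the `device` keyword mentioned above.

```python
def pick_device():
    """Return "cuda" when PyTorch reports a usable GPU, otherwise "cpu".

    MPS is deliberately never returned, since it is not supported.
    """
    try:
        import torch
    except ImportError:
        # Without PyTorch there is nothing to query; default to CPU.
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


device = pick_device()
print(device)

# Assumed usage, not verified against the hypnos API:
# emb = embed_edf("recording.edf", device=device)
```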

	## Format

	`hypnos.safetensors` — weights under namespaced keys (`model/…`, `tok/<name>/…`)
	with the config as a JSON string in the file metadata. Loaded with `safetensors` (no
	arbitrary-code unpickling).
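To see what the bundle contains without loading any weights, the file header can be read directly. The sketch below relies only on the public safetensors layout (an 8-byte little-endian header length, followed by a JSON header whose `__metadata__` entry holds string metadata). The helper names are illustrative, and the metadata key that holds the config is not stated in this card, so the code returns the whole metadata dict rather than guessing it.

```python
import json
import struct
from collections import Counter


def read_safetensors_header(path):
    """Return (metadata, tensor_names) from a safetensors file without loading weights."""
    with open(path, "rb") as f:
        (header_len,) = struct.unpack("<Q", f.read(8))  # u64, little-endian
        header = json.loads(f.read(header_len).decode("utf-8"))
    metadata = header.pop("__metadata__", {})  # str -> str map, may be absent
    return metadata, sorted(header)


def count_by_namespace(tensor_names):
    """Count tensors per namespace: 'model' or 'tok/<name>'."""
    counts = Counter()
    for name in tensor_names:
        parts = name.split("/")
        prefix = "/".join(parts[:2]) if parts[0] == "tok" else parts[0]
        counts[prefix] += 1
    return dict(counts)
```

Calling `read_safetensors_header("hypnos.safetensors")` on a locally downloaded copy should return the metadata dict and the tensor names; `count_by_namespace` then summarises how many tensors belong to the model and to each tokenizer.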

	## License

	Released under the [MIT License](LICENSE).

	## Citation

	```bibtex
	@online{carterNextTokenPredictionLearns2026,
	title = {Next-Token Prediction Learns Generalisable Representations of Sleep Physiology},
	author = {Carter, Jonathan F. and Tarassenko, Lionel},
	date = {2026-06-08},
	eprint = {2606.09605},
	eprinttype = {arXiv},
	eprintclass = {cs.AI},
	doi = {10.48550/arXiv.2606.09605},
	url = {http://arxiv.org/abs/2606.09605},
	}
	```