braindecode

deprecated

PierreGtch committed 29 days ago

Commit

71c8b43

verified ·

1 Parent(s): 7b53b04

Update README.md

Files changed (1)

README.md +114 -86

@@ -1,40 +1,79 @@
 ---
 license: mit
 ---
 ![sjepa](https://cdn-uploads.huggingface.co/production/uploads/646e0135174cc96d509582a6/DS-cXrFyxZ78hK48ft0iU.png)
-# Usage
-**Instantiate the Base Model**
 ```python
 from braindecode.models import SignalJEPA
-from huggingface_hub import hf_hub_download
-weights_path = hf_hub_download(repo_id="braindecode/SignalJEPA", filename="signal-jepa_16s-60_adeuwv4s.pth")
-model_state_dict = torch.load(weights_path)
-# Signal-related arguments
-# raw: mne.io.BaseRaw
-chs_info = raw.info["chs"]
-sfreq = raw.info["sfreq"]
-model = SignalJEPA(
-    sfreq=sfreq,
-    input_window_seconds=2,
-    chs_info=chs_info,
 )
-missing_keys, unexpected_keys = model.load_state_dict(model_state_dict, strict=False)
-assert unexpected_keys == []
-# The spatial positional encoder is initialized using the `chs_info`:
-assert set(missing_keys) == {"pos_encoder.pos_encoder_spat.weight"}
 ```
-**Instantiate the Downstream Architectures**
-Contrary to the base model, the downstream architectures are equipped with a classification head which is not pre-trained.
-Guetschel et al. (2024) [arXiv:2403.11772](https://arxiv.org/abs/2403.11772) introduce three downstream architectures:
-- a) Contextual downstream architecture
-- b) Post-local downstream architecture
-- c) Pre-local architecture
 ```python
 from braindecode.models import (
@@ -42,72 +81,61 @@ from braindecode.models import (
     SignalJEPA_PreLocal,
     SignalJEPA_PostLocal,
 )
-from huggingface_hub import hf_hub_download
-weights_path = hf_hub_download(repo_id="braindecode/SignalJEPA", filename="signal-jepa_16s-60_adeuwv4s.pth")
-model_state_dict = torch.load(weights_path)
-# Signal-related arguments
-# raw: mne.io.BaseRaw
-chs_info = raw.info["chs"]
-sfreq = raw.info["sfreq"]
-# The downstream architectures are equipped with an additional classification head
-# which was not pre-trained. It has the following new parameters:
-final_layer_keys = {
-    "final_layer.spat_conv.weight",
-    "final_layer.spat_conv.bias",
-    "final_layer.linear.weight",
-    "final_layer.linear.bias",
-}
-# a) Contextual downstream architecture
-#    ----------------------------------
-model = SignalJEPA_Contextual(
-    sfreq=sfreq,
-    input_window_seconds=2,
-    chs_info=chs_info,
-    n_outputs=1,
 )
-missing_keys, unexpected_keys = model.load_state_dict(model_state_dict, strict=False)
-assert unexpected_keys == []
-# The spatial positional encoder is initialized using the `chs_info`:
-assert set(missing_keys) == final_layer_keys | {"pos_encoder.pos_encoder_spat.weight"}
-# In the post-local (b) and pre-local (c) architectures, the transformer is discarded:
-FILTERED_model_state_dict = {
-    k: v for k, v in model_state_dict.items() if not any(k.startswith(pre) for pre in ["transformer.", "pos_encoder."])
-}
-# b) Post-local downstream architecture
-#    ----------------------------------
-model = SignalJEPA_PostLocal(
-    sfreq=sfreq,
-    input_window_seconds=2,
-    n_chans=len(chs_info),  # detailed channel info is not needed for this model
-    n_outputs=1,
 )
-missing_keys, unexpected_keys = model.load_state_dict(FILTERED_model_state_dict, strict=False)
-assert unexpected_keys == []
-assert set(missing_keys) == final_layer_keys
-# c) Pre-local architecture
-#    ----------------------
-model = SignalJEPA_PreLocal(
-    sfreq=sfreq,
-    input_window_seconds=2,
-    n_chans=len(chs_info),  # detailed channel info is not needed for this model
-    n_outputs=1,
 )
-missing_keys, unexpected_keys = model.load_state_dict(FILTERED_model_state_dict, strict=False)
-assert unexpected_keys == []
-assert set(missing_keys) == {
-    "spatial_conv.1.weight",
-    "spatial_conv.1.bias",
-    "final_layer.1.weight",
-    "final_layer.1.bias",
 }
-```

 ---
 license: mit
+library_name: braindecode
+tags:
+  - eeg
+  - foundation-model
+  - self-supervised
+  - signal-jepa
+pipeline_tag: feature-extraction
 ---
 ![sjepa](https://cdn-uploads.huggingface.co/production/uploads/646e0135174cc96d509582a6/DS-cXrFyxZ78hK48ft0iU.png)
+# Signal-JEPA
+Self-supervised pre-trained weights for the Signal-JEPA foundation model from
+[Guetschel et al. (2024)](https://arxiv.org/abs/2403.11772), packaged for use
+with [braindecode](https://braindecode.org/).
+The model was pre-trained on the Lee2019 dataset (62 EEG channels in the
+10-10 layout, sampled at 128 Hz). The repo ships the weights together with a
+`config.json` so they can be loaded in one line with
+`YourModelClass.from_pretrained(repo_id, ...)`.
+## Available checkpoints
+Two variants are published:
+| repo ID | channel embedding included | when to use |
+| --- | --- | --- |
+| [`braindecode/signal-jepa`](https://huggingface.co/braindecode/signal-jepa) | ✓ 62-row `_ChannelEmbedding` aligned with the pre-training layout | your recording channels are a **subset** (by name, case-insensitive) of the 62 pre-training channels — you want to reuse the learned spatial embeddings |
+| [`braindecode/signal-jepa_without-chans`](https://huggingface.co/braindecode/signal-jepa_without-chans) | ✗ only the SSL backbone (feature encoder + transformer) | your channels are **not** a subset of the pre-training set, or you prefer to train channel embeddings from scratch |
+If you are unsure, start with `braindecode/signal-jepa_without-chans`: it
+always works, regardless of your electrode layout.
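The choice between the two checkpoints comes down to a case-insensitive subset test on channel names. A minimal sketch of that test follows. `pick_checkpoint` is a hypothetical helper, not part of braindecode, and `PRETRAIN_CHS_EXAMPLE` is a made-up excerpt, not the real 62-channel pre-training list.

```python
def pick_checkpoint(recording_chs, pretrain_chs):
    """Return the repo ID suited to the recording's channel names.

    Names are compared case-insensitively, as described in the table above.
    """
    pretrain = {name.lower() for name in pretrain_chs}
    is_subset = all(name.lower() in pretrain for name in recording_chs)
    if is_subset:
        return "braindecode/signal-jepa"
    return "braindecode/signal-jepa_without-chans"


# Hypothetical excerpt for illustration only (assumption, not the real list).
PRETRAIN_CHS_EXAMPLE = ["Fp1", "Fp2", "Fz", "Cz", "Pz", "Oz"]

# Every channel is known, so the learned embeddings can be reused.
print(pick_checkpoint(["FZ", "cz"], PRETRAIN_CHS_EXAMPLE))
# One channel is unknown, so fall back to the backbone-only checkpoint.
print(pick_checkpoint(["Fz", "T9"], PRETRAIN_CHS_EXAMPLE))
```

In practice the recording names would come from `raw.ch_names`, and the reference list from the channel information shipped with the full checkpoint.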
+## Quick start
+### Base model (pre-training architecture)
+The base model outputs contextual features, not class predictions. Use it
+for downstream feature extraction or further self-supervised learning (SSL).
 ```python
 from braindecode.models import SignalJEPA
+# With the pre-trained channel embeddings (recording channels ⊂ pre-train set):
+model = SignalJEPA.from_pretrained("braindecode/signal-jepa")
+# Or: with your own channels, kept aligned to the pre-training embedding table
+# (`raw` is assumed to be an already-loaded mne.io.BaseRaw recording)
+model = SignalJEPA.from_pretrained(
+    "braindecode/signal-jepa",
+    chs_info=raw.info["chs"],           # subset of the 62 pre-training channels
+    channel_embedding="pretrain_aligned",
+)
+# Or: without pre-trained channel embeddings (any electrode layout):
+model = SignalJEPA.from_pretrained(
+    "braindecode/signal-jepa_without-chans",
+    chs_info=raw.info["chs"],
+    strict=False,  # the channel-embedding weight is intentionally missing
 )
 ```
+### Downstream architectures
+Three classification architectures are introduced in the paper:
+- **a) Contextual** — uses the full transformer encoder
+- **b) Post-local** — discards the transformer; spatial convolution after local features
+- **c) Pre-local** — discards the transformer; spatial convolution before local features
+All three add a freshly-initialized classification head on top of the SSL
+backbone. The head is **not** part of the checkpoint and will be trained from
+scratch during fine-tuning; pass `strict=False` so `from_pretrained` does not
+complain about those missing keys.
 ```python
 from braindecode.models import (
     SignalJEPA_Contextual,
     SignalJEPA_PreLocal,
     SignalJEPA_PostLocal,
 )
+# a) Contextual — keeps the transformer
+model = SignalJEPA_Contextual.from_pretrained(
+    "braindecode/signal-jepa",          # or "braindecode/signal-jepa_without-chans"
+    n_times=256,                         # e.g. 2 s at 128 Hz
+    n_outputs=4,
+    strict=False,                        # ignore un-trained classification head
 )
+# b) Post-local — transformer discarded
+model = SignalJEPA_PostLocal.from_pretrained(
+    "braindecode/signal-jepa_without-chans",
+    n_chans=19,
+    n_times=256,
+    n_outputs=4,
+    strict=False,
 )
+# c) Pre-local — transformer discarded
+model = SignalJEPA_PreLocal.from_pretrained(
+    "braindecode/signal-jepa_without-chans",
+    n_chans=19,
+    n_times=256,
+    n_outputs=4,
+    strict=False,
 )
+```
+See the braindecode tutorial
+[Fine-tuning a Foundation Model (Signal-JEPA)](https://braindecode.org/stable/auto_examples/advanced_training/plot_finetune_foundation_model.html)
+for a complete example including layer freezing and training with
+braindecode's skorch-based `EEGClassifier`.
+## Channel embedding modes
+`SignalJEPA` and `SignalJEPA_Contextual` accept a `channel_embedding` kwarg:
+- `"scratch"` (default): the `_ChannelEmbedding` table has one row per user
+  channel, initialized from `chs_info`. Compatible with the
+  `without-chans` checkpoint.
+- `"pretrain_aligned"`: the table has 62 rows in the pre-training order, and
+  `forward` indexes into the subset matching your `chs_info` (matched by
+  channel name, case-insensitive). Compatible with the full checkpoint.
+`from_pretrained` picks the right mode automatically based on the checkpoint's
+`config.json`; override with the `channel_embedding=` kwarg if needed.
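The `"pretrain_aligned"` lookup can be pictured as mapping each recording channel to its row in the pre-training table. The sketch below is an illustration under assumptions, not braindecode's implementation: `aligned_rows` is a hypothetical helper and `PRETRAIN_ORDER_EXAMPLE` is a made-up six-channel stand-in for the real 62-row order.

```python
def aligned_rows(recording_chs, pretrain_chs):
    """Map each recording channel to its row in the pre-training table.

    Matching is by channel name, case-insensitive. Channels that are not
    in the pre-training set raise a ValueError.
    """
    row_of = {name.lower(): row for row, name in enumerate(pretrain_chs)}
    unknown = [name for name in recording_chs if name.lower() not in row_of]
    if unknown:
        raise ValueError(f"channels not in the pre-training set: {unknown}")
    return [row_of[name.lower()] for name in recording_chs]


# Hypothetical pre-training order for illustration only (assumption).
PRETRAIN_ORDER_EXAMPLE = ["Fp1", "Fp2", "Fz", "Cz", "Pz", "Oz"]

rows = aligned_rows(["cz", "FP2", "Oz"], PRETRAIN_ORDER_EXAMPLE)
print(rows)  # [3, 1, 5]
```

The returned rows keep the order of the recording's channels, so the selected embeddings line up with the input tensor's channel axis.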
+## Citation
+```bibtex
+@article{guetschel2024sjepa,
+  title   = {S-JEPA: towards seamless cross-dataset transfer
+             through dynamic spatial attention},
+  author  = {Guetschel, Pierre and Moreau, Thomas and Tangermann, Michael},
+  journal = {arXiv preprint arXiv:2403.11772},
+  year    = {2024},
 }
+```