| --- |
| license: bsd-3-clause |
| library_name: braindecode |
| pipeline_tag: feature-extraction |
| tags: |
| - eeg |
| - biosignal |
| - pytorch |
| - neuroscience |
| - braindecode |
| - convolutional |
| --- |
| |
| # BrainModule |
|
|
| BrainModule from Défossez et al. (2023), also known as SimpleConv. |
|
|
| > **Architecture-only repository.** This repo documents the |
| > `braindecode.models.BrainModule` class. **No pretrained weights are |
| > distributed here** — instantiate the model and train it on your own |
| > data, or fine-tune from a published foundation-model checkpoint |
| > separately. |
|
|
| ## Quick start |
|
|
| ```bash |
| pip install braindecode |
| ``` |
|
|
| ```python |
| from braindecode.models import BrainModule |
| |
| model = BrainModule( |
|     n_chans=22, |
|     sfreq=250, |
|     input_window_seconds=4.0, |
|     n_outputs=4, |
| ) |
| ``` |
|
|
| The signal-shape arguments above are example defaults — adjust them |
| to match your recording. |
|
|
| ## Documentation |
|
|
| - Full API reference (parameters, references, architecture figure): |
| <https://braindecode.org/stable/generated/braindecode.models.BrainModule.html> |
| - Interactive browser with live instantiation: |
| <https://huggingface.co/spaces/braindecode/model-explorer> |
| - Source on GitHub: <https://github.com/braindecode/braindecode/blob/master/braindecode/models/brainmodule.py#L25> |
|
|
| ## Architecture description |
|
|
| The block below is the rendered class docstring (parameters, |
| references, architecture figure where available). |
|
|
| <div class='bd-doc'><main> |
| <p>BrainModule from [brainmagick]_, also known as SimpleConv.</p> |
| <blockquote> |
| <p>A dilated convolutional encoder for EEG decoding, using residual |
| connections and optional GLU gating for improved expressivity.</p> |
| </blockquote> |
| <span style="display:inline-block;padding:2px 8px;border-radius:4px;background:#5cb85c;color:white;font-size:11px;font-weight:600;margin-right:4px;">Convolution</span> |
| |
| |
| |
| .. figure:: ../_static/model/simpleconv.png |
| :align: center |
| :alt: BrainModule Architecture |
| :width: 500px |
| |
| Figure adapted from Extended Data Fig. 4 of [brainmagick]_ to highlight only the model part. |
| Architecture of the brain module, used to process the brain recordings. |
| For each layer, the authors note first the number of output channels, while the number of time steps |
| is constant throughout the layers. The model is composed of a spatial attention layer, |
| then a 1x1 convolution without activation. A 'Subject Layer' is selected based on the subject index s; |
| it consists of a 1x1 convolution learnt only for that subject, with no activation. Then, |
| the authors apply five convolutional blocks made of three convolutions. The first |
| two use residual skip connections and increasing dilation, followed by a BatchNorm layer and a |
| GELU activation. The third convolution is not residual, and uses a GLU activation |
| (which halves the number of channels) and no normalization. |
| Finally, the authors apply two 1x1 convolutions with a GELU in between. |
| |
| The BrainModule (also referred to as SimpleConv) is a deep dilated |
| convolutional encoder specifically designed to decode perceived speech from |
| non-invasive brain recordings like EEG and MEG. It is engineered to address |
| the high noise levels and inter-individual variability inherent in |
| non-invasive neuroimaging by using a single architecture trained across |
| large cohorts while accommodating participant-specific differences. |
|
|
| .. rubric:: Architecture Overview |
|
|
| The BrainModule integrates three primary mechanisms to align brain activity |
| with deep speech representations: |
|
|
| 1. **Spatial-temporal feature extraction.** The model uses a dedicated |
| spatial attention layer to remap sensor data based on physical |
| locations, followed by temporal processing through dilated convolutions. |
| 2. **Subject-specific adaptation.** To leverage inter-subject variability, |
| the architecture includes a "Subject Layer" or participant-specific |
| 1x1 convolution that allows the model to share core weights across a |
| cohort while learning individual-specific neural patterns. |
| 3. **Dilated residual blocks with gating.** The core encoder employs a |
| stack of convolutional blocks featuring skip connections and increasing |
| dilation to expand the receptive field without losing temporal |
| resolution, supplemented by optional Gated Linear Units (GLU) for |
| increased expressivity. |
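
As a minimal, framework-free sketch of the gating mentioned in item 3, the snippet below applies a Gated Linear Unit to the channel vector of a single time step: the vector is split in half and the first half is multiplied by the sigmoid of the second half, so the output has half as many channels. The helper name `glu` is hypothetical and follows the standard GLU definition; it is not taken from the braindecode source.

```python
import math


def glu(channels):
    """Gate the first half of `channels` with the sigmoid of the second half."""
    if len(channels) % 2 != 0:
        raise ValueError("GLU needs an even number of channels")
    half = len(channels) // 2
    values, gates = channels[:half], channels[half:]
    # sigmoid(g) = 1 / (1 + exp(-g)); each value is scaled by its gate.
    return [v / (1.0 + math.exp(-g)) for v, g in zip(values, gates)]


x = [1.0, -2.0, 0.5, 3.0, 0.0, -4.0, 10.0, 0.0]  # 8 channels in
y = glu(x)                                       # 4 channels out
print(len(x), len(y))  # 8 4
```

A gate of 0 passes half of the value through, a large positive gate passes almost all of it, and a large negative gate suppresses it.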
| |
| .. rubric:: Macro Components |
|
|
| ``BrainModule.input_projection`` (Initial Processing) |
| **Operations.** Raw M/EEG input |
| :math:`\mathbf{X} \in \mathbb{R}^{C \times T}` is first processed |
| through a spatial attention layer that projects sensor locations onto a |
| 2D plane using Fourier-parameterized functions. This is followed by a |
| subject-specific 1x1 convolution |
| :math:`\mathbf{M}_s \in \mathbb{R}^{D_1 \times D_1}` if subject |
| features are enabled. The resulting features are projected to the |
| ``hidden_dim`` (default 320) to ensure compatibility with subsequent |
| residual connections. |
| |
| **Role.** Converts high-dimensional, subject-dependent sensor data into |
| a standardized latent space while preserving spatial and temporal |
| relationships. |
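
The spatial attention step can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the library's implementation: sensor positions are assumed to be already projected to 2D and scaled to [0, 1], the function name `spatial_attention` and the `coeffs` layout are hypothetical, and the coefficients are random rather than learned. Each output channel scores every sensor with a 2D Fourier series evaluated at the sensor location, then takes a softmax over sensors.

```python
import math
import random


def spatial_attention(signal, positions, coeffs):
    """Mix sensors into output channels with position-based softmax weights.

    signal:    list of n_sensors rows, each a list of n_times samples
    positions: list of (x, y) sensor coordinates, already scaled to [0, 1]
    coeffs:    coeffs[j][k][l] is a complex Fourier coefficient for output j
    """
    n_times = len(signal[0])
    outputs, all_weights = [], []
    for coeff_j in coeffs:
        # Score each sensor with a 2D Fourier series evaluated at its location.
        scores = []
        for x, y in positions:
            score = 0.0
            for k, row in enumerate(coeff_j, start=1):
                for l, z in enumerate(row, start=1):
                    phase = 2.0 * math.pi * (k * x + l * y)
                    score += z.real * math.cos(phase) + z.imag * math.sin(phase)
            scores.append(score)
        # Softmax over sensors (shifted by the max for numerical stability).
        peak = max(scores)
        exps = [math.exp(s - peak) for s in scores]
        total = sum(exps)
        weights = [e / total for e in exps]
        all_weights.append(weights)
        # Each output channel is a weighted average of the sensor signals.
        outputs.append(
            [sum(w * ch[t] for w, ch in zip(weights, signal)) for t in range(n_times)]
        )
    return outputs, all_weights


rng = random.Random(0)
n_sensors, n_times, n_out, n_freqs = 4, 5, 3, 2
signal = [[rng.gauss(0.0, 1.0) for _ in range(n_times)] for _ in range(n_sensors)]
positions = [(rng.random(), rng.random()) for _ in range(n_sensors)]
coeffs = [
    [[complex(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)) for _ in range(n_freqs)]
     for _ in range(n_freqs)]
    for _ in range(n_out)
]
mixed, weights = spatial_attention(signal, positions, coeffs)
print(len(mixed), len(mixed[0]))  # 3 5
```

Because the weights depend only on sensor coordinates, the same learned mapping can be applied to recordings with different sensor layouts.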
| |
| ``BrainModule.encoder`` (Convolutional Sequence) |
| **Operations.** Implemented via |
| :class:`~braindecode.models.brainmodule._ConvSequence`, this component |
| consists of a stack of ``k`` convolutional blocks. Each block typically |
| contains: (a) **Residual dilated convolutions.** Two layers with kernel |
| size 3, residual skip connections, and dilation factors that grow |
| exponentially (e.g., powers of two with periodic resets) to capture |
| multi-scale temporal context. (b) **GLU gating.** Every ``N`` layers |
| (defined by ``glu``), a Gated Linear Unit is applied, which halves the |
| channel dimension and introduces non-linear gating to filter |
| intermediate representations. |
| |
| **Role.** Extracts deep hierarchical temporal features from the brain |
| signal, significantly expanding the model's receptive field to align |
| with the contextual windows of speech modules like wav2vec 2.0. |
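
To make the receptive-field growth concrete, the sketch below computes a dilation schedule and the resulting receptive field for a stack of stride-1 convolutions. It assumes that the dilation of layer `k` is `dilation_growth ** (k % dilation_period)` and that each layer is a single dilated convolution; the helper names `dilation_schedule` and `receptive_field` are hypothetical, and the actual encoder may group several convolutions per block.

```python
def dilation_schedule(depth=10, dilation_growth=2, dilation_period=5):
    """Dilation per layer, reset to 1 every `dilation_period` layers."""
    schedule = []
    for k in range(depth):
        exponent = k % dilation_period if dilation_period else k
        schedule.append(dilation_growth ** exponent)
    return schedule


def receptive_field(dilations, kernel_size=3):
    """Receptive field (in samples) of stacked stride-1 dilated convolutions."""
    return 1 + sum((kernel_size - 1) * d for d in dilations)


dilations = dilation_schedule()
rf_samples = receptive_field(dilations)
print(dilations)         # [1, 2, 4, 8, 16, 1, 2, 4, 8, 16]
print(rf_samples)        # 125
print(rf_samples / 250)  # 0.5 (seconds at a 250 Hz sampling rate)
```

Under these assumptions, ten layers with kernel size 3 cover 125 samples, whereas ten undilated layers would cover only 21.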
| |
| .. rubric:: Temporal, Spatial, and Spectral Encoding |
|
|
| - **Temporal:** Increasing dilation factors across layers allow the model to |
| integrate information over large time windows without the computational |
| cost of standard large kernels, while a 150 ms input shift facilitates |
| alignment between stimulus and brain response. |
| - **Spatial:** The spatial attention layer learns a softmax weighting over |
| input sensors based on their 3D coordinates, allowing the model to focus |
| on regions typically activated during auditory stimulation (e.g., the |
| temporal cortex). |
| - **Spectral:** Through the optional ``n_fft`` parameter, the model can |
| apply an STFT transformation, converting time-domain signals into a |
| spectrogram representation before encoding. |
|
|
| .. rubric:: Additional Mechanisms |
|
|
| - **Clamping and scaling:** The model relies on clamping input values |
| (e.g., at 20 standard deviations) to prevent outliers and large |
| electromagnetic artifacts from destabilizing the BatchNorm estimates and |
| optimization process. |
| - **Scaled subject embeddings:** When ``subject_dim`` is used, the |
| :class:`~braindecode.models.brainmodule._ScaledEmbedding` layer scales up |
| the learning rate for subject-specific features to prevent slow |
| convergence in multi-participant training. |
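
The effect of a scaled embedding can be checked with a one-parameter example. The sketch below assumes plain SGD and a squared-error loss on a single scalar; the function name `sgd_step` and the numbers are illustrative only. The stored weight is the intended initial value divided by `scale`, and the layer output is the stored weight multiplied by `scale`.

```python
def sgd_step(scale, lr=0.1, init=0.5, target=1.0):
    """One SGD step on loss = 0.5 * (scale * w - target) ** 2."""
    w = init / scale                 # stored weight is shrunk by `scale`
    out_before = scale * w           # visible output is unchanged at init
    grad_w = scale * (out_before - target)  # chain rule adds a factor `scale`
    w -= lr * grad_w
    out_after = scale * w
    return out_before, grad_w, out_after - out_before


for scale in (1.0, 10.0):
    out_before, grad_w, delta = sgd_step(scale)
    print(scale, round(out_before, 6), round(grad_w, 6), round(delta, 6))
```

In this setting the initial output is the same for both scales, the gradient on the stored weight grows by a factor of `scale`, and the change in the visible output after one step grows by `scale ** 2`. Adaptive optimizers normalize gradient magnitudes, so the exact factor differs there.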
|
|
|
|
| - **_ConvSequence and residual logic:** This class handles the actual |
| stacking of layers. It is designed to be flexible with the ``growth`` |
| parameter; if the channel size changes between layers (``growth != 1.0``), |
| it automatically applies a 1x1 ``skip_projection`` convolution to the |
| residual path so dimensions match for addition. |
| - **_ChannelDropout:** Unlike standard dropout which zeroes individual |
| neurons, this zeroes entire channels. It includes a rescale feature that |
| multiplies the remaining channels by a factor |
| ``total_channels / active_channels`` to maintain the expected value of the |
| signal during training. |
| - **_ScaledEmbedding:** This speeds up learning of subject-specific |
| features. The initial weights are divided by a scale and the output is |
| multiplied by the same scale, which leaves the initial output unchanged |
| but increases the gradient magnitude for the embedding weights, allowing |
| subject-specific features to learn faster than the shared backbone. |
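
The channel-dropout rescaling described in the list above can be sketched without any framework. This is a simplified illustration, not the `_ChannelDropout` code: the function name `channel_dropout`, the explicit `rng` argument, and the handling of the all-dropped case are assumptions made for the example.

```python
import random


def channel_dropout(signal, drop_prob, rng, rescale=True):
    """Zero whole channels of `signal` ([n_chans][n_times]) and rescale the rest."""
    n_chans = len(signal)
    keep = [rng.random() >= drop_prob for _ in range(n_chans)]
    active = sum(keep)
    if active == 0:
        # Every channel was dropped: nothing is left to rescale.
        return [[0.0] * len(ch) for ch in signal], keep
    # total_channels / active_channels keeps the summed signal at its expected level.
    factor = n_chans / active if rescale else 1.0
    dropped = [
        [v * factor for v in ch] if kept else [0.0] * len(ch)
        for ch, kept in zip(signal, keep)
    ]
    return dropped, keep


rng = random.Random(0)
signal = [[1.0, 1.0, 1.0] for _ in range(8)]
dropped, keep = channel_dropout(signal, drop_prob=0.5, rng=rng)
column_sum = sum(ch[0] for ch in dropped)
print(sum(keep), column_sum)
```

With a constant input of ones, the sum across channels stays at the number of channels whenever at least one channel survives, which is the purpose of the rescale factor.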
| |
| |
| Parameters |
| ---------- |
| hidden_dim : int, default=320 |
| Hidden dimension for convolutional layers. Input is projected to this |
| dimension before the convolutional blocks. |
| depth : int, default=10 |
| Number of convolutional blocks. Each block contains a dilated convolution |
| with batch normalization and activation, followed by a residual connection. |
| kernel_size : int, default=3 |
| Convolutional kernel size. Must be odd for proper padding with dilation. |
| growth : float, default=1.0 |
| Channel size multiplier: hidden_dim * (growth ** layer_index). |
| Values > 1.0 grow channels deeper; < 1.0 shrink them. |
| Note: growth != 1.0 disables residual connections between layers |
| with different channel sizes. |
| dilation_growth : int, default=2 |
| Dilation multiplier per layer (e.g., 2 means dilation doubles each layer). |
| Improves receptive field exponentially. Requires odd kernel_size. |
| dilation_period : int, default=5 |
| Reset dilation to 1 every N layers. Prevents dilation from growing |
| too large and maintains local connectivity. |
| conv_drop_prob : float, default=0.0 |
| Dropout probability for convolutional layers. |
| dropout_input : float, default=0.0 |
| Dropout probability applied to model input only. |
| batch_norm : bool, default=True |
| If True, apply batch normalization after each convolution. |
| activation : type[nn.Module], default=nn.GELU |
| Activation function class to use (e.g., nn.GELU, nn.ReLU, nn.ELU). |
| n_subjects : int, default=200 |
| Number of unique subjects (for subject-specific pathways). |
| Only used if subject_dim > 0. |
| subject_dim : int, default=0 |
| Dimension of subject embeddings. If 0, no subject-specific features. |
| If > 0, adds subject embeddings to the input before encoding. |
| subject_layers : bool, default=False |
| If True, apply subject-specific linear transformations to input channels. |
| Each subject has its own weight matrix. Requires subject_dim > 0. |
| subject_layers_dim : str, default="input" |
| Where to apply subject layers: "input" or "hidden". |
| subject_layers_id : bool, default=False |
| If True, initialize subject layers as identity matrices. |
| embedding_scale : float, default=1.0 |
| Scaling factor for subject embeddings learning rate. |
| n_fft : int, optional |
| FFT size for STFT processing. If None, no STFT is applied. |
| If specified, applies spectrogram transform before encoding. |
| fft_complex : bool, default=True |
| If True, keep complex spectrogram. If False, use power spectrogram. |
| Only used when n_fft is not None. |
| channel_dropout_prob : float, default=0.0 |
| Probability of dropping each channel during training (0.0 to 1.0). |
| If 0.0, no channel dropout is applied. |
| channel_dropout_type : str, optional |
| If specified with chs_info, only drop channels of this type |
| (e.g., 'eeg', 'ref', 'eog'). If None with channel_dropout_prob > 0, drops any channel. |
| glu : int, default=2 |
| If > 0, applies Gated Linear Units (GLU) every N convolutional layers. |
| GLUs gate intermediate representations for more expressivity. |
| If 0, no GLU is applied. |
| glu_context : int, default=1 |
| Context window size for GLU gates. If > 0, uses contextual information |
| from neighboring time steps for gating. Requires glu > 0. |
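
As a rough illustration of how `hidden_dim`, `growth`, `depth`, and `glu` interact, the sketch below lists the channel width of each layer and whether a GLU would follow it. The rounding rule and the GLU placement (after every `glu`-th layer, counting from one) are assumptions made for this example, and `layer_plan` is a hypothetical helper rather than part of braindecode.

```python
def layer_plan(hidden_dim=320, depth=10, growth=1.0, glu=2):
    """Return (layer_index, channels, followed_by_glu) for each layer."""
    plan = []
    for k in range(depth):
        # Channel width follows hidden_dim * growth ** layer_index.
        channels = int(round(hidden_dim * growth ** k))
        # Assumed placement: a GLU after every `glu`-th layer, none if glu == 0.
        followed_by_glu = glu > 0 and (k + 1) % glu == 0
        plan.append((k, channels, followed_by_glu))
    return plan


default_plan = layer_plan()
growing_plan = layer_plan(hidden_dim=32, depth=3, growth=1.5, glu=0)
print([channels for _, channels, _ in default_plan][:3])   # [320, 320, 320]
print([k for k, _, has_glu in default_plan if has_glu])    # [1, 3, 5, 7, 9]
print([channels for _, channels, _ in growing_plan])       # [32, 48, 72]
```

With the default `growth=1.0` every layer keeps the same width, so input and output channels match at every layer.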
| |
| References |
| ---------- |
| .. [brainmagick] Défossez, A., Caucheteux, C., Rapin, J., Kabeli, O., & King, J. R. |
| (2023). Decoding speech perception from non-invasive brain recordings. Nature |
| Machine Intelligence, 5(10), 1097-1107. |
| |
| Notes |
| ----- |
| - Input shape: (batch, n_chans, n_times) |
| - Output shape: (batch, n_outputs) |
| - The model uses dilated convolutions with stride=1 to maintain temporal |
| resolution while achieving large receptive fields. |
| - Residual connections are applied at every layer where input and output |
| channels match. |
| - Subject-specific features (subject_dim > 0, subject_layers) require passing |
| subject indices in the forward pass as an optional parameter or via batch. |
| - STFT processing (n_fft is not None) automatically transforms the input to the spectrogram domain. |
|
|
| .. versionadded:: 1.2 |
|
|
| .. rubric:: Hugging Face Hub integration |
|
|
| When the optional ``huggingface_hub`` package is installed, all models |
| automatically gain the ability to be pushed to and loaded from the |
| Hugging Face Hub. Install with:: |
|
|
| pip install braindecode[hub] |
| |
| **Pushing a model to the Hub:** |
|
|
| .. code:: |
| from braindecode.models import BrainModule |
| |
| # Train your model |
| model = BrainModule(n_chans=22, n_outputs=4, n_times=1000) |
| # ... training code ... |
| |
| # Push to the Hub |
| model.push_to_hub( |
|     repo_id="username/my-brainmodule-model", |
|     commit_message="Initial model upload", |
| ) |
| |
| **Loading a model from the Hub:** |
|
|
| .. code:: |
| from braindecode.models import BrainModule |
| |
| # Load pretrained model |
| model = BrainModule.from_pretrained("username/my-brainmodule-model") |
| |
| # Load with a different number of outputs (head is rebuilt automatically) |
| model = BrainModule.from_pretrained("username/my-brainmodule-model", n_outputs=4) |
| |
| **Extracting features and replacing the head:** |
|
|
| .. code:: |
| import torch |
| |
| x = torch.randn(1, model.n_chans, model.n_times) |
| # Extract encoder features (consistent dict across all models) |
| out = model(x, return_features=True) |
| features = out["features"] |
| |
| # Replace the classification head |
| model.reset_head(n_outputs=10) |
| |
| **Saving and restoring full configuration:** |
|
|
| .. code:: |
| import json |
| |
| config = model.get_config() # all __init__ params |
| with open("config.json", "w") as f: |
|     json.dump(config, f) |
| |
| model2 = BrainModule.from_config(config) # reconstruct (no weights) |
| |
| All model parameters (both EEG-specific and model-specific such as |
| dropout rates, activation functions, number of filters) are automatically |
| saved to the Hub and restored when loading. |
|
|
| See :ref:`load-pretrained-models` for a complete tutorial.</main> |
| </div> |
|
|
| ## Citation |
|
|
| Please cite both the original paper for this architecture (see the |
| *References* section above) and braindecode: |
|
|
| ```bibtex |
| @article{aristimunha2025braindecode, |
| title = {Braindecode: a deep learning library for raw electrophysiological data}, |
| author = {Aristimunha, Bruno and others}, |
| journal = {Zenodo}, |
| year = {2025}, |
| doi = {10.5281/zenodo.17699192}, |
| } |
| ``` |
|
|
| ## License |
|
|
| BSD-3-Clause for the model code (matching braindecode). |
| Pretraining-derived weights, if you fine-tune from a checkpoint, |
| inherit the license of that checkpoint and its training corpus. |
|
|