---
license: bsd-3-clause
library_name: braindecode
pipeline_tag: feature-extraction
tags:
- eeg
- biosignal
- pytorch
- neuroscience
- braindecode
- convolutional
---
# BrainModule
BrainModule from Défossez et al. (2023), also known as SimpleConv.
> **Architecture-only repository.** This repo documents the
> `braindecode.models.BrainModule` class. **No pretrained weights are
> distributed here** — instantiate the model and train it on your own
> data, or fine-tune from a published foundation-model checkpoint
> separately.
## Quick start
```bash
pip install braindecode
```
```python
from braindecode.models import BrainModule
model = BrainModule(
n_chans=22,
sfreq=250,
input_window_seconds=4.0,
n_outputs=4,
)
```
The signal-shape arguments above are example defaults — adjust them
to match your recording.
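As a rough illustration of how the signal-shape arguments relate, the sketch below derives the number of time samples per window from the sampling rate and the window length. The helper `window_samples` is hypothetical and not part of braindecode; it only assumes the usual relation `n_times = sfreq * input_window_seconds`.

```python
def window_samples(sfreq: float, input_window_seconds: float) -> int:
    """Number of time samples in one input window (hypothetical helper)."""
    return int(round(sfreq * input_window_seconds))


# Values from the quick-start example above.
n_times = window_samples(sfreq=250, input_window_seconds=4.0)

# Expected input layout for the model: (batch, n_chans, n_times).
input_shape = (1, 22, n_times)
```

With the quick-start values this gives 1000 samples per window, which matches the `n_times=1000` used in the Hub examples further down.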
## Documentation
- Full API reference (parameters, references, architecture figure):
<https://braindecode.org/stable/generated/braindecode.models.BrainModule.html>
- Interactive browser with live instantiation:
<https://huggingface.co/spaces/braindecode/model-explorer>
- Source on GitHub: <https://github.com/braindecode/braindecode/blob/master/braindecode/models/brainmodule.py#L25>
## Architecture description
The block below is the rendered class docstring (parameters,
references, architecture figure where available).
<div class='bd-doc'><main>
<p>BrainModule from [brainmagick]_, also known as SimpleConv.</p>
<blockquote>
<p>A dilated convolutional encoder for EEG decoding, using residual
connections and optional GLU gating for improved expressivity.</p>
</blockquote>
<span style="display:inline-block;padding:2px 8px;border-radius:4px;background:#5cb85c;color:white;font-size:11px;font-weight:600;margin-right:4px;">Convolution</span>
.. figure:: ../_static/model/simpleconv.png
:align: center
:alt: BrainModule Architecture
:width: 500px
Figure adapted from Extended Data Fig. 4 of [brainmagick]_ to show only the model part.
Architecture of the brain module. Architecture used to process the brain recordings.
For each layer, the authors note first the number of output channels, while the number of time steps
is constant throughout the layers. The model is composed of a spatial attention layer,
then a 1x1 convolution without activation. A 'Subject Layer' is selected based on the subject index s,
which consists of a 1x1 convolution learnt only for that subject with no activation. Then,
the authors apply five convolutional blocks made of three convolutions. The first
two use residual skip connection and increasing dilation, followed by a BatchNorm layer and a
GELU activation. The third convolution is not residual, and uses a GLU activation
(which halves the number of channels) and no normalization.
Finally, the authors apply two 1x1 convolutions with a GELU in between.
The BrainModule (also referred to as SimpleConv) is a deep dilated
convolutional encoder specifically designed to decode perceived speech from
non-invasive brain recordings like EEG and MEG. It is engineered to address
the high noise levels and inter-individual variability inherent in
non-invasive neuroimaging by using a single architecture trained across
large cohorts while accommodating participant-specific differences.
.. rubric:: Architecture Overview
The BrainModule integrates three primary mechanisms to align brain activity
with deep speech representations:
1. **Spatial-temporal feature extraction.** The model uses a dedicated
spatial attention layer to remap sensor data based on physical
locations, followed by temporal processing through dilated convolutions.
2. **Subject-specific adaptation.** To leverage inter-subject variability,
the architecture includes a "Subject Layer" or participant-specific
1x1 convolution that allows the model to share core weights across a
cohort while learning individual-specific neural patterns.
3. **Dilated residual blocks with gating.** The core encoder employs a
stack of convolutional blocks featuring skip connections and increasing
dilation to expand the receptive field without losing temporal
resolution, supplemented by optional Gated Linear Units (GLU) for
increased expressivity.
.. rubric:: Macro Components
``BrainModule.input_projection`` (Initial Processing)
**Operations.** Raw M/EEG input
:math:`\mathbf{X} \in \mathbb{R}^{C \times T}` is first processed
through a spatial attention layer that projects sensor locations onto a
2D plane using Fourier-parameterized functions. This is followed by a
subject-specific 1x1 convolution
:math:`\mathbf{M}_s \in \mathbb{R}^{D_1 \times D_1}` if subject
features are enabled. The resulting features are projected to the
``hidden_dim`` (default 320) to ensure compatibility with subsequent
residual connections.
**Role.** Converts high-dimensional, subject-dependent sensor data into
a standardized latent space while preserving spatial and temporal
relationships.
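The spatial attention step can be pictured with a small NumPy sketch. It assumes the attention scores are a truncated 2D Fourier series evaluated at each sensor's 2D position, followed by a softmax over input sensors, which is the general form described in [brainmagick]_. It is not the braindecode implementation: the function name, argument names, and the small sizes used here are illustrative assumptions.

```python
import numpy as np


def spatial_attention(x, positions, z):
    """Softmax attention over input sensors from 2D sensor positions.

    x:         (n_sensors, n_times) signals
    positions: (n_sensors, 2) sensor coordinates scaled to [0, 1]
    z:         (n_out, K, K) complex Fourier coefficients (learned in practice)
    """
    n_out, K, _ = z.shape
    k = np.arange(1, K + 1)
    # phase[i, k, l] = 2*pi*(k * x_i + l * y_i)
    phase = 2 * np.pi * (
        k[None, :, None] * positions[:, 0, None, None]
        + k[None, None, :] * positions[:, 1, None, None]
    )
    # scores[j, i] = sum_{k,l} Re(z_jkl) cos(phase_ikl) + Im(z_jkl) sin(phase_ikl)
    scores = np.einsum("jkl,ikl->ji", z.real, np.cos(phase)) + np.einsum(
        "jkl,ikl->ji", z.imag, np.sin(phase)
    )
    scores -= scores.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True)  # softmax over input sensors
    return weights @ x, weights


rng = np.random.default_rng(0)
x = rng.standard_normal((22, 1000))       # 22 sensors, 1000 time samples
positions = rng.uniform(size=(22, 2))     # made-up 2D sensor layout
z = rng.standard_normal((8, 4, 4)) + 1j * rng.standard_normal((8, 4, 4))
out, weights = spatial_attention(x, positions, z)
```

Each output channel is a convex combination of the input sensors, so the number of time steps is unchanged and only the channel axis is remapped.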
``BrainModule.encoder`` (Convolutional Sequence)
**Operations.** Implemented via
:class:`~braindecode.models.brainmodule._ConvSequence`, this component
consists of a stack of ``k`` convolutional blocks. Each block typically
contains: (a) **Residual dilated convolutions.** Two layers with kernel
size 3, residual skip connections, and dilation factors that grow
exponentially (e.g., powers of two with periodic resets) to capture
multi-scale temporal context. (b) **GLU gating.** Every ``N`` layers
(defined by ``glu``), a Gated Linear Unit is applied, which halves the
channel dimension and introduces non-linear gating to filter
intermediate representations.
**Role.** Extracts deep hierarchical temporal features from the brain
signal, significantly expanding the model's receptive field to align
with the contextual windows of speech modules like wav2vec 2.0.
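The channel-halving effect of GLU gating can be shown with a minimal NumPy sketch. It assumes the standard GLU definition (split the channels in two halves and gate one half with the sigmoid of the other) and a preceding convolution that doubles the channel count, so the 640-channel input below is an illustrative assumption rather than a value taken from the braindecode code.

```python
import numpy as np


def glu(x, axis=0):
    """Gated Linear Unit: split x in two halves along axis, gate one with the other."""
    a, b = np.split(x, 2, axis=axis)
    return a * (1.0 / (1.0 + np.exp(-b)))  # a * sigmoid(b)


# (channels, time): twice the hidden size before gating (assumed).
x = np.random.default_rng(0).standard_normal((640, 1000))
y = glu(x, axis=0)  # gating halves the channel axis, time axis is untouched
```

Gating returns the tensor to the hidden size, which is what allows the next block to keep working at ``hidden_dim`` channels.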
.. rubric:: Temporal, Spatial, and Spectral Encoding
- **Temporal:** Increasing dilation factors across layers allow the model to
integrate information over large time windows without the computational
cost of standard large kernels, while a 150 ms input shift facilitates
alignment between stimulus and brain response.
- **Spatial:** The spatial attention layer learns a softmax weighting over
input sensors based on their 3D coordinates, allowing the model to focus
on regions typically activated during auditory stimulation (e.g., the
temporal cortex).
- **Spectral:** Through the optional ``n_fft`` parameter, the model can
apply an STFT transformation, converting time-domain signals into a
spectrogram representation before encoding.
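To make the temporal point concrete, the sketch below computes a dilation schedule and the resulting receptive field. It assumes the dilation at layer ``k`` is ``dilation_growth ** (k % dilation_period)``, which is inferred from the parameter descriptions rather than read from the source code, and it counts only the dilated stride-1 convolutions. Both helper names are hypothetical.

```python
def dilation_schedule(depth=10, dilation_growth=2, dilation_period=5):
    """Dilation per layer, assuming a reset to 1 every dilation_period layers."""
    return [dilation_growth ** (k % dilation_period) for k in range(depth)]


def receptive_field(dilations, kernel_size=3):
    """Receptive field in samples of stacked stride-1 dilated convolutions."""
    return 1 + sum((kernel_size - 1) * d for d in dilations)


dilations = dilation_schedule()   # [1, 2, 4, 8, 16, 1, 2, 4, 8, 16]
rf = receptive_field(dilations)   # 125 samples
```

Under these assumptions the documented defaults cover 125 samples, or 0.5 s at a 250 Hz sampling rate, whereas ten undilated layers with the same kernel size would cover only 21 samples.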
.. rubric:: Additional Mechanisms
- **Clamping and scaling:** The model relies on clamping input values
(e.g., at 20 standard deviations) to prevent outliers and large
electromagnetic artifacts from destabilizing the BatchNorm estimates and
optimization process.
- **Scaled subject embeddings:** When ``subject_dim`` is used, the
:class:`~braindecode.models.brainmodule._ScaledEmbedding` layer scales up
the learning rate for subject-specific features to prevent slow
convergence in multi-participant training.
- **_ConvSequence and residual logic:** This class handles the actual
stacking of layers. It is designed to be flexible with the ``growth``
parameter; if the channel size changes between layers (``growth != 1.0``),
it automatically applies a 1x1 ``skip_projection`` convolution to the
residual path so dimensions match for addition.
- **_ChannelDropout:** Unlike standard dropout which zeroes individual
neurons, this zeroes entire channels. It includes a rescale feature that
multiplies the remaining channels by a factor
``total_channels / active_channels`` to maintain the expected value of the
signal during training.
- **_ScaledEmbedding:** An optimization for multi-subject learning. The
initial weights are divided by a scale and the output is multiplied by
the same scale. This leaves the initial output unchanged but increases the
gradient magnitude for the embedding weights, allowing subject-specific
features to learn faster than the shared backbone.
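The effect of the scaled embedding can be checked with scalar arithmetic. The sketch below is not the ``_ScaledEmbedding`` class: it assumes a single scalar parameter, a squared-error loss, and plain SGD, and the function name is hypothetical. Adaptive optimizers normalise gradients, so the factor observed with them would differ.

```python
def sgd_step_output_change(scale, weight=1.0, target=0.0, lr=0.1):
    """Output change after one SGD step on 0.5 * (output - target) ** 2.

    The stored parameter is weight / scale and the output is stored * scale,
    so the initial output does not depend on scale.
    """
    stored = weight / scale
    output = stored * scale
    grad_stored = (output - target) * scale  # chain rule through the rescaling
    stored -= lr * grad_stored
    return output - stored * scale           # how far the output moved


base = sgd_step_output_change(scale=1.0)      # output moves by about 0.1
boosted = sgd_step_output_change(scale=10.0)  # output moves by about 10
```

Under plain SGD the output moves ``scale ** 2`` times further per step for the same learning rate, which is the sense in which the subject embedding learns faster.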
Parameters
----------
hidden_dim : int, default=320
Hidden dimension for convolutional layers. Input is projected to this
dimension before the convolutional blocks.
depth : int, default=10
Number of convolutional blocks. Each block contains a dilated convolution
with batch normalization and activation, followed by a residual connection.
kernel_size : int, default=3
Convolutional kernel size. Must be odd for proper padding with dilation.
growth : float, default=1.0
Channel size multiplier: hidden_dim * (growth ** layer_index).
Values > 1.0 grow channels deeper; < 1.0 shrink them.
Note: growth != 1.0 disables residual connections between layers
with different channel sizes.
dilation_growth : int, default=2
Dilation multiplier per layer (e.g., 2 means dilation doubles each layer).
Improves receptive field exponentially. Requires odd kernel_size.
dilation_period : int, default=5
Reset dilation to 1 every N layers. Prevents dilation from growing
too large and maintains local connectivity.
conv_drop_prob : float, default=0.0
Dropout probability for convolutional layers.
dropout_input : float, default=0.0
Dropout probability applied to model input only.
batch_norm : bool, default=True
If True, apply batch normalization after each convolution.
activation : type[nn.Module], default=nn.GELU
Activation function class to use (e.g., nn.GELU, nn.ReLU, nn.ELU).
n_subjects : int, default=200
Number of unique subjects (for subject-specific pathways).
Only used if subject_dim > 0.
subject_dim : int, default=0
Dimension of subject embeddings. If 0, no subject-specific features.
If > 0, adds subject embeddings to the input before encoding.
subject_layers : bool, default=False
If True, apply subject-specific linear transformations to input channels.
Each subject has its own weight matrix. Requires subject_dim > 0.
subject_layers_dim : str, default="input"
Where to apply subject layers: "input" or "hidden".
subject_layers_id : bool, default=False
If True, initialize subject layers as identity matrices.
embedding_scale : float, default=1.0
Scaling factor for subject embeddings learning rate.
n_fft : int, optional
FFT size for STFT processing. If None, no STFT is applied.
If specified, applies spectrogram transform before encoding.
fft_complex : bool, default=True
If True, keep complex spectrogram. If False, use power spectrogram.
Only used when n_fft is not None.
channel_dropout_prob : float, default=0.0
Probability of dropping each channel during training (0.0 to 1.0).
If 0.0, no channel dropout is applied.
channel_dropout_type : str, optional
If specified with chs_info, only drop channels of this type
(e.g., 'eeg', 'ref', 'eog'). If None with channel_dropout_prob > 0, drops any channel.
glu : int, default=2
If > 0, applies Gated Linear Units (GLU) every N convolutional layers.
GLUs gate intermediate representations for more expressivity.
If 0, no GLU is applied.
glu_context : int, default=1
Context window size for GLU gates. If > 0, uses contextual information
from neighboring time steps for gating. Requires glu > 0.
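The channel-level dropout controlled by ``channel_dropout_prob`` can be sketched with NumPy as follows. This is an illustration of the ``total_channels / active_channels`` rescaling described for ``_ChannelDropout``, not the braindecode code: the function name, its signature, and the handling of the case where every channel is dropped are assumptions.

```python
import numpy as np


def channel_dropout(x, drop_prob, rng, rescale=True):
    """Zero whole channels of x (n_chans, n_times) and rescale the survivors."""
    keep = rng.random(x.shape[0]) >= drop_prob  # one decision per channel
    if not keep.any():
        return np.zeros_like(x), keep           # assumed behaviour when all drop
    out = x * keep[:, None]
    if rescale:
        out = out * (x.shape[0] / keep.sum())   # total / active channels
    return out, keep


rng = np.random.default_rng(0)
x = np.ones((22, 1000))
out, keep = channel_dropout(x, drop_prob=0.2, rng=rng)
```

With an all-ones input, the sum over channels at every time step stays at 22 after dropout, which shows how the rescaling compensates for the zeroed channels.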
References
----------
.. [brainmagick] Défossez, A., Caucheteux, C., Rapin, J., Kabeli, O., & King, J. R.
(2023). Decoding speech perception from non-invasive brain recordings. Nature
Machine Intelligence, 5(10), 1097-1107.
Notes
-----
- Input shape: (batch, n_chans, n_times)
- Output shape: (batch, n_outputs)
- The model uses dilated convolutions with stride=1 to maintain temporal
resolution while achieving large receptive fields.
- Residual connections are applied at every layer where input and output
channels match.
- Subject-specific features (subject_dim > 0, subject_layers) require passing
subject indices in the forward pass as an optional parameter or via batch.
- STFT processing (n_fft is not None) automatically transforms the input to the spectrogram domain.
.. versionadded:: 1.2
.. rubric:: Hugging Face Hub integration
When the optional ``huggingface_hub`` package is installed, all models
automatically gain the ability to be pushed to and loaded from the
Hugging Face Hub. Install with::
pip install braindecode[hub]
**Pushing a model to the Hub:**
.. code::
from braindecode.models import BrainModule
# Train your model
model = BrainModule(n_chans=22, n_outputs=4, n_times=1000)
# ... training code ...
# Push to the Hub
model.push_to_hub(
repo_id="username/my-brainmodule-model",
commit_message="Initial model upload",
)
**Loading a model from the Hub:**
.. code::
from braindecode.models import BrainModule
# Load pretrained model
model = BrainModule.from_pretrained("username/my-brainmodule-model")
# Load with a different number of outputs (head is rebuilt automatically)
model = BrainModule.from_pretrained("username/my-brainmodule-model", n_outputs=4)
**Extracting features and replacing the head:**
.. code::
import torch
x = torch.randn(1, model.n_chans, model.n_times)
# Extract encoder features (consistent dict across all models)
out = model(x, return_features=True)
features = out["features"]
# Replace the classification head
model.reset_head(n_outputs=10)
**Saving and restoring full configuration:**
.. code::
import json
config = model.get_config() # all __init__ params
with open("config.json", "w") as f:
json.dump(config, f)
model2 = BrainModule.from_config(config) # reconstruct (no weights)
All model parameters (both EEG-specific and model-specific such as
dropout rates, activation functions, number of filters) are automatically
saved to the Hub and restored when loading.
See :ref:`load-pretrained-models` for a complete tutorial.</main>
</div>
## Citation
Please cite both the original paper for this architecture (see the
*References* section above) and braindecode:
```bibtex
@article{aristimunha2025braindecode,
title = {Braindecode: a deep learning library for raw electrophysiological data},
author = {Aristimunha, Bruno and others},
journal = {Zenodo},
year = {2025},
doi = {10.5281/zenodo.17699192},
}
```
## License
BSD-3-Clause for the model code (matching braindecode).
Pretraining-derived weights, if you fine-tune from a checkpoint,
inherit the license of that checkpoint and its training corpus.