+---
+license: bsd-3-clause
+library_name: braindecode
+pipeline_tag: feature-extraction
+tags:
+  - eeg
+  - biosignal
+  - pytorch
+  - neuroscience
+  - braindecode
+  - convolutional
+---
+# DGCNN
+DGCNN for EEG classification from Song et al. (2018).
+> **Architecture-only repository.** This repo documents the
+> `braindecode.models.DGCNN` class. **No pretrained weights are
+> distributed here** — instantiate the model and train it on your own
+> data, or fine-tune from a published foundation-model checkpoint
+> separately.
+## Quick start
+```bash
+pip install braindecode
+```
+```python
+from braindecode.models import DGCNN
+model = DGCNN(
+    n_chans=22,
+    sfreq=250,
+    input_window_seconds=4.0,
+    n_outputs=4,
+)
+```
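+The signal-shape arguments above are example values; adjust them
+to match your recording. The parameter list in the architecture
+description also states that `chs_info` with 3-D electrode positions
+is needed to build the initial adjacency matrix, so expect to pass it
+as well (for example `info["chs"]` from an `mne.Info` object with a
+montage set).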
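As a small worked example of how the shape arguments fit together, the sketch below derives the number of time samples per window from `sfreq` and `input_window_seconds`. It assumes the window length in samples is simply their product, which is an assumption about how the values relate rather than something this card states. The `(batch, channels, time)` layout mirrors the `torch.randn(1, model.n_chans, model.n_times)` call in the docstring further down.

```python
# Values taken from the quick-start example above.
sfreq = 250                  # sampling frequency in Hz
input_window_seconds = 4.0   # window length in seconds
n_chans = 22                 # number of EEG channels

# Assumed relation: samples per window = sampling rate * window length.
n_times = int(sfreq * input_window_seconds)

# Layout of one input batch: (batch, channels, time samples).
input_shape = (1, n_chans, n_times)
print(n_times, input_shape)
```

With these values the window holds 1000 samples, which matches the `n_times=1000` used in the Hub example in the docstring.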
+## Documentation
+- Full API reference (parameters, references, architecture figure):
+  <https://braindecode.org/stable/generated/braindecode.models.DGCNN.html>
+- Interactive browser with live instantiation:
+  <https://huggingface.co/spaces/braindecode/model-explorer>
+- Source on GitHub: <https://github.com/braindecode/braindecode/blob/master/braindecode/models/dgcnn.py#L253>
+## Architecture description
+The block below is the rendered class docstring (parameters,
+references, architecture figure where available).
+<div class='bd-doc'><main>
+<p>DGCNN for EEG classification from Song et al. (2018) [dgcnn]_.</p>
+  <span style="display:inline-block;padding:2px 8px;border-radius:4px;background:#f0f0f0;color:white;font-size:11px;font-weight:600;margin-right:4px;">Graph Neural Network</span>
+:bdg-dark-line:`Channel`
+   .. figure:: ../_static/model/DGCNN.gif
+       :align: center
+       :alt: DGCNN Architecture
+       :width: 600px
+   .. rubric:: Architectural Overview
+   DGCNN is a *graph-based* architecture that models EEG channels as nodes
+   in a graph and **dynamically learns the adjacency matrix**
+   :math:`\mathbf{W}^*` jointly with all other parameters via
+   back-propagation (Algorithm 1 in [dgcnn]_). The end-to-end flow is:
+   - (i) learn inter-channel relationships by dynamically updating a
+     trainable adjacency matrix,
+   - (ii) apply spectral graph convolution via Chebyshev polynomial
+     approximation to extract graph-structured features, and
+   - (iii) classify with a fully connected head.
+   Different from traditional GCNN methods that predetermine the connections
+   of the graph nodes according to their spatial positions, "the proposed
+   DGCNN method learns the adjacency matrix in a dynamic way, i.e., the
+   entries of the adjacency matrix are adaptively updated with the changes
+   of graph model parameters during the model training" [dgcnn]_.
+   .. rubric:: Macro Components
+   - :class:`_LearnableAdjacency` **(Dynamical adjacency → graph Laplacian)**
+       - *Operations.*
+       - A trainable :math:`(N \times N)` matrix :math:`\mathbf{W}^*`
+         initialized from electrode spatial positions via a Gaussian kernel
+         (Eq. 1): :math:`w_{ij} = \exp(-\mathrm{dist}(i,j)^2 / (2\rho^2))`
+         for the :math:`k`-nearest neighbors, zero otherwise.
+       - **ReLU** applied after every gradient update to keep all entries
+         non-negative (Algorithm 1, step 3).
+       - The normalized graph Laplacian is derived as (Eq. 2):
+         :math:`\mathbf{L} = \mathbf{I}
+         - \mathbf{D}^{-1/2}\,\mathbf{W}^*\,\mathbf{D}^{-1/2}`.
+       The adjacency matrix captures intrinsic functional relationships
+       between EEG channels that pure spatial proximity may not reflect.
+   - :class:`_GraphConvolution` **(Chebyshev spectral graph convolution +
+     1x1 mixing)**
+       - *Operations.*
+       - :math:`K`-order Chebyshev polynomial expansion of spectral graph
+         filters on the learned Laplacian (Eqs. 11-13):
+         .. math::
+             \mathbf{y}
+             = \sum_{k=0}^{K-1} \theta_k\, T_k(\tilde{\mathbf{L}}^*)\,
+               \mathbf{x},
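+         where :math:`T_k` are Chebyshev polynomials computed recursively
+         (Eq. 12), :math:`\theta_k` are learnable coefficients, and
+         :math:`\tilde{\mathbf{L}}^*` is the learned Laplacian rescaled so
+         that its eigenvalues lie in :math:`[-1, 1]`.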
+       - A :math:`1 \times 1` convolution (linear projection) that mixes
+         the concatenated Chebyshev components, mapping each node's input
+         features to ``n_filters`` output features.
+       "Following the graph filtering operation is a :math:`1 \times 1`
+       convolution layer, which aims to learn the discriminative features
+       among the various frequency domains" [dgcnn]_.
+   - **Activation layer.** ReLU with a learnable per-feature bias ensures
+     non-negative outputs of the graph filtering layer [dgcnn]_.
+   - **Classifier Head.**
+     Flatten all node features and classify via a multi-layer fully
+     connected network with dropout and softmax.
+   .. rubric:: Graph Convolution Details
+   - **Spatial (graph structure).** The adjacency matrix encodes pairwise
+     relationships between EEG channels. It is initialized from 3-D
+     electrode positions using a Gaussian kernel with kNN sparsification
+     (Eq. 1), then *jointly optimized* with all other parameters. This
+     allows the model to discover functional connectivity patterns that
+     differ from the initial spatial layout. The spectral graph
+     convolution then propagates information across neighboring nodes
+     according to this learned graph topology.
+   - **Spectral (graph spectral domain).** The Chebyshev polynomial
+     approximation (Eq. 11) operates in the *graph spectral domain*
+     defined by the eigenvalues of the graph Laplacian. The :math:`K`-order
+     approximation acts as a localized graph filter: each node aggregates
+     information from its :math:`K`-hop neighborhood. This is analogous
+     to a band-pass filter in the graph frequency domain.
+   - **Temporal / Frequency.** No explicit temporal convolution or
+     frequency decomposition is performed within the network. In the
+     original paper, the input features per node are pre-extracted
+     frequency-band features (e.g., differential entropy from
+     :math:`\delta`, :math:`\theta`, :math:`\alpha`, :math:`\beta`,
+     :math:`\gamma` bands). When used with raw time series, the time
+     samples serve directly as node features.
+   .. rubric:: Additional Comments
+   - **Dynamic vs. static graph.** Traditional GCNN methods fix the
+     adjacency matrix before training based on spatial positions.
+     DGCNN learns it end-to-end, allowing the graph to capture
+     task-relevant functional connectivity rather than mere spatial
+     proximity.
+   - **Chebyshev order.** The order :math:`K` controls the receptive
+     field on the graph: :math:`K=1` uses only direct neighbors,
+     :math:`K=2` (default) reaches 2-hop neighborhoods. Higher orders
+     increase expressivity but also parameter count.
+   - **Regularization.** Dropout in the classification head and the
+     ReLU constraint on the adjacency matrix provide implicit
+     regularization. The loss function in the original paper also
+     includes an explicit :math:`\ell_2` penalty on all parameters
+     (Eq. 14).
+   Parameters
+   ----------
+   chs_info : list of dict, optional
+       Information about each channel, typically obtained from
+       ``mne.Info['chs']``.  Each entry must contain a ``'loc'``
+       key with 3-D electrode positions so the initial adjacency
+       matrix can be built from spatial proximity (Eq. 1).  A montage
+       must be set on the ``mne.Info`` object (see
+       :meth:`mne.Info.set_montage`).  If ``None``, or if positions
+       cannot be extracted, a ``ValueError`` is raised (see Notes).
+   n_filters : int, default=64
+       Number of spectral graph-convolutional filters.  This is the
+       output feature dimension per node produced by the Chebyshev
+       graph convolution followed by the :math:`1 \times 1`
+       convolution (see Fig. 2 in the paper).  The original code
+       uses 64.
+   cheb_order : int, default=2
+       Order :math:`K` of the Chebyshev polynomial approximation
+       (Eq. 11).
+   n_neighbors : int, default=5
+       Number of spatial nearest neighbors per node used to build the
+       initial adjacency matrix (Eq. 1).
+   mlp_dims : tuple[int, ...], default=(256,)
+       Hidden-layer sizes of the fully connected classification head.
+   activation : type[nn.Module], default=nn.ReLU
+       Activation function class used after the graph convolution and
+       in the classification head.
+   drop_prob : float, default=0.5
+       Dropout probability in the classification head.
+   References
+   ----------
+   .. [dgcnn] Song, T., Zheng, W., Song, P., & Cui, Z. (2018). EEG emotion
+       recognition using dynamical graph convolutional neural networks.
+       IEEE Transactions on Affective Computing, 11(3), 532-541.
+       https://doi.org/10.1109/TAFFC.2018.2817622
+   .. rubric:: Hugging Face Hub integration
+   When the optional ``huggingface_hub`` package is installed, all models
+   automatically gain the ability to be pushed to and loaded from the
+   Hugging Face Hub. Install with::
+       pip install braindecode[hub]
+   **Pushing a model to the Hub:**
+   .. code::
+       from braindecode.models import DGCNN
+       # Train your model
+       model = DGCNN(n_chans=22, n_outputs=4, n_times=1000)
+       # ... training code ...
+       # Push to the Hub
+       model.push_to_hub(
+           repo_id="username/my-dgcnn-model",
+           commit_message="Initial model upload",
+       )
+   **Loading a model from the Hub:**
+   .. code::
+       from braindecode.models import DGCNN
+       # Load pretrained model
+       model = DGCNN.from_pretrained("username/my-dgcnn-model")
+       # Load with a different number of outputs (head is rebuilt automatically)
+       model = DGCNN.from_pretrained("username/my-dgcnn-model", n_outputs=2)
+   **Extracting features and replacing the head:**
+   .. code::
+       import torch
+       x = torch.randn(1, model.n_chans, model.n_times)
+       # Extract encoder features (consistent dict across all models)
+       out = model(x, return_features=True)
+       features = out["features"]
+       # Replace the classification head
+       model.reset_head(n_outputs=10)
+   **Saving and restoring full configuration:**
+   .. code::
+       import json
+       config = model.get_config()            # all __init__ params
+       with open("config.json", "w") as f:
+           json.dump(config, f)
+       model2 = DGCNN.from_config(config)    # reconstruct (no weights)
+   All model parameters (both EEG-specific and model-specific such as
+   dropout rates, activation functions, number of filters) are automatically
+   saved to the Hub and restored when loading.
+   See :ref:`load-pretrained-models` for a complete tutorial.</main>
+</div>
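To make the stages described in the docstring more concrete, here is a minimal NumPy sketch of the Gaussian kNN adjacency initialization (Eq. 1), the normalized Laplacian (Eq. 2), and the Chebyshev filtering sum. It is not braindecode's implementation. The function names, the symmetrization of the kNN mask, the omission of self-loops, the rescaling by the largest eigenvalue, and the random electrode positions are all assumptions made for illustration.

```python
import numpy as np


def knn_gaussian_adjacency(pos, n_neighbors=5, rho=1.0):
    """Gaussian-kernel weights kept only between k-nearest neighbors."""
    dist = np.linalg.norm(pos[:, None, :] - pos[None, :, :], axis=-1)
    weights = np.exp(-dist**2 / (2.0 * rho**2))
    # Rank neighbors by distance, excluding each node itself.
    ranked = dist.copy()
    np.fill_diagonal(ranked, np.inf)
    nearest = np.argsort(ranked, axis=1)[:, :n_neighbors]
    mask = np.zeros(dist.shape, dtype=bool)
    np.put_along_axis(mask, nearest, True, axis=1)
    mask |= mask.T  # assumption: keep an edge if either node selects it
    return weights * mask


def normalized_laplacian(adj):
    """L = I - D^{-1/2} W D^{-1/2}, after clamping W to be non-negative."""
    adj = np.maximum(adj, 0.0)  # ReLU on the adjacency entries
    deg = adj.sum(axis=1)
    d_inv_sqrt = np.where(deg > 0, 1.0 / np.sqrt(np.maximum(deg, 1e-12)), 0.0)
    return np.eye(len(adj)) - d_inv_sqrt[:, None] * adj * d_inv_sqrt[None, :]


def chebyshev_filter(lap, x, theta):
    """y = sum_k theta[k] * T_k(L_tilde) @ x, using the Chebyshev recursion."""
    lam_max = np.linalg.eigvalsh(lap).max()
    # Assumed rescaling so the spectrum lies in [-1, 1].
    lap_tilde = 2.0 * lap / lam_max - np.eye(len(lap))
    t_prev, t_curr = x, lap_tilde @ x  # T_0 x and T_1 x
    y = theta[0] * t_prev
    for k in range(1, len(theta)):
        y = y + theta[k] * t_curr
        # T_{k+1} x = 2 * L_tilde @ T_k x - T_{k-1} x
        t_prev, t_curr = t_curr, 2.0 * lap_tilde @ t_curr - t_prev
    return y


rng = np.random.default_rng(0)
pos = rng.normal(size=(8, 3))   # 8 hypothetical electrodes in 3-D
x = rng.normal(size=(8, 16))    # 16 features (or time samples) per node

adj = knn_gaussian_adjacency(pos, n_neighbors=3)
lap = normalized_laplacian(adj)
y = chebyshev_filter(lap, x, theta=np.array([0.5, 0.3]))
print(adj.shape, lap.shape, y.shape)
```

In the model itself the adjacency matrix and the filter coefficients are trainable, and the filtered features pass through the 1x1 mixing layer, the activation, and the fully connected head. Here they are fixed arrays so the shapes and the recursion are easy to inspect.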
+## Citation
+Please cite both the original paper for this architecture (see the
+*References* section above) and braindecode:
+```bibtex
+@article{aristimunha2025braindecode,
+  title   = {Braindecode: a deep learning library for raw electrophysiological data},
+  author  = {Aristimunha, Bruno and others},
+  journal = {Zenodo},
+  year    = {2025},
+  doi     = {10.5281/zenodo.17699192},
+}
+```
+## License
+BSD-3-Clause for the model code (matching braindecode).
+Pretraining-derived weights, if you fine-tune from a checkpoint,
+inherit the license of that checkpoint and its training corpus.