bruAristimunha committed on
Commit db4670f · verified · 1 Parent(s): d94a3a6

Add architecture-only model card

Files changed (1)
  1. README.md +299 -0
README.md ADDED
@@ -0,0 +1,299 @@
1
+ ---
2
+ license: bsd-3-clause
3
+ library_name: braindecode
4
+ pipeline_tag: feature-extraction
5
+ tags:
6
+ - eeg
7
+ - biosignal
8
+ - pytorch
9
+ - neuroscience
10
+ - braindecode
11
+ - convolutional
12
+ ---
13
+
14
+ # DGCNN
15
+
16
+ DGCNN for EEG classification from Song et al. (2018).
17
+
18
+ > **Architecture-only repository.** This repo documents the
19
+ > `braindecode.models.DGCNN` class. **No pretrained weights are
20
+ > distributed here** — instantiate the model and train it on your own
21
+ > data, or fine-tune from a published foundation-model checkpoint
22
+ > separately.
23
+
24
+ ## Quick start
25
+
26
+ ```bash
27
+ pip install braindecode
28
+ ```
29
+
30
+ ```python
31
+ from braindecode.models import DGCNN
32
+
33
+ model = DGCNN(
34
+     n_chans=22,
35
+     sfreq=250,
36
+     input_window_seconds=4.0,
37
+     n_outputs=4,
38
+ )
39
+ ```
40
+
41
+ The signal-shape arguments above are example defaults — adjust them
42
+ to match your recording.
43
+
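As a quick sanity check on the arguments above, the number of samples per window follows from the sampling rate and the window length. The sketch below only does that arithmetic. It assumes the window length in samples is the product of `sfreq` and `input_window_seconds`, rounded to an integer, and it does not call braindecode.

```python
# Example values copied from the quick-start snippet.
sfreq = 250                  # sampling rate in Hz
input_window_seconds = 4.0   # window length in seconds

# Assumption of this sketch: samples per window = rate * duration.
n_times = int(round(sfreq * input_window_seconds))
print(n_times)
```

With these values the window is 1000 samples long, which is the same `n_times` used in the Hub example further down.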
44
+ ## Documentation
45
+
46
+ - Full API reference (parameters, references, architecture figure):
47
+ <https://braindecode.org/stable/generated/braindecode.models.DGCNN.html>
48
+ - Interactive browser with live instantiation:
49
+ <https://huggingface.co/spaces/braindecode/model-explorer>
50
+ - Source on GitHub: <https://github.com/braindecode/braindecode/blob/master/braindecode/models/dgcnn.py#L253>
51
+
52
+ ## Architecture description
53
+
54
+ The block below is the rendered class docstring (parameters,
55
+ references, architecture figure where available).
56
+
57
+ <div class='bd-doc'><main>
58
+ <p>DGCNN for EEG classification from Song et al. (2018) [dgcnn]_.</p>
59
+ <span style="display:inline-block;padding:2px 8px;border-radius:4px;background:#f0f0f0;color:white;font-size:11px;font-weight:600;margin-right:4px;">Graph Neural Network</span>
60
+
61
+ :bdg-dark-line:`Channel`
62
+
63
+ .. figure:: ../_static/model/DGCNN.gif
64
+ :align: center
65
+ :alt: DGCNN Architecture
66
+ :width: 600px
67
+
68
+ .. rubric:: Architectural Overview
69
+
70
+ DGCNN is a *graph-based* architecture that models EEG channels as nodes
71
+ in a graph and **dynamically learns the adjacency matrix**
72
+ :math:`\mathbf{W}^*` jointly with all other parameters via
73
+ back-propagation (Algorithm 1 in [dgcnn]_). The end-to-end flow is:
74
+
75
+ - (i) learn inter-channel relationships by dynamically updating a
76
+ trainable adjacency matrix,
77
+ - (ii) apply spectral graph convolution via Chebyshev polynomial
78
+ approximation to extract graph-structured features, and
79
+ - (iii) classify with a fully connected head.
80
+
81
+ Different from traditional GCNN methods that predetermine the connections
82
+ of the graph nodes according to their spatial positions, "the proposed
83
+ DGCNN method learns the adjacency matrix in a dynamic way, i.e., the
84
+ entries of the adjacency matrix are adaptively updated with the changes
85
+ of graph model parameters during the model training" [dgcnn]_.
86
+
87
+ .. rubric:: Macro Components
88
+
89
+ - :class:`_LearnableAdjacency` **(Dynamical adjacency → graph Laplacian)**
90
+
91
+ - *Operations.*
92
+ - A trainable :math:`(N \times N)` matrix :math:`\mathbf{W}^*`
93
+ initialized from electrode spatial positions via a Gaussian kernel
94
+ (Eq. 1): :math:`w_{ij} = \exp\left(-\mathrm{dist}(i,j)^2 / (2\rho^2)\right)`
95
+ for the :math:`k`-nearest neighbors, zero otherwise.
96
+ - **ReLU** applied after every gradient update to keep all entries
97
+ non-negative (Algorithm 1, step 3).
98
+ - The normalized graph Laplacian is derived as (Eq. 2):
99
+ :math:`\mathbf{L} = \mathbf{I}
100
+ - \mathbf{D}^{-1/2}\,\mathbf{W}^*\,\mathbf{D}^{-1/2}`.
101
+
102
+ The adjacency matrix captures intrinsic functional relationships
103
+ between EEG channels that pure spatial proximity may not reflect.
104
+
105
+ - :class:`_GraphConvolution` **(Chebyshev spectral graph convolution +
106
+ 1x1 mixing)**
107
+
108
+ - *Operations.*
109
+ - :math:`K`-order Chebyshev polynomial expansion of spectral graph
110
+ filters on the learned Laplacian (Eqs. 11-13):
111
+
112
+ .. math::
113
+
114
+ \mathbf{y}
115
+ = \sum_{k=0}^{K-1} \theta_k\, T_k(\tilde{\mathbf{L}}^*)\,
116
+ \mathbf{x},
117
+
118
+ where :math:`T_k` are Chebyshev polynomials computed recursively
119
+ (Eq. 12) and :math:`\theta_k` are learnable coefficients.
120
+ - A :math:`1 \times 1` convolution (linear projection) that mixes
121
+ the concatenated Chebyshev components, mapping each node's input
122
+ features to ``n_filters`` output features.
123
+
124
+ "Following the graph filtering operation is a :math:`1 \times 1`
125
+ convolution layer, which aims to learn the discriminative features
126
+ among the various frequency domains" [dgcnn]_.
127
+
128
+ - **Activation layer.** ReLU with a learnable per-feature bias ensures
129
+ non-negative outputs of the graph filtering layer [dgcnn]_.
130
+
131
+ - **Classifier Head.**
132
+ Flatten all node features and classify via a multi-layer fully
133
+ connected network with dropout and softmax.
134
+
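The adjacency initialization (Eq. 1), the ReLU constraint, and the normalized Laplacian (Eq. 2) listed above can be sketched in plain Python. This is a minimal illustration, not braindecode's implementation: the function names, the toy electrode coordinates, the value of `rho`, and the choice to symmetrize the kNN graph are all assumptions made for the example.

```python
import math


def gaussian_knn_adjacency(positions, k, rho):
    """Gaussian-kernel weights, kept only for each node's k nearest neighbours."""
    n = len(positions)
    dist = [[math.dist(p, q) for q in positions] for p in positions]
    w = [[0.0] * n for _ in range(n)]
    for i in range(n):
        # Nearest neighbours of node i, excluding i itself.
        others = (j for j in range(n) if j != i)
        neighbours = sorted(others, key=lambda j: dist[i][j])[:k]
        for j in neighbours:
            weight = math.exp(-dist[i][j] ** 2 / (2 * rho ** 2))
            # Symmetrize so the graph is undirected (assumption of this sketch).
            w[i][j] = w[j][i] = weight
    return w


def normalized_laplacian(w):
    """L = I - D^{-1/2} W D^{-1/2}, after clamping negative entries (ReLU)."""
    n = len(w)
    w = [[max(0.0, v) for v in row] for row in w]
    degree = [sum(row) for row in w]
    inv_sqrt = [1.0 / math.sqrt(d) if d > 0 else 0.0 for d in degree]
    return [
        [(1.0 if i == j else 0.0) - inv_sqrt[i] * w[i][j] * inv_sqrt[j]
         for j in range(n)]
        for i in range(n)
    ]


# Four hypothetical electrodes: three close together, one far away.
positions = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (5.0, 5.0, 0.0)]
W = gaussian_knn_adjacency(positions, k=2, rho=1.0)
laplacian = normalized_laplacian(W)
```

In the model itself the entries of the adjacency matrix are trainable, so a construction like this would only provide the starting point; the ReLU clamp is what keeps the degrees non-negative as the entries are updated.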
135
+ .. rubric:: Graph Convolution Details
136
+
137
+ - **Spatial (graph structure).** The adjacency matrix encodes pairwise
138
+ relationships between EEG channels. It is initialized from 3-D
139
+ electrode positions using a Gaussian kernel with kNN sparsification
140
+ (Eq. 1), then *jointly optimized* with all other parameters. This
141
+ allows the model to discover functional connectivity patterns that
142
+ differ from the initial spatial layout. The spectral graph
143
+ convolution then propagates information across neighboring nodes
144
+ according to this learned graph topology.
145
+
146
+ - **Spectral (graph spectral domain).** The Chebyshev polynomial
147
+ approximation (Eq. 11) operates in the *graph spectral domain*
148
+ defined by the eigenvalues of the graph Laplacian. The :math:`K`-order
149
+ approximation acts as a localized graph filter: each node aggregates
150
+ information from its :math:`K`-hop neighborhood. This is analogous
151
+ to a band-pass filter in the graph frequency domain.
152
+
153
+ - **Temporal / Frequency.** No explicit temporal convolution or
154
+ frequency decomposition is performed within the network. In the
155
+ original paper, the input features per node are pre-extracted
156
+ frequency-band features (e.g., differential entropy from
157
+ :math:`\delta`, :math:`\theta`, :math:`\alpha`, :math:`\beta`,
158
+ :math:`\gamma` bands). When used with raw time series, the time
159
+ samples serve directly as node features.
160
+
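The recursive evaluation of the Chebyshev filter can be sketched without any eigendecomposition. The code below is an illustration under stated assumptions, not the library's layer: `matvec` and `chebyshev_filter` are hypothetical helper names, the rescaled Laplacian is passed in directly as `l_tilde`, the coefficients `theta` are fixed numbers rather than learned parameters, and each node carries a single scalar feature.

```python
def matvec(matrix, vector):
    """Plain matrix-vector product on nested lists."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def chebyshev_filter(l_tilde, x, theta):
    """Return sum_k theta[k] * T_k(l_tilde) x via the three-term recursion."""
    t_prev, t_curr = x, matvec(l_tilde, x)      # T_0 x = x and T_1 x = L~ x
    y = [theta[0] * v for v in t_prev]
    for k in range(1, len(theta)):
        y = [acc + theta[k] * v for acc, v in zip(y, t_curr)]
        # T_{k+1} x = 2 * L~ (T_k x) - T_{k-1} x
        t_next = [2 * a - b for a, b in zip(matvec(l_tilde, t_curr), t_prev)]
        t_prev, t_curr = t_curr, t_next
    return y


# Toy operator on a 4-node path graph (0 - 1 - 2 - 3), hypothetical values.
l_tilde = [
    [0.0, -0.5, 0.0, 0.0],
    [-0.5, 0.0, -0.5, 0.0],
    [0.0, -0.5, 0.0, -0.5],
    [0.0, 0.0, -0.5, 0.0],
]
x = [1.0, 0.0, 0.0, 0.0]                        # impulse on node 0

y_degree1 = chebyshev_filter(l_tilde, x, theta=[0.5, 0.25])
y_degree2 = chebyshev_filter(l_tilde, x, theta=[0.0, 0.0, 1.0])
```

The impulse illustrates locality: a filter whose highest polynomial degree is 1 leaves nodes 2 and 3 untouched, while a degree-2 term reaches node 2 but still not node 3.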
161
+ .. rubric:: Additional Comments
162
+
163
+ - **Dynamic vs. static graph.** Traditional GCNN methods fix the
164
+ adjacency matrix before training based on spatial positions.
165
+ DGCNN learns it end-to-end, allowing the graph to capture
166
+ task-relevant functional connectivity rather than mere spatial
167
+ proximity.
168
+ - **Chebyshev order.** The order :math:`K` controls the receptive
169
+ field on the graph: :math:`K=1` uses only direct neighbors,
170
+ :math:`K=2` (default) reaches 2-hop neighborhoods. Higher orders
171
+ increase expressivity but also parameter count.
172
+ - **Regularization.** Dropout in the classification head and the
173
+ ReLU constraint on the adjacency matrix provide implicit
174
+ regularization. The loss function in the original paper also
175
+ includes an explicit :math:`\ell_2` penalty on all parameters
176
+ (Eq. 14).
177
+
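To make the parameter-count remark concrete, the following back-of-the-envelope sketch counts weights under the layout described in this card: the Chebyshev components are concatenated and projected to ``n_filters`` features per node, and the flattened node features feed the fully connected head. The helper names are hypothetical, biases are ignored, and the actual braindecode layers may be organized differently.

```python
def graph_conv_weights(n_in_features, cheb_order, n_filters):
    """One weight per (Chebyshev term, input feature, output filter)."""
    return cheb_order * n_in_features * n_filters


def mlp_weights(n_chans, n_filters, mlp_dims, n_outputs):
    """Weights of a dense head fed with the flattened node features."""
    sizes = [n_chans * n_filters, *mlp_dims, n_outputs]
    return sum(a * b for a, b in zip(sizes, sizes[1:]))


# Raw time series as node features: 1000 samples per channel (assumed).
conv_k2 = graph_conv_weights(n_in_features=1000, cheb_order=2, n_filters=64)
conv_k3 = graph_conv_weights(n_in_features=1000, cheb_order=3, n_filters=64)
head = mlp_weights(n_chans=22, n_filters=64, mlp_dims=(256,), n_outputs=4)
print(conv_k2, conv_k3, head)
```

Under these assumptions the graph-convolution weights grow linearly with the Chebyshev order, while the size of the classification head depends only on the number of channels, ``n_filters``, and ``mlp_dims``.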
178
+ Parameters
179
+ ----------
180
+ chs_info : list of dict, optional
181
+ Information about each channel, typically obtained from
182
+ ``mne.Info['chs']``. Each entry must contain a ``'loc'``
183
+ key with 3-D electrode positions so the initial adjacency
184
+ matrix can be built from spatial proximity (Eq. 1). A montage
185
+ must be set on the ``mne.Info`` object (see
186
+ :meth:`mne.Info.set_montage`). If ``None`` or positions
187
+ cannot be extracted, a ``ValueError`` is raised (see Notes).
188
+ n_filters : int, default=64
189
+ Number of spectral graph-convolutional filters. This is the
190
+ output feature dimension per node produced by the Chebyshev
191
+ graph convolution followed by the :math:`1 \times 1`
192
+ convolution (see Fig. 2 in the paper). The original code
193
+ uses 64.
194
+ cheb_order : int, default=2
195
+ Order :math:`K` of the Chebyshev polynomial approximation
196
+ (Eq. 11).
197
+ n_neighbors : int, default=5
198
+ Number of spatial nearest neighbors per node used to build the
199
+ initial adjacency matrix (Eq. 1).
200
+ mlp_dims : tuple[int, ...], default=(256,)
201
+ Hidden-layer sizes of the fully connected classification head.
202
+ activation : type[nn.Module], default=nn.ReLU
203
+ Activation function class used after the graph convolution and
204
+ in the classification head.
205
+ drop_prob : float, default=0.5
206
+ Dropout probability in the classification head.
207
+
208
+ References
209
+ ----------
210
+ .. [dgcnn] Song, T., Zheng, W., Song, P., & Cui, Z. (2018). EEG emotion
211
+ recognition using dynamical graph convolutional neural networks.
212
+ IEEE Transactions on Affective Computing, 11(3), 532-541.
213
+ https://doi.org/10.1109/TAFFC.2018.2817622
214
+
215
+ .. rubric:: Hugging Face Hub integration
216
+
217
+ When the optional ``huggingface_hub`` package is installed, all models
218
+ automatically gain the ability to be pushed to and loaded from the
219
+ Hugging Face Hub. Install with::
220
+
221
+ pip install braindecode[hub]
222
+
223
+ **Pushing a model to the Hub:**
224
+
225
+ .. code::
226
+ from braindecode.models import DGCNN
227
+
228
+ # Train your model
229
+ model = DGCNN(n_chans=22, n_outputs=4, n_times=1000)
230
+ # ... training code ...
231
+
232
+ # Push to the Hub
233
+ model.push_to_hub(
234
+     repo_id="username/my-dgcnn-model",
235
+     commit_message="Initial model upload",
236
+ )
237
+
238
+ **Loading a model from the Hub:**
239
+
240
+ .. code::
241
+ from braindecode.models import DGCNN
242
+
243
+ # Load pretrained model
244
+ model = DGCNN.from_pretrained("username/my-dgcnn-model")
245
+
246
+ # Load with a different number of outputs (head is rebuilt automatically)
247
+ model = DGCNN.from_pretrained("username/my-dgcnn-model", n_outputs=4)
248
+
249
+ **Extracting features and replacing the head:**
250
+
251
+ .. code::
252
+ import torch
253
+
254
+ x = torch.randn(1, model.n_chans, model.n_times)
255
+ # Extract encoder features (consistent dict across all models)
256
+ out = model(x, return_features=True)
257
+ features = out["features"]
258
+
259
+ # Replace the classification head
260
+ model.reset_head(n_outputs=10)
261
+
262
+ **Saving and restoring full configuration:**
263
+
264
+ .. code::
265
+ import json
266
+
267
+ config = model.get_config() # all __init__ params
268
+ with open("config.json", "w") as f:
269
+     json.dump(config, f)
270
+
271
+ model2 = DGCNN.from_config(config) # reconstruct (no weights)
272
+
273
+ All model parameters (both EEG-specific and model-specific such as
274
+ dropout rates, activation functions, number of filters) are automatically
275
+ saved to the Hub and restored when loading.
276
+
277
+ See :ref:`load-pretrained-models` for a complete tutorial.</main>
278
+ </div>
279
+
280
+ ## Citation
281
+
282
+ Please cite both the original paper for this architecture (see the
283
+ *References* section above) and braindecode:
284
+
285
+ ```bibtex
286
+ @article{aristimunha2025braindecode,
287
+ title = {Braindecode: a deep learning library for raw electrophysiological data},
288
+ author = {Aristimunha, Bruno and others},
289
+ journal = {Zenodo},
290
+ year = {2025},
291
+ doi = {10.5281/zenodo.17699192},
292
+ }
293
+ ```
294
+
295
+ ## License
296
+
297
+ BSD-3-Clause for the model code (matching braindecode).
298
+ Pretraining-derived weights, if you fine-tune from a checkpoint,
299
+ inherit the license of that checkpoint and its training corpus.