@@ -15,13 +15,12 @@ tags:
 # AttnSleep
-Sleep Staging Architecture from Eldele et al  (2021) .
-> **Architecture-only repository.** This repo documents the
 > `braindecode.models.AttnSleep` class. **No pretrained weights are
-> distributed here** — instantiate the model and train it on your own
-> data, or fine-tune from a published foundation-model checkpoint
-> separately.
 ## Quick start
@@ -40,158 +39,48 @@ model = AttnSleep(
 )
 ```
-The signal-shape arguments above are example defaults — adjust them
-to match your recording.
 ## Documentation
-- Full API reference (parameters, references, architecture figure):
-  <https://braindecode.org/stable/generated/braindecode.models.AttnSleep.html>
-- Interactive browser with live instantiation:
   <https://huggingface.co/spaces/braindecode/model-explorer>
 - Source on GitHub: <https://github.com/braindecode/braindecode/blob/master/braindecode/models/attn_sleep.py#L18>
-## Architecture description
-The block below is the rendered class docstring (parameters,
-references, architecture figure where available).
-<div class='bd-doc'><main>
-<p>Sleep Staging Architecture from Eldele et al  (2021) [Eldele2021]_.</p>
-<span style="display:inline-block;padding:2px 8px;border-radius:4px;background:#5cb85c;color:white;font-size:11px;font-weight:600;margin-right:4px;">Convolution</span><span style="display:inline-block;padding:2px 8px;border-radius:4px;background:#56B4E9;color:white;font-size:11px;font-weight:600;margin-right:4px;">Attention/Transformer</span>
- .. figure:: https://raw.githubusercontent.com/emadeldeen24/AttnSleep/refs/heads/main/imgs/AttnSleep.png
-     :align: center
-     :alt: AttnSleep Architecture
- Attention based Neural Net for sleep staging as described in [Eldele2021]_.
- The code for the paper and this model is also available at [1]_.
- Takes single channel EEG as input.
- Feature extraction module based on multi-resolution convolutional neural network (MRCNN)
- and adaptive feature recalibration (AFR).
- The second module is the temporal context encoder (TCE) that leverages a multi-head attention
- mechanism to capture the temporal dependencies among the extracted features.
- Warning - This model was designed for signals of 30 seconds at 100Hz or 125Hz (in which case
- the reference architecture from [1]_ which was validated on SHHS dataset [2]_ will be used)
- to use any other input is likely to make the model perform in unintended ways.
- Parameters
- ----------
- n_tce : int
-     Number of TCE clones.
- d_model : int
-     Input dimension for the TCE.
-     Also the input dimension of the first FC layer in the feed forward
-     and the output of the second FC layer in the same.
-     Increase for higher sampling rate/signal length.
-     It should be divisible by n_attn_heads
- d_ff : int
-     Output dimension of the first FC layer in the feed forward and the
-     input dimension of the second FC layer in the same.
- n_attn_heads : int
-     Number of attention heads. It should be a factor of d_model
- drop_prob : float
-     Dropout rate in the PositionWiseFeedforward layer and the TCE layers.
- after_reduced_cnn_size : int
-     Number of output channels produced by the convolution in the AFR module.
- return_feats : bool
-     If True, return the features, i.e. the output of the feature extractor
-     (before the final linear layer). If False, pass the features through
-     the final linear layer.
- n_classes : int
-     Alias for `n_outputs`.
- input_size_s : float
-     Alias for `input_window_seconds`.
- activation : nn.Module, default=nn.ReLU
-     Activation function class to apply. Should be a PyTorch activation
-     module class like ``nn.ReLU`` or ``nn.ELU``. Default is ``nn.ReLU``.
- activation_mrcnn : nn.Module, default=nn.ReLU
-     Activation function class to apply in the Mask R-CNN layer.
-     Should be a PyTorch activation module class like ``nn.ReLU`` or
-     ``nn.GELU``. Default is ``nn.GELU``.
- References
- ----------
- .. [Eldele2021] E. Eldele et al., "An Attention-Based Deep Learning Approach for Sleep Stage
-     Classification With Single-Channel EEG," in IEEE Transactions on Neural Systems and
-     Rehabilitation Engineering, vol. 29, pp. 809-818, 2021, doi: 10.1109/TNSRE.2021.3076234.
- .. [1] https://github.com/emadeldeen24/AttnSleep
- .. [2] https://sleepdata.org/datasets/shhs
- .. rubric:: Hugging Face Hub integration
- When the optional ``huggingface_hub`` package is installed, all models
- automatically gain the ability to be pushed to and loaded from the
- Hugging Face Hub. Install with::
-     pip install braindecode[hub]
- **Pushing a model to the Hub:**
- .. code::
-     from braindecode.models import AttnSleep
-     # Train your model
-     model = AttnSleep(n_chans=22, n_outputs=4, n_times=1000)
-     # ... training code ...
-     # Push to the Hub
-     model.push_to_hub(
-         repo_id="username/my-attnsleep-model",
-         commit_message="Initial model upload",
-     )
- **Loading a model from the Hub:**
- .. code::
-     from braindecode.models import AttnSleep
-     # Load pretrained model
-     model = AttnSleep.from_pretrained("username/my-attnsleep-model")
-     # Load with a different number of outputs (head is rebuilt automatically)
-     model = AttnSleep.from_pretrained("username/my-attnsleep-model", n_outputs=4)
- **Extracting features and replacing the head:**
- .. code::
-     import torch
-     x = torch.randn(1, model.n_chans, model.n_times)
-     # Extract encoder features (consistent dict across all models)
-     out = model(x, return_features=True)
-     features = out["features"]
-     # Replace the classification head
-     model.reset_head(n_outputs=10)
- **Saving and restoring full configuration:**
- .. code::
-     import json
-     config = model.get_config()            # all __init__ params
-     with open("config.json", "w") as f:
-         json.dump(config, f)
-     model2 = AttnSleep.from_config(config)    # reconstruct (no weights)
- All model parameters (both EEG-specific and model-specific such as
- dropout rates, activation functions, number of filters) are automatically
- saved to the Hub and restored when loading.
- See :ref:`load-pretrained-models` for a complete tutorial.</main>
-</div>
 ## Citation
-Please cite both the original paper for this architecture (see the
-*References* section above) and braindecode:
 ```bibtex
 @article{aristimunha2025braindecode,

 # AttnSleep
+Sleep staging architecture from Eldele et al. (2021) [Eldele2021].
+> **Architecture-only repository.** Documents the
 > `braindecode.models.AttnSleep` class. **No pretrained weights are
+> distributed here.** Instantiate the model and train it on your own
+> data.
 ## Quick start
 )
 ```
+The signal-shape arguments above are illustrative defaults — adjust to
+match your recording.
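The model was designed for 30-second windows sampled at 100 Hz or 125 Hz, so the number of samples per window follows from the window length and the sampling rate. A minimal sketch of that arithmetic, assuming a hypothetical helper `n_times_from_window` (it is not part of braindecode):

```python
def n_times_from_window(input_window_seconds: float, sfreq: float) -> int:
    """Return the number of samples in one input window.

    Hypothetical helper for illustration only; not a braindecode function.
    """
    n_times = input_window_seconds * sfreq
    # Reject combinations that do not yield a whole number of samples.
    if abs(n_times - round(n_times)) > 1e-9:
        raise ValueError(
            f"{input_window_seconds} s at {sfreq} Hz is not a whole number of samples"
        )
    return int(round(n_times))


# The two configurations the architecture was designed for.
print(n_times_from_window(30, 100))  # 3000
print(n_times_from_window(30, 125))  # 3750
```

Passing a consistent window length, sampling rate, and sample count when instantiating the model avoids a silent shape mismatch with the recording.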
 ## Documentation
+- Full API reference: <https://braindecode.org/stable/generated/braindecode.models.AttnSleep.html>
+- Interactive browser (live instantiation, parameter counts):
   <https://huggingface.co/spaces/braindecode/model-explorer>
 - Source on GitHub: <https://github.com/braindecode/braindecode/blob/master/braindecode/models/attn_sleep.py#L18>
+## Architecture
+![AttnSleep architecture](https://raw.githubusercontent.com/emadeldeen24/AttnSleep/refs/heads/main/imgs/AttnSleep.png)
+## Parameters
+| Parameter | Type | Description |
+|---|---|---|
+| `n_tce` | int | Number of TCE clones. |
+| `d_model` | int | Input dimension for the TCE. Also the input dimension of the first FC layer in the feed-forward block and the output dimension of its second FC layer. Increase for a higher sampling rate or a longer signal. Must be divisible by `n_attn_heads`. |
+| `d_ff` | int | Output dimension of the first FC layer in the feed-forward block and the input dimension of its second FC layer. |
+| `n_attn_heads` | int | Number of attention heads. Must be a factor of `d_model`. |
+| `drop_prob` | float | Dropout rate in the PositionWiseFeedforward layer and the TCE layers. |
+| `after_reduced_cnn_size` | int | Number of output channels produced by the convolution in the AFR module. |
+| `return_feats` | bool | If True, return the features, i.e. the output of the feature extractor (before the final linear layer). If False, pass the features through the final linear layer. |
+| `n_classes` | int | Alias for `n_outputs`. |
+| `input_size_s` | float | Alias for `input_window_seconds`. |
+| `activation` | nn.Module, default=nn.ReLU | Activation function class to apply. Should be a PyTorch activation module class like `nn.ReLU` or `nn.ELU`. Default is `nn.ReLU`. |
+| `activation_mrcnn` | nn.Module, default=nn.GELU | Activation function class to apply in the multi-resolution CNN (MRCNN) module. Should be a PyTorch activation module class like `nn.ReLU` or `nn.GELU`. Default is `nn.GELU`. |
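The table requires `d_model` to be divisible by `n_attn_heads`, because multi-head attention splits the model dimension evenly across heads. A small sketch of that check, assuming a hypothetical helper `head_dim` and illustrative values (neither is taken from the braindecode source):

```python
def head_dim(d_model: int, n_attn_heads: int) -> int:
    """Return the per-head feature size, or raise if the split is uneven.

    Hypothetical helper for illustration only; not a braindecode function.
    """
    if n_attn_heads <= 0:
        raise ValueError("n_attn_heads must be a positive integer")
    if d_model % n_attn_heads != 0:
        raise ValueError(
            f"d_model={d_model} is not divisible by n_attn_heads={n_attn_heads}"
        )
    return d_model // n_attn_heads


# Illustrative values: 80 features split across 5 heads gives 16 per head.
print(head_dim(80, 5))  # 16
```

Running a check like this before building the model turns a shape error deep inside the attention layer into an immediate, readable message.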
+## References
+1. E. Eldele et al., "An Attention-Based Deep Learning Approach for Sleep Stage Classification With Single-Channel EEG," in IEEE Transactions on Neural Systems and Rehabilitation Engineering, vol. 29, pp. 809-818, 2021, doi: 10.1109/TNSRE.2021.3076234.
+2. https://github.com/emadeldeen24/AttnSleep
+3. https://sleepdata.org/datasets/shhs
 ## Citation
+Cite the original architecture paper (see *References* above) and braindecode:
 ```bibtex
 @article{aristimunha2025braindecode,