---
license: cc-by-nd-4.0
language:
- en
tags:
- foundation-model
- neuroscience
- eeg
- mamba
---
<div align="center">
<img src="https://raw.githubusercontent.com/danaebroustail/BioFoundation_Danae/refs/heads/lumamba/docs/model/logo/LuMamba_logo.png" alt="LuMamba Logo" width="800"/>
<h1>LuMamba: Latent Unified Mamba for Electrode
Topology-Invariant and Efficient EEG Modeling</h1>
</div>
<p align="center">
<a href="https://github.com/pulp-bio/BioFoundation">
<img src="https://img.shields.io/github/stars/pulp-bio/BioFoundation?color=ccf" alt="Github">
</a>
<a href="https://creativecommons.org/licenses/by-nd/4.0/">
<img src="https://img.shields.io/badge/License-CC_BY--ND_4.0-lightgrey.svg" alt="License">
</a>
<a href="https://arxiv.org/abs/2603.19100">
<img src="https://img.shields.io/badge/arXiv-2603.19100-b31b1b.svg" alt="Paper">
</a>
</p>
**LuMamba** (Latent Unified Mamba) is an **EEG foundation model** built on efficient **Mamba state-space learning**, capable of handling **heterogeneous channel topologies**.
LuMamba addresses varying channel layouts with **LUNA channel unification**, projecting a given EEG channel layout to a **fixed latent topology**, and overcomes the quadratic complexity of transformers with **FEMBA**'s efficient **bidirectional Mamba encoder**.
---
## 🔒 License & Usage Policy (Weights)
**Weights license:** The released model weights are licensed under **Creative Commons Attribution–NoDerivatives 4.0 (CC BY-ND 4.0)**. This section summarizes the practical implications for users. *This is not legal advice; please read the full license text.*
### ✅ You may
- **Use** and **redistribute** the **unmodified** LuMamba weights (including in commercial settings) **with proper attribution** to the LuMamba authors.
- **Fine-tune / adapt** the weights **for your internal use** (research or production) **without redistributing** the modified weights.
- **Publish your code, configs, logs, and papers** describing experiments with LuMamba (please cite the paper).
### 🚫 You may not
- **Share, host, or redistribute any modified weights** (including LoRA/adapter/delta checkpoints or pruned/quantized variants). Any parameter set that encodes an adaptation is considered a derivative and cannot be shared under CC BY-ND 4.0.
- **Imply endorsement** by the LuMamba authors for any derivative or evaluation without our written permission.
- **Use the LuMamba name** in a way that suggests your modified model is an official LuMamba release.
### 🤝 How to contribute improvements (PR-gated releases)
We welcome community improvements via a **pull-request (PR)** workflow. If you believe your improvements should become an **official LuMamba release**:
1. **Open a PR** in the [BioFoundation repository](https://github.com/pulp-bio/BioFoundation) describing the change (architecture/head/training recipe, datasets, preprocessing, compute).
2. Include **reproducibility artifacts**: configs, seeds, scripts, environment details, training/validation logs, and the **evaluation protocol** (e.g., TUAB/TUAR/TUSL) with exact splits.
3. Provide **comprehensive results** (AUROC/AUPR/BA, FLOPs, memory) vs. the baselines reported in the LuMamba paper.
4. After **maintainer review**, approved changes will be **retrained/validated** and, if accepted, **released by the maintainers** as a new **official LuMamba** checkpoint under **CC BY-ND 4.0**.
> Rationale: CC BY-ND protects users from fragmented, lower-quality “LuMamba variants,” while still enabling internal fine-tuning and a path for the community to upstream improvements through review.
---
## 🔎 Model Summary
- **Goal:** Efficient and topology-agnostic EEG modeling with linear complexity in sequence length.
- **Core idea:** The **Channel-Unification Module** uses **learned queries** (Q) with **cross-attention** to map any set of channels to a fixed latent space. **Bidirectional Mamba blocks** then operate on that latent sequence.
- **Pre-training data:** TUEG, **>21,000 hours** of raw EEG; downstream subjects removed to avoid leakage.
- **Downstream tasks:** **TUAB** (abnormal), **TUAR** (artifacts), **TUSL** (slowing), **SEED-V** (emotion; unseen 62-ch montage), **APAVA** (Alzheimer's disease; unseen 16-ch layout), **TDBrain** (Parkinson's disease; unseen 26-ch layout).
---
## 🚀 Model Variants
The model currently exists in a Tiny Variant, with the following parameters:
| Variant | Parameters | FEMBA parameters | LUNA parameters |
|--------------|------------|-----------------------------|-------------------------------------|
| LuMamba_tiny | 4.1M | `num_blocks` = 2, `exp` = 2 | `num_queries` = 6, `embed_dim` = 64 |
Larger models can be obtained by increasing the number of bi-Mamba blocks, `num_blocks` (e.g., 8 bi-Mamba blocks yield 15M parameters).
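For a rough sense of how size scales, the two figures quoted above (4.1M at 2 blocks, 15M at 8 blocks) can be interpolated. The sketch below assumes the parameter count is affine in `num_blocks`, which the model card does not state; the helper names are hypothetical, and the results are estimates rather than reported numbers (the 15M figure is itself rounded).

```python
def fit_affine(point_a, point_b):
    """Fit params = base + per_block * num_blocks through two (blocks, params) points."""
    (n_a, p_a), (n_b, p_b) = point_a, point_b
    per_block = (p_b - p_a) / (n_b - n_a)
    base = p_a - per_block * n_a
    return base, per_block

# The two data points quoted in this model card, in millions of parameters.
BASE_M, PER_BLOCK_M = fit_affine((2, 4.1), (8, 15.0))

def estimate_params_m(num_blocks):
    """Estimated parameter count in millions (assumption: affine in num_blocks)."""
    return BASE_M + PER_BLOCK_M * num_blocks

for blocks in (2, 4, 8):
    print(f"{blocks} blocks: ~{estimate_params_m(blocks):.1f}M parameters")
```

Under this assumption each bi-Mamba block adds roughly 1.8M parameters on top of about 0.5M for everything else.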
---
## 📊 Results (Highlights)
- **TUAB (abnormal vs normal):** 80.99% Bal. Acc., 0.883 AUROC, 0.892 AUPR (LuMamba-Tiny, pre-trained with LeJEPA-reconstruction).
- **APAVA (Alzheimer's detection):** 0.955 AUROC, 0.970 AUPR (LuMamba-Tiny, pre-trained with LeJEPA-reconstruction).
- **TDBrain (Parkinson's detection):** 0.961 AUROC, 0.960 AUPR (LuMamba-Tiny, pre-trained with LeJEPA-reconstruction).
**Efficiency:** Up to **377× fewer FLOPs** than transformer-based baselines, with support for EEG windows up to **500× longer**, thanks to the efficient FEMBA bi-Mamba encoder.
---
## 🧠 Intended Use & Limitations
**Intended use.** Research on EEG representation learning & classification (abnormality, artifacts, slowing, emotion), especially when **montages vary** or **channel counts are high**.
**Limitations.**
- **Not a medical device.** Do **not** use for clinical decisions without proper validation & regulatory clearance.
- **Unseen topologies:** Zero-shot transfer to **very different/dense** layouts (e.g., SEED-V) can underperform SOTA despite positive scaling; consider augmenting pre-training montage diversity and spatial encodings.
- **Distribution shifts:** Performance varies across cohorts, devices, and label protocols; validate locally and consider domain adaptation.
---
## 🏗️ Architecture & Training
**LUNA Tokenizer & features.** EEG is patch-segmented; temporal features via 1D conv w/ GroupNorm+GELU; **frequency features** (FFT mag/phase → MLP) are added; 3D electrode coordinates encoded via **NeRF-style sinusoids → MLP** (positional enc).
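As an illustration of the electrode positional encoding, here is a minimal sketch of a NeRF-style sinusoidal encoding of one 3D coordinate. The function name, the number of frequencies (`num_freqs=4`), and the example coordinate are assumptions for illustration; the actual tokenizer passes such features through an MLP, which is omitted here.

```python
import math

def sinusoidal_encode(xyz, num_freqs=4):
    """Encode each coordinate with sin/cos at octave-spaced frequencies (2^k * pi)."""
    features = []
    for coord in xyz:
        for k in range(num_freqs):
            angle = (2.0 ** k) * math.pi * coord
            features.append(math.sin(angle))
            features.append(math.cos(angle))
    return features

# Hypothetical unit-sphere position for a vertex electrode.
vertex = (0.0, 0.0, 1.0)
encoding = sinusoidal_encode(vertex)
print(len(encoding))  # 3 coordinates * 4 frequencies * 2 functions = 24
```

Because the encoding depends only on coordinates, two montages that share an electrode position produce the same positional features for it, regardless of channel order or count.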
**LUNA Channel-Unification Module.** **Q learned queries** cross-attend to **channel-wise patch features** to produce a **fixed Q×E latent** per patch; FFN + Transformer layers refine the query tokens. Complexity is **O(Q·C)** (linear in channels).
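The key property of the unification step is that the output shape does not depend on the number of input channels. The sketch below shows this with a single-head cross-attention in plain Python. It is a simplification under stated assumptions: queries are random rather than learned, there are no learned key/value projections (channel features serve as both keys and values), and the FFN and Transformer refinement layers are omitted.

```python
import math
import random

def cross_attend(queries, channel_feats):
    """Each of the Q queries attends over all C channel feature vectors.

    Returns Q vectors of length E, whatever the value of C.
    """
    dim = len(queries[0])
    scale = 1.0 / math.sqrt(dim)
    latent = []
    for q in queries:
        # Scaled dot-product score between this query and every channel.
        scores = [scale * sum(qi * ki for qi, ki in zip(q, k)) for k in channel_feats]
        # Numerically stable softmax over the channel axis.
        peak = max(scores)
        weights = [math.exp(s - peak) for s in scores]
        total = sum(weights)
        weights = [w / total for w in weights]
        # Weighted sum of channel features (keys double as values in this sketch).
        latent.append(
            [sum(w * k[d] for w, k in zip(weights, channel_feats)) for d in range(dim)]
        )
    return latent

def random_matrix(rows, cols, rng):
    return [[rng.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]

rng = random.Random(0)
NUM_QUERIES, EMBED_DIM = 6, 64  # values from the LuMamba_tiny row above
queries = random_matrix(NUM_QUERIES, EMBED_DIM, rng)

for num_channels in (16, 26, 62):  # APAVA, TDBrain, SEED-V layouts
    feats = random_matrix(num_channels, EMBED_DIM, rng)
    latent = cross_attend(queries, feats)
    print(num_channels, "channels ->", len(latent), "x", len(latent[0]))
```

Each query computes one score per channel, so the cost per patch grows as Q·C, matching the linear-in-channels complexity stated above.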
**FEMBA Bi-Mamba Temporal encoder.** **Mamba blocks** process the embeddings in separate forward and backward streams.
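To illustrate the forward/backward idea only, the sketch below replaces the Mamba block with a fixed scalar linear recurrence, which is not what Mamba computes (Mamba uses input-dependent, selective state-space parameters). Summing the two streams is also an assumption; the model card does not say how they are combined.

```python
def linear_scan(xs, decay=0.5):
    """Causal recurrence h_t = decay * h_{t-1} + x_t (a stand-in for a Mamba block)."""
    state, out = 0.0, []
    for x in xs:
        state = decay * state + x
        out.append(state)
    return out

def bidirectional_scan(xs, decay=0.5):
    """Run the scan left-to-right and right-to-left, then merge the two streams."""
    forward = linear_scan(xs, decay)
    backward = linear_scan(xs[::-1], decay)[::-1]  # re-align to original time order
    # Assumption: streams are merged by summation.
    return [f + b for f, b in zip(forward, backward)]

impulse = [0.0, 0.0, 1.0, 0.0, 0.0]
print(linear_scan(impulse))         # [0.0, 0.0, 1.0, 0.5, 0.25]
print(bidirectional_scan(impulse))  # [0.25, 0.5, 2.0, 0.5, 0.25]
```

The forward stream only carries information to later time steps, while the bidirectional output lets every position see context on both sides, and the cost stays linear in sequence length.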
**Pre-training objectives.** **Masked-patch reconstruction** is used to reconstruct masked tokens. In parallel, the **LeJEPA loss** encourages an isotropic Gaussian embedding distribution to minimize downstream prediction risk.
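The masked-reconstruction idea can be sketched as a loss evaluated only on the patches that were hidden from the encoder. Mean squared error is used here as an assumption; the exact reconstruction loss, and the LeJEPA term, are defined in the paper and repository and are not reproduced in this sketch.

```python
def masked_mse(pred, target, mask):
    """Mean squared error over masked patches only.

    pred, target: lists of patches, each a list of floats
    mask: list of bools, True where the patch was hidden from the encoder
    """
    total, count = 0.0, 0
    for p, t, is_masked in zip(pred, target, mask):
        if is_masked:
            total += sum((pi - ti) ** 2 for pi, ti in zip(p, t))
            count += len(t)
    if count == 0:
        raise ValueError("mask selects no patches")
    return total / count

target = [[1.0, 1.0], [2.0, 2.0], [3.0, 3.0]]
pred = [[0.0, 0.0], [2.0, 4.0], [3.0, 3.0]]
mask = [False, True, False]
print(masked_mse(pred, target, mask))  # 2.0 (the error on the visible first patch is ignored)
```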
---
## 🔧 How to Use
LuMamba weights are organized by pre-training configuration:
- **`Reconstruction-only`** → variants pre-trained with masked reconstruction exclusively
- **`LeJEPA-reconstruction`** → variants pre-trained with a balanced mixture of masked reconstruction and LeJEPA losses. Variants exist for two different LeJEPA hyperparameters: 128 and 300 projection slices.
- **`LeJEPA-only`** → variant pre-trained with LeJEPA exclusively.
All variants are pre-trained on TUEG.
LuMamba experiments are categorized by two Hydra configurations, in `BioFoundation/config/experiments`:
- **`LuMamba_finetune.yaml`** → configuration for fine-tuning experiments.
- **`LuMamba_pretrain.yaml`** → configuration for pre-training experiments.
---
## 🔧 Fine-tuning — General Checklist
0. **Install & read data prep**: clone the [BioFoundation repo](https://github.com/pulp-bio/BioFoundation), set up the environment as described there, then open `make_datasets/README.md` for dataset-specific notes (naming, expected folder layout, and common pitfalls).
1. **Point to weights**: set `pretrained_safetensors_path: /path/to/LuMamba_*.safetensors` in the experiment YAML.
2. **Preprocess data**: acquire fine-tuning dataset and follow preprocessing protocol (see guide in `/make_datasets/README.md`) to generate `train/test/val.h5` files.
3. **Update data module of `LuMamba_finetune.yaml` config**:
- **TUH datasets (TUAB/TUSL/TUAR)** → change `_target_` in `/data_module:` to `datasets.tuh_dataset.TUH_Dataset`.
- **Other** → change `/data_module:_target_` to the corresponding dataset class in `BioFoundation/datasets` (e.g., for the TDBrain dataset use `_target_:datasets.tdbrain_dataset.TDBrain_Dataset`).
- **HDF5 file location** → change `/data_module:hdf5_file` for `train`, `test`, and `val` with the path to the corresponding HDF5 data split file.
4. **Task settings**:
- **Task type**: override with `/task:finetune_task_LUNA` for classification and `/task:finetune_regression_task_LuMamba` for regression tasks.
- **Classification type**: set `classification_type` (`bc`, `mcc`) and `model.num_classes` to match your downstream task. In a regression scenario, `mcc` is used and `model.num_classes` describes the number of features in the output.
- **Classifier choice**: set `/model:classifier_option` (`mamba` for the FEMBA classifier, `linear` for a single-layer linear classifier, `null` for the default LUNA classifier).
- The configuration file includes further `#CHANGEME` tags and instructions for a working example.
5. **Env vars**: export `DATA_PATH` (dataset root) and `CHECKPOINT_DIR` (artifacts).
6. **Trainer/optimizer**: adjust `gpus/devices`, `batch_size`, `max_epochs`, LR/scheduler if needed.
7. **I/O**: set `io.base_output_path` and confirm `io.checkpoint_dirpath` exists.
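Before launching, it can help to verify the environment variables and data splits from the checklist. The helper below is a hypothetical convenience script, not part of the BioFoundation repository, and the split file names are placeholders for whatever your preprocessing step produced.

```python
import os
from pathlib import Path

def preflight(env, split_files):
    """Return a list of human-readable problems; an empty list means ready to launch."""
    problems = []
    for var in ("DATA_PATH", "CHECKPOINT_DIR"):
        value = env.get(var)
        if not value:
            problems.append(f"{var} is not set")
        elif not Path(value).is_dir():
            problems.append(f"{var} points to a missing directory: {value}")
    for split, path in split_files.items():
        if not Path(path).is_file():
            problems.append(f"{split} split not found: {path}")
    return problems

if __name__ == "__main__":
    root = Path(os.environ.get("DATA_PATH", "."))
    # Hypothetical file names; point these at your own HDF5 splits.
    splits = {name: root / f"{name}.h5" for name in ("train", "val", "test")}
    for line in preflight(os.environ, splits) or ["all checks passed"]:
        print(line)
```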
To launch fine-tuning (Hydra):
```bash
python -u run_train.py +experiment=LuMamba_finetune
```
---
## ⚖️ Responsible AI, Risks & Biases
- **Clinical safety:** research-only; human oversight required.
- **Bias & drift:** montage/device/population differences can induce shifts; validate and monitor.
- **Artifacts & rare events:** robustness varies; use QC and task-appropriate preprocessing.
---
## 🔗 Sources
- **Code:** https://github.com/pulp-bio/BioFoundation
- **Paper:** LuMamba: Latent Unified Mamba for Electrode Topology-Invariant and Efficient EEG Modeling (arXiv:2603.19100)
---
## 📜 Citation
If you use LuMamba, please cite:
```bibtex
@misc{broustail2026lumambalatentunifiedmamba,
title={LuMamba: Latent Unified Mamba for Electrode Topology-Invariant and Efficient EEG Modeling},
author={Danaé Broustail and Anna Tegon and Thorir Mar Ingolfsson and Yawei Li and Luca Benini},
year={2026},
eprint={2603.19100},
archivePrefix={arXiv},
primaryClass={cs.AI},
url={https://arxiv.org/abs/2603.19100},
}
```
---
## 🛠️ Maintenance & Contact
- **Issues & support:** please open a GitHub issue in the BioFoundation repository.
---
## 🗒️ Changelog
- **v1.0:** Initial release of LuMamba model card with task-specific checkpoints and instructions.