LuMamba: Latent Unified Mamba for Electrode Topology-Invariant and Efficient EEG Modeling
LuMamba (Latent Unified Mamba) is an EEG foundation model built on efficient Mamba state-space learning, capable of handling heterogeneous channel topologies. LuMamba addresses varying channel layouts with LUNA channel unification, projecting a given EEG channel layout to a fixed latent topology, and overcomes the quadratic complexity of transformers with FEMBA's efficient bidirectional Mamba encoder.
License & Usage Policy (Weights)
Weights license: The released model weights are licensed under Creative Commons Attribution-NoDerivatives 4.0 (CC BY-ND 4.0). This section summarizes the practical implications for users. This is not legal advice; please read the full license text.
You may
- Use and redistribute the unmodified LuMamba weights (including in commercial settings) with proper attribution to the LuMamba authors.
- Fine-tune / adapt the weights for your internal use (research or production) without redistributing the modified weights.
- Publish your code, configs, logs, and papers describing experiments with LuMamba (please cite the paper).
You may not
- Share, host, or redistribute any modified weights (including LoRA/adapter/delta checkpoints or pruned/quantized variants). Any parameter set that encodes an adaptation is considered a derivative and cannot be shared under CC BY-ND 4.0.
- Imply endorsement by the LuMamba authors for any derivative or evaluation without our written permission.
- Use the LuMamba name in a way that suggests your modified model is an official LuMamba release.
How to contribute improvements (PR-gated releases)
We welcome community improvements via a pull-request (PR) workflow. If you believe your improvements should become an official LuMamba release:
- Open a PR in the BioFoundation repository describing the change (architecture/head/training recipe, datasets, preprocessing, compute).
- Include reproducibility artifacts: configs, seeds, scripts, environment details, training/validation logs, and the evaluation protocol (e.g., TUAB/TUAR/TUSL) with exact splits.
- Provide comprehensive results (AUROC/AUPR/BA, FLOPs, memory) vs. the baselines reported in the LuMamba paper.
- After maintainer review, approved changes will be retrained/validated and, if accepted, released by the maintainers as a new official LuMamba checkpoint under CC BY-ND 4.0.
Rationale: CC BY-ND protects users from fragmented, lower-quality "LuMamba variants," while still enabling internal fine-tuning and a path for the community to upstream improvements through review.
Model Summary
- Goal: Efficient and topology-agnostic EEG modeling with linear complexity in sequence length.
- Core idea: The Channel-Unification Module uses learned queries (Q) with cross-attention to map any set of channels to a fixed latent space. Bidirectional Mamba blocks then operate on that latent sequence.
- Pre-training data: TUEG, >21,000 hours of raw EEG; downstream subjects removed to avoid leakage.
- Downstream tasks: TUAB (abnormal), TUAR (artifacts), TUSL (slowing), SEED-V (emotion; unseen 62-ch montage), APAVA (Alzheimer's disease; unseen 16-ch layout), TDBrain (Parkinson's disease; unseen 26-ch layout).
Model Variants
The model currently exists in a Tiny Variant, with the following parameters:
| Variant | Parameters | FEMBA parameters | LUNA parameters |
|---|---|---|---|
| LuMamba_tiny | 4.1M | (num_blocks = 2, exp = 2) | (num_queries = 6, embed_dim = 64) |
Larger models can be obtained by increasing the number of bi-Mamba blocks, `num_blocks` (e.g., 8 bi-Mamba blocks yield 15M parameters).
Results (Highlights)
- TUAB (abnormal vs. normal): 80.99% Bal. Acc., 0.883 AUROC, 0.892 AUPR (LuMamba-Tiny, pre-trained with LeJEPA-reconstruction).
- APAVA (Alzheimer's detection): 0.955 AUROC, 0.970 AUPR (LuMamba-Tiny, pre-trained with LeJEPA-reconstruction).
- TDBrain (Parkinson's detection): 0.961 AUROC, 0.960 AUPR (LuMamba-Tiny, pre-trained with LeJEPA-reconstruction).
Efficiency: Up to 377× fewer FLOPs than transformer-based baselines, and support for up to 500× longer EEG windows, thanks to the efficient FEMBA bi-Mamba encoder.
Intended Use & Limitations
Intended use. Research on EEG representation learning & classification (abnormality, artifacts, slowing, emotion), especially when montages vary or channel counts are high.
Limitations.
- Not a medical device. Do not use for clinical decisions without proper validation & regulatory clearance.
- Unseen topologies: Zero-shot transfer to very different/dense layouts (e.g., SEED-V) can underperform SOTA despite positive scaling; consider augmenting pre-training montage diversity and spatial encodings.
- Distribution shifts: Performance varies across cohorts, devices, and label protocols; validate locally and consider domain adaptation.
Architecture & Training
LUNA Tokenizer & features. EEG is segmented into patches. Temporal features come from a 1D convolution with GroupNorm and GELU; frequency features (FFT magnitude/phase → MLP) are added; 3D electrode coordinates are encoded with NeRF-style sinusoids followed by an MLP (positional encoding).
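The electrode positional encoding can be illustrated with a minimal sketch of NeRF-style sinusoidal features. The function name `nerf_encode`, the number of frequency bands, and the example coordinate are illustrative assumptions rather than values taken from the LuMamba code, and the MLP that follows the encoding is omitted.

```python
import math


def nerf_encode(xyz, num_freqs=4):
    """Map a 3D coordinate to sin/cos features at octave-spaced frequencies.

    num_freqs is a hypothetical choice; the value used by LUNA is not stated here.
    """
    feats = []
    for coord in xyz:
        for k in range(num_freqs):
            angle = (2.0 ** k) * math.pi * coord
            feats.append(math.sin(angle))
            feats.append(math.cos(angle))
    return feats


# Hypothetical electrode position (not an entry from a real montage file).
electrode = (0.0, 0.5, 1.0)
encoding = nerf_encode(electrode)
print(len(encoding))  # 3 coordinates * 4 frequencies * 2 (sin, cos) = 24
```

Each electrode thus gets a fixed-length vector regardless of how many electrodes the montage contains, which is what lets the same encoder handle different layouts.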
LUNA Channel-Unification Module. Q learned queries cross-attend to channel-wise patch features to produce a fixed Q×E latent per patch; FFN + Transformer layers refine the query tokens. Complexity is O(Q·C) (linear in channels).
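The shape behaviour of channel unification can be sketched in a few lines of dependency-free Python. This is a simplified illustration, not the LUNA implementation: random vectors stand in for the learned queries, the key/value projections, multi-head split, and FFN/Transformer refinement are left out, and `unify_channels` is a hypothetical name.

```python
import math
import random


def softmax(scores):
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def unify_channels(queries, channel_feats):
    """Cross-attend Q query vectors over C channel features; returns a Q x E latent."""
    embed_dim = len(queries[0])
    scale = 1.0 / math.sqrt(embed_dim)
    latent = []
    for q in queries:
        # One score per channel, so the cost grows as Q * C rather than C * C.
        scores = [scale * sum(a * b for a, b in zip(q, feat)) for feat in channel_feats]
        weights = softmax(scores)
        latent.append([
            sum(w * feat[d] for w, feat in zip(weights, channel_feats))
            for d in range(embed_dim)
        ])
    return latent


rng = random.Random(0)
Q, E = 6, 64  # num_queries and embed_dim of LuMamba_tiny
queries = [[rng.gauss(0, 1) for _ in range(E)] for _ in range(Q)]  # stand-in for learned queries

shapes = {}
for num_channels in (16, 26, 62):  # channel counts of the unseen downstream layouts
    feats = [[rng.gauss(0, 1) for _ in range(E)] for _ in range(num_channels)]
    latent = unify_channels(queries, feats)
    shapes[num_channels] = (len(latent), len(latent[0]))
print(shapes)
```

Whatever the channel count, the latent keeps the same Q×E shape, so the temporal encoder downstream never sees the original montage size.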
FEMBA Bi-Mamba Temporal encoder. Mamba blocks process the embeddings in separate forward and backward streams.
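As a rough illustration of the two-stream idea, the toy below runs a scalar linear recurrence left-to-right and right-to-left over the same sequence. It is a sketch under strong assumptions: a real Mamba block uses input-dependent (selective) state-space parameters and vector-valued states, and summing the two streams is a choice made here for illustration, not a statement about how FEMBA merges them.

```python
def linear_scan(xs, decay=0.5, gain=1.0):
    """Toy recurrence h_t = decay * h_{t-1} + gain * x_t, computed in O(len(xs))."""
    state = 0.0
    out = []
    for x in xs:
        state = decay * state + gain * x
        out.append(state)
    return out


def bidirectional_scan(xs):
    forward = linear_scan(xs)
    backward = linear_scan(xs[::-1])[::-1]  # scan right-to-left, then restore order
    # Summing the streams is an assumption of this sketch.
    return [f + b for f, b in zip(forward, backward)]


impulse = [0.0, 0.0, 1.0, 0.0, 0.0]
print(linear_scan(impulse))         # [0.0, 0.0, 1.0, 0.5, 0.25]
print(bidirectional_scan(impulse))  # [0.25, 0.5, 2.0, 0.5, 0.25]
```

The forward stream alone only propagates the impulse to later positions; adding the backward stream gives every position context from both past and future while keeping the cost linear in sequence length.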
Pre-training objectives. Masked-patch reconstruction trains the model to recover masked tokens. In parallel, the LeJEPA loss encourages an isotropic Gaussian embedding distribution to minimize downstream prediction risk.
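A minimal sketch of the reconstruction term, assuming a mean-squared error computed only over masked patches. The loss form, patch sizes, masking ratio, and the name `masked_mse` are assumptions for illustration; the LeJEPA term is not shown.

```python
import random


def masked_mse(target, prediction, mask):
    """Mean squared error over masked patches only (mask[i] is True where patch i was hidden)."""
    total, count = 0.0, 0
    for tgt_patch, pred_patch, is_masked in zip(target, prediction, mask):
        if not is_masked:
            continue
        for t, p in zip(tgt_patch, pred_patch):
            total += (t - p) ** 2
            count += 1
    return total / count if count else 0.0


rng = random.Random(0)
num_patches, patch_len, mask_ratio = 8, 4, 0.5  # hypothetical sizes and masking ratio
patches = [[rng.gauss(0, 1) for _ in range(patch_len)] for _ in range(num_patches)]
masked_ids = set(rng.sample(range(num_patches), int(mask_ratio * num_patches)))
mask = [i in masked_ids for i in range(num_patches)]

# Stand-in "model" that predicts zeros for every patch.
prediction = [[0.0] * patch_len for _ in range(num_patches)]
loss = masked_mse(patches, prediction, mask)
print(sum(mask), loss)
```

Visible patches contribute nothing to this loss, so the model is only rewarded for inferring hidden content from context.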
How to Use
LuMamba weights are organized by pre-training configuration:
- Reconstruction-only: variants pre-trained with masked reconstruction exclusively.
- LeJEPA-reconstruction: variants pre-trained with a balanced mixture of masked reconstruction and LeJEPA losses. Variants exist for two different LeJEPA hyperparameters: 128 and 300 projection slices.
- LeJEPA-only: variant pre-trained with LeJEPA exclusively.
All variants are pre-trained on TUEG.
LuMamba experiments are categorized by two Hydra configurations, in BioFoundation/config/experiments:
- `LuMamba_finetune.yaml`: configuration for fine-tuning experiments.
- `LuMamba_pretrain.yaml`: configuration for pre-training experiments.
Fine-tuning: General Checklist
- Install & read data prep: clone the BioFoundation repo, set up the environment as described there, then open `make_datasets/README.md` for dataset-specific notes (naming, expected folder layout, and common pitfalls).
- Point to weights: set `pretrained_safetensors_path: /path/to/LuMamba_*.safetensors` in the experiment YAML.
- Preprocess data: acquire the fine-tuning dataset and follow the preprocessing protocol (see guide in `/make_datasets/README.md`) to generate the `train`/`test`/`val` `.h5` files.
- Update data module of `LuMamba_finetune.yaml` config:
  - TUH datasets (TUAB/TUSL/TUAR): change `_target_` in `/data_module:` to `datasets.tuh_dataset.TUH_Dataset`.
  - Other: change `/data_module:_target_` to the corresponding dataset.py file in `BioFoundation/datasets` (e.g., for the TDBrain dataset use `_target_: datasets.tdbrain_dataset.TDBrain_Dataset`).
  - HDF5 file location: change `/data_module:hdf5_file` for `train`, `test`, and `val` to the path of the corresponding HDF5 data split file.
- Task settings:
  - Task type: override with `/task:finetune_task_LUNA` for classification and `/task:finetune_regression_task_LuMamba` for regression tasks.
  - Classification type: set `classification_type` (`bc`, `mcc`) and `model.num_classes` to match your downstream task. In a regression scenario, `mcc` is used and `model.num_classes` describes the number of features in the output.
  - Classifier choice: set `/model:classifier_option` (`mamba` for FEMBA classifier, `linear` for single-layer linear classifier, `null` for default LUNA classifier).
  - Configuration file includes further `#CHANGEME` tags and instructions for a working example.
- Env vars: export `DATA_PATH` (dataset root) and `CHECKPOINT_DIR` (artifacts).
- Trainer/optimizer: adjust `gpus/devices`, `batch_size`, `max_epochs`, LR/scheduler if needed.
- I/O: set `io.base_output_path` and confirm `io.checkpoint_dirpath` exists.
To launch fine-tuning (Hydra):
python -u run_train.py +experiment=LuMamba_finetune
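Before launching, it can help to confirm that the environment variables named in the checklist are set. The helper below is a hypothetical pre-flight sketch, not part of the BioFoundation repository; it only inspects the environment and prints the launch command shown above without running it.

```python
import os
import shlex

REQUIRED_ENV = ("DATA_PATH", "CHECKPOINT_DIR")  # variables named in the checklist


def missing_env(env):
    """Return the required variables that are unset or empty in env."""
    return [name for name in REQUIRED_ENV if not env.get(name)]


def launch_command(experiment="LuMamba_finetune"):
    """Build the Hydra launch command as an argument list."""
    return ["python", "-u", "run_train.py", f"+experiment={experiment}"]


missing = missing_env(os.environ)
if missing:
    print("Set before launching:", ", ".join(missing))
else:
    print("Ready:", shlex.join(launch_command()))
```

The argument list returned by `launch_command()` can be passed to `subprocess.run` from the repository root once the check passes.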
Responsible AI, Risks & Biases
- Clinical safety: research-only; human oversight required.
- Bias & drift: montage/device/population differences can induce shifts; validate and monitor.
- Artifacts & rare events: robustness varies; use QC and task-appropriate preprocessing.
Sources
- Code: https://github.com/pulp-bio/BioFoundation
- Paper: LuMamba: Latent Unified Mamba for Electrode Topology-Invariant and Efficient EEG Modeling (arxiv:2603.19100)
Citation
If you use LuMamba, please cite:
@misc{broustail2026lumambalatentunifiedmamba,
title={LuMamba: Latent Unified Mamba for Electrode Topology-Invariant and Efficient EEG Modeling},
author={Danaé Broustail and Anna Tegon and Thorir Mar Ingolfsson and Yawei Li and Luca Benini},
year={2026},
eprint={2603.19100},
archivePrefix={arXiv},
primaryClass={cs.AI},
url={https://arxiv.org/abs/2603.19100},
}
Maintenance & Contact
- Issues & support: please open a GitHub issue in the BioFoundation repository.
Changelog
- v1.0: Initial release of LuMamba model card with task-specific checkpoints and instructions.