LuMamba: Latent Unified Mamba for Electrode Topology-Invariant and Efficient EEG Modeling

LuMamba (Latent Unified Mamba) is an EEG foundation model built on efficient Mamba state-space learning, capable of handling heterogeneous channel topologies. LuMamba addresses varying channel layouts with LUNA channel unification, projecting a given EEG channel layout to a fixed latent topology, and overcomes the quadratic complexity of transformers with FEMBA's efficient bidirectional Mamba encoder.

🔒 License & Usage Policy (Weights)

Weights license: The released model weights are licensed under Creative Commons Attribution–NoDerivatives 4.0 (CC BY-ND 4.0). This section summarizes the practical implications for users. This is not legal advice; please read the full license text.

✅ You may

Use and redistribute the unmodified LuMamba weights (including in commercial settings) with proper attribution to the LuMamba authors.
Fine-tune / adapt the weights for your internal use (research or production) without redistributing the modified weights.
Publish your code, configs, logs, and papers describing experiments with LuMamba (please cite the paper).

🚫 You may not

Share, host, or redistribute any modified weights (including LoRA/adapter/delta checkpoints or pruned/quantized variants). Any parameter set that encodes an adaptation is considered a derivative and cannot be shared under CC BY-ND 4.0.
Imply endorsement by the LuMamba authors for any derivative or evaluation without our written permission.
Use the LuMamba name in a way that suggests your modified model is an official LuMamba release.

🤝 How to contribute improvements (PR-gated releases)

We welcome community improvements via a pull-request (PR) workflow. If you believe your improvements should become an official LuMamba release:

Open a PR in the BioFoundation repository describing the change (architecture/head/training recipe, datasets, preprocessing, compute).
Include reproducibility artifacts: configs, seeds, scripts, environment details, training/validation logs, and the evaluation protocol (e.g., TUAB/TUAR/TUSL) with exact splits.
Provide comprehensive results (AUROC/AUPR/BA, FLOPs, memory) vs. the baselines reported in the LuMamba paper.
After maintainer review, approved changes will be retrained/validated and, if accepted, released by the maintainers as a new official LuMamba checkpoint under CC BY-ND 4.0.

Rationale: CC BY-ND protects users from fragmented, lower-quality “LuMamba variants,” while still enabling internal fine-tuning and a path for the community to upstream improvements through review.

🔎 Model Summary

Goal: Efficient and topology-agnostic EEG modeling with linear complexity in sequence length.
Core idea: The Channel-Unification Module uses Q learned queries with cross-attention to map any set of channels to a fixed latent space. Bidirectional Mamba blocks then operate on that latent sequence.
Pre-training data: TUEG, >21,000 hours of raw EEG; downstream subjects removed to avoid leakage.
Downstream tasks: TUAB (abnormal), TUAR (artifacts), TUSL (slowing), SEED-V (emotion; unseen 62-ch montage), APAVA (Alzheimer's disease; unseen 16-ch layout), TDBrain (Parkinson's disease; unseen 26-ch layout).

🚀 Model Variants

The model currently exists in a Tiny Variant, with the following parameters:

| Variant | Parameters | FEMBA parameters | LUNA parameters |
|---|---|---|---|
| LuMamba_tiny | 4.1M | `num_blocks` = 2, `exp` = 2 | `num_queries` = 6, `embed_dim` = 64 |

Larger model sizes can be attained by increasing the number of bi-Mamba blocks, `num_blocks` (e.g., 8 bi-Mamba blocks yield 15M parameters).
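As a rough, back-of-the-envelope illustration (not a figure from the paper), the two data points quoted here can be interpolated to guess the size of intermediate configurations. The sketch assumes the parameter count grows linearly with `num_blocks` while every other hyperparameter stays fixed; the helper name `estimate_params_m` is hypothetical, and the quoted figures are rounded, so the output is only approximate.

```python
def estimate_params_m(num_blocks: int) -> float:
    """Estimate total parameters (in millions) by linear interpolation.

    Assumption: parameters grow linearly in num_blocks, anchored on the two
    figures quoted in the model card (2 blocks -> 4.1M, 8 blocks -> 15M).
    """
    blocks_a, params_a = 2, 4.1
    blocks_b, params_b = 8, 15.0
    per_block = (params_b - params_a) / (blocks_b - blocks_a)  # about 1.82M per block
    return params_a + per_block * (num_blocks - blocks_a)


if __name__ == "__main__":
    for n in (2, 4, 8):
        print(f"num_blocks={n}: ~{estimate_params_m(n):.1f}M parameters")
```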

📊 Results

TUAB (abnormal vs normal): 80.99 % Bal. Acc., 0.883 AUROC, 0.892 AUPR (LuMamba-Tiny, pre-trained with LeJEPA-reconstruction).
TUSL (slowing event vs. seizure detection): 0.708 AUROC, 0.289 AUPR (LuMamba-Tiny, pre-trained with reconstruction-only).
TUAR (artifact detection): 0.914 AUROC, 0.510 AUPR (LuMamba-Tiny, pre-trained with reconstruction-only).
APAVA (Alzheimer's detection): 0.955 AUROC, 0.970 AUPR (LuMamba-Tiny, pre-trained with LeJEPA-reconstruction).
TDBrain (Parkinson's detection): 0.961 AUROC, 0.960 AUPR (LuMamba-Tiny, pre-trained with LeJEPA-reconstruction).
Mumtaz2016 (Depression detection): 0.725 Bal. Acc., 0.931 AUROC, 0.952 AUPR (LuMamba-Tiny, pre-trained with LeJEPA-reconstruction).
SEED-V (5-class emotion detection): 0.350 Bal. Acc., 0.191 Cohen's Kappa (LuMamba-Tiny, pre-trained with reconstruction-only).
MoBI (gait prediction): 0.116 R-squared, 0.148 RMSE (LuMamba-Tiny, pre-trained with reconstruction-only).
MODMA (full 128-channel set): 59.47 % Bal. Acc., 0.448 AUROC, 0.420 AUPR (LuMamba-Tiny, pre-trained with reconstruction-only).
MODMA (reduced 13-channel subset): 59.09 % Bal. Acc., 0.522 AUROC, 0.4153 AUPR (LuMamba-Tiny, pre-trained with LeJEPA-reconstruction).
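For readers less familiar with two of the metrics reported above, the following is a minimal pure-Python sketch of balanced accuracy (the mean of per-class recall) and Cohen's kappa (agreement corrected for chance). These are the standard definitions; the card does not state which implementation was used for the reported numbers, and the function names and toy labels here are illustrative only.

```python
from collections import Counter


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall, so every class counts equally."""
    recalls = []
    for c in sorted(set(y_true)):
        idx = [i for i, t in enumerate(y_true) if t == c]
        correct = sum(1 for i in idx if y_pred[i] == c)
        recalls.append(correct / len(idx))
    return sum(recalls) / len(recalls)


def cohen_kappa(y_true, y_pred):
    """Chance-corrected agreement: (p_o - p_e) / (1 - p_e)."""
    n = len(y_true)
    p_o = sum(t == p for t, p in zip(y_true, y_pred)) / n  # observed agreement
    true_counts, pred_counts = Counter(y_true), Counter(y_pred)
    # Expected agreement if predictions were independent of the labels.
    p_e = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    return (p_o - p_e) / (1 - p_e)


y_true = [0, 0, 0, 0, 1, 1]  # toy labels, not real data
y_pred = [0, 0, 0, 1, 1, 0]
print(balanced_accuracy(y_true, y_pred))  # 0.625
print(round(cohen_kappa(y_true, y_pred), 3))  # 0.25
```

Balanced accuracy is useful on imbalanced EEG datasets because a classifier that always predicts the majority class scores only 1/K on a K-class task.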

Efficiency: Up to 377× fewer FLOPs than transformer-based baselines and support for up to 500× longer EEG windows, thanks to the efficient FEMBA bi-Mamba encoder.

🧠 Intended Use & Limitations

Intended use. Research on EEG representation learning & classification (abnormality, artifacts, slowing, emotion), especially when montages vary or channel counts are high.

Limitations.

Not a medical device. Do not use for clinical decisions without proper validation & regulatory clearance.
Unseen topologies: Zero-shot transfer to very different/dense layouts (e.g., SEED-V) can underperform SOTA despite positive scaling; consider augmenting pre-training montage diversity and spatial encodings.
Distribution shifts: Performance varies across cohorts, devices, and label protocols; validate locally and consider domain adaptation.

🏗️ Architecture & Training

LUNA Tokenizer & features. EEG is patch-segmented; temporal features via 1D conv w/ GroupNorm+GELU; frequency features (FFT mag/phase → MLP) are added; 3D electrode coordinates encoded via NeRF-style sinusoids → MLP (positional enc).
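The electrode positional encoding can be illustrated with a small NumPy sketch of NeRF-style sinusoids. This is not the repository's implementation: the number of frequency bands (`num_freqs`), the coordinate scaling, and the function name are assumptions, and the MLP that follows the sinusoids is omitted.

```python
import numpy as np


def sinusoidal_encode(coords, num_freqs=4):
    """Encode (C, 3) electrode coordinates into (C, 3 * 2 * num_freqs) features.

    Each coordinate x is mapped to sin(2^k * pi * x) and cos(2^k * pi * x)
    for k = 0 .. num_freqs - 1, as in NeRF positional encoding.
    num_freqs is an assumed value, not taken from the LuMamba config.
    """
    coords = np.asarray(coords, dtype=np.float64)
    freqs = (2.0 ** np.arange(num_freqs)) * np.pi        # (F,)
    angles = coords[:, :, None] * freqs[None, None, :]   # (C, 3, F)
    feats = np.concatenate([np.sin(angles), np.cos(angles)], axis=-1)  # (C, 3, 2F)
    return feats.reshape(coords.shape[0], -1)


# Toy example: 5 electrodes at random positions inside the unit cube.
rng = np.random.default_rng(0)
encoded = sinusoidal_encode(rng.uniform(-1.0, 1.0, size=(5, 3)))
print(encoded.shape)  # (5, 24)
```

Because the encoding depends only on an electrode's 3D position, two montages that share a physical location produce the same positional feature for it regardless of channel naming or ordering.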

LUNA Channel-Unification Module. Q learned queries cross-attend to channel-wise patch features to produce a fixed Q×E latent per patch; FFN + Transformer layers refine the query tokens. Complexity is O(Q·C) (linear in channels).
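A minimal NumPy sketch of the idea, assuming a single attention head with randomly initialised weights: Q query vectors attend over the C channel features of one patch and always return a Q×E latent, whatever C is. The function and variable names are illustrative, and the query projection, multi-head split, and FFN/Transformer refinement are left out.

```python
import numpy as np


def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)  # subtract max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def unify_channels(channel_feats, queries, w_k, w_v):
    """Single-head cross-attention from Q queries to C channel features.

    channel_feats: (C, E) features of one patch, one row per electrode
    queries:       (Q, E) learned query vectors (random in this sketch)
    w_k, w_v:      (E, E) key / value projections
    returns:       (Q, E) latent whose shape does not depend on C
    """
    keys = channel_feats @ w_k                            # (C, E)
    values = channel_feats @ w_v                          # (C, E)
    scores = queries @ keys.T / np.sqrt(keys.shape[-1])   # (Q, C): the O(Q*C) term
    attn = softmax(scores, axis=-1)                       # each query attends over channels
    return attn @ values                                  # (Q, E)


rng = np.random.default_rng(0)
Q, E = 6, 64  # num_queries and embed_dim of LuMamba_tiny
queries = rng.standard_normal((Q, E))
w_k = rng.standard_normal((E, E)) / np.sqrt(E)
w_v = rng.standard_normal((E, E)) / np.sqrt(E)

for num_channels in (16, 26, 62):  # channel counts of the unseen layouts listed earlier
    feats = rng.standard_normal((num_channels, E))
    print(num_channels, "->", unify_channels(feats, queries, w_k, w_v).shape)
```

Every iteration prints a `(6, 64)` latent. The score matrix has Q×C entries, which is where the linear-in-channels cost comes from, and the output is unchanged if the channel rows are permuted.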

FEMBA Bi-Mamba Temporal encoder. Mamba blocks process the embeddings in separate forward and backward streams.
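The two-stream idea can be sketched with a toy linear recurrence standing in for a Mamba block. This is a deliberate simplification: real Mamba layers use input-dependent (selective) parameters, gating, and a local convolution, and this card does not say how FEMBA merges the forward and backward streams, so the summation below is an assumption.

```python
import numpy as np


def linear_scan(x, a, b, c):
    """Run h_t = a * h_{t-1} + b * x_t, y_t = c * h_t over a (T, D) sequence."""
    x = np.asarray(x, dtype=np.float64)
    h = np.zeros(x.shape[1])
    ys = np.empty_like(x)
    for t in range(x.shape[0]):  # a single pass, so cost is linear in T
        h = a * h + b * x[t]
        ys[t] = c * h
    return ys


def bidirectional_scan(x, fwd_params, bwd_params):
    """Process the sequence in both time directions and sum the two streams."""
    x = np.asarray(x, dtype=np.float64)
    y_fwd = linear_scan(x, *fwd_params)
    y_bwd = linear_scan(x[::-1], *bwd_params)[::-1]  # reverse, scan, re-align in time
    return y_fwd + y_bwd  # merge rule is an assumption of this sketch


# An impulse in the middle of the sequence.
x = np.zeros((5, 1))
x[2, 0] = 1.0
params = (0.5, 1.0, 1.0)  # (a, b, c), arbitrary toy values
print(linear_scan(x, *params)[:, 0])               # only positions t >= 2 respond
print(bidirectional_scan(x, params, params)[:, 0])  # positions before t = 2 respond too
```

The forward-only scan cannot see the impulse before it occurs, while the bidirectional output carries context from both past and future samples at every position.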

Pre-training objectives. Masked-patch reconstruction trains the model to recover masked tokens from their unmasked context. In parallel, the LeJEPA loss encourages an isotropic Gaussian embedding distribution, which is intended to minimize downstream prediction risk.
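The reconstruction objective can be sketched as follows, assuming a mean-squared-error loss evaluated only on masked patches. The mask ratio, the zero placeholder used in place of a learned mask token, the choice of MSE, and the function name are all assumptions made for illustration; the LeJEPA term is not shown.

```python
import numpy as np


def masked_reconstruction_loss(patches, reconstruct, mask_ratio=0.5, seed=0):
    """Mean squared error computed only on the masked patches.

    patches:     (N, P) array of N patches with P samples each
    reconstruct: callable mapping the corrupted input (N, P) to a prediction (N, P)
    mask_ratio:  assumed value, not taken from the LuMamba config
    """
    rng = np.random.default_rng(seed)
    n = patches.shape[0]
    num_masked = max(1, int(round(mask_ratio * n)))
    mask = np.zeros(n, dtype=bool)
    mask[rng.choice(n, size=num_masked, replace=False)] = True

    corrupted = patches.copy()
    corrupted[mask] = 0.0  # stand-in for a learned mask token
    prediction = reconstruct(corrupted)
    error = (prediction[mask] - patches[mask]) ** 2  # unmasked patches are ignored
    return float(error.mean()), mask


patches = np.arange(12, dtype=np.float64).reshape(6, 2)  # toy data
loss_identity, mask = masked_reconstruction_loss(patches, lambda z: z)
loss_oracle, _ = masked_reconstruction_loss(patches, lambda z: patches)
print(loss_identity > 0.0, loss_oracle == 0.0)  # True True
```

A model that merely copies its corrupted input is penalised on every masked patch, so the loss can only be driven down by inferring the hidden content from the visible context.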

🔧 How to Use

LuMamba weights are organized by pre-training configuration:

Reconstruction-only → variants pre-trained with masked reconstruction exclusively
LeJEPA-reconstruction → variants pre-trained with a balanced mixture of masked reconstruction and LeJEPA losses. Variants exist for two different LeJEPA hyperparameters: 128 and 300 projection slices.
LeJEPA-only → variant pre-trained with LeJEPA exclusively.

All variants are pre-trained on TUEG.
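Since the checkpoints are referenced as `.safetensors` files (see `pretrained_safetensors_path` in the fine-tuning checklist), a downloaded variant can be inspected without loading any tensors. The sketch below relies only on the public safetensors file layout (an 8-byte little-endian header length followed by a JSON header) and the Python standard library; the function names are illustrative and nothing here is specific to LuMamba.

```python
import json
import struct
from math import prod


def read_safetensors_header(path):
    """Return the JSON header of a .safetensors file without loading tensors."""
    with open(path, "rb") as f:
        (header_len,) = struct.unpack("<Q", f.read(8))  # little-endian uint64
        return json.loads(f.read(header_len).decode("utf-8"))


def count_parameters(path):
    """Sum the element counts of every tensor listed in the header."""
    header = read_safetensors_header(path)
    return sum(
        prod(entry["shape"])
        for name, entry in header.items()
        if name != "__metadata__"  # optional free-form metadata, not a tensor
    )
```

Calling `count_parameters("/path/to/LuMamba_*.safetensors")` with a concrete file name gives a total that can be compared against the 4.1M quoted for LuMamba_tiny; a checkpoint may also contain buffers or task heads, so an exact match is not guaranteed.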

LuMamba experiments are driven by two Hydra configurations in BioFoundation/config/experiments:

LuMamba_finetune.yaml → configuration for fine-tuning experiments.
LuMamba_pretrain.yaml → configuration for pre-training experiments.

🔧 Fine-tuning — General Checklist

Install & read data prep: clone the BioFoundation repo, set up the environment as described there, then open make_datasets/README.md for dataset-specific notes (naming, expected folder layout, and common pitfalls).
Point to weights: set pretrained_safetensors_path: /path/to/LuMamba_*.safetensors in the experiment YAML.
Preprocess data: acquire the fine-tuning dataset and follow the preprocessing protocol (see the guide in make_datasets/README.md) to generate train/test/val.h5 files.
Update data module of LuMamba_finetune.yaml config:
- TUH datasets (TUAB/TUSL/TUAR) → change _target_ in /data_module: to datasets.tuh_dataset.TUH_Dataset.
- Other → change /data_module:_target_ to the corresponding dataset.py file in BioFoundation/datasets (e.g., for the TDBrain dataset use _target_: datasets.tdbrain_dataset.TDBrain_Dataset).
- HDF5 file location → change /data_module:hdf5_file for train, test, and val with the path to the corresponding HDF5 data split file.
Task settings:
- Task type: override with /task:finetune_task_LUNA for classification and /task:finetune_regression_task_LuMamba for regression tasks
- Classification type: set classification_type (bc, mcc) and model.num_classes to match your downstream task. In a regression scenario, mcc is used and model.num_classes describes the number of features in the output.
- Classifier choice: set /model:classifier_option (mamba for FEMBA classifier, linear for single-layer linear classifier, null for default LUNA classifier).
- Configuration file includes further #CHANGEME tags and instructions for a working example.
Env vars: export DATA_PATH (dataset root) and CHECKPOINT_DIR (artifacts).
Trainer/optimizer: adjust gpus/devices, batch_size, max_epochs, LR/scheduler if needed.
I/O: set io.base_output_path and confirm io.checkpoint_dirpath exists.
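Several items in the checklist above can be verified before launching a run. The following standard-library sketch checks that the two environment variables are set and that the three HDF5 splits exist in one directory. It is a hypothetical helper, not part of BioFoundation, and it assumes the splits are named `train.h5`, `val.h5`, and `test.h5`; adjust the names if your preprocessing produces something different.

```python
import os
from pathlib import Path

REQUIRED_ENV = ("DATA_PATH", "CHECKPOINT_DIR")
SPLITS = ("train", "val", "test")  # assumed file stems, see note above


def preflight(hdf5_dir, env=None):
    """Return a list of human-readable problems; an empty list means ready."""
    env = os.environ if env is None else env
    problems = [
        f"environment variable {name} is not set"
        for name in REQUIRED_ENV
        if not env.get(name)
    ]
    for split in SPLITS:
        path = Path(hdf5_dir) / f"{split}.h5"
        if not path.is_file():
            problems.append(f"missing HDF5 split: {path}")
    return problems


if __name__ == "__main__":
    for problem in preflight(os.environ.get("DATA_PATH", ".")):
        print("WARNING:", problem)
```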

To launch fine-tuning (Hydra):

python -u run_train.py +experiment=LuMamba_finetune
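Hydra also accepts `key=value` overrides on the command line, which can be convenient for sweeping a setting without editing the YAML. The sketch below only assembles the argument list; the override key paths shown are hypothetical and must be checked against the actual structure of LuMamba_finetune.yaml.

```python
import shlex


def build_command(experiment="LuMamba_finetune", overrides=None):
    """Build the argv list for run_train.py with optional Hydra overrides."""
    argv = ["python", "-u", "run_train.py", f"+experiment={experiment}"]
    for key, value in (overrides or {}).items():
        argv.append(f"{key}={value}")  # Hydra's key=value override syntax
    return argv


cmd = build_command(overrides={
    # Hypothetical key paths; check LuMamba_finetune.yaml for the real ones.
    "model.num_classes": 2,
    "classification_type": "bc",
})
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` from the repository root.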

⚖️ Responsible AI, Risks & Biases

Clinical safety: research-only; human oversight required.
Bias & drift: montage/device/population differences can induce shifts; validate and monitor.
Artifacts & rare events: robustness varies; use QC and task-appropriate preprocessing.

🔗 Sources

Code: https://github.com/pulp-bio/BioFoundation
Paper: LuMamba: Latent Unified Mamba for Electrode Topology-Invariant and Efficient EEG Modeling (arXiv:2603.19100)

📜 Citation

If you use LuMamba, please cite:

@misc{broustail2026lumambalatentunifiedmamba,
      title={LuMamba: Latent Unified Mamba for Electrode Topology-Invariant and Efficient EEG Modeling}, 
      author={Danaé Broustail and Anna Tegon and Thorir Mar Ingolfsson and Yawei Li and Luca Benini},
      year={2026},
      eprint={2603.19100},
      archivePrefix={arXiv},
      primaryClass={cs.AI},
      url={https://arxiv.org/abs/2603.19100}, 
}

🛠️ Maintenance & Contact

Issues & support: please open a GitHub issue in the BioFoundation repository.

🔗 Related Models

LUNA — Transformer-based topology-agnostic EEG foundation model (NeurIPS 2025). Source of the channel-unification cross-attention module that LuMamba reuses.
FEMBA — Bidirectional Mamba foundation model for EEG. Source of the linear-complexity temporal backbone that LuMamba reuses.
TinyMyo — Tiny foundation model for flexible EMG signal processing at the edge.

🗒️ Changelog

v1.0: Initial release of LuMamba model card with task-specific checkpoints and instructions.
