MIRAGE: Adaptive Multimodal Gating for Whole-Brain fMRI Encoding


MIRAGE (Multimodal Integration with Representation-Adaptive Gated Encoding) is a brain encoder trained on the Algonauts 2025 challenge dataset. It takes multimodal hidden-state features extracted from Qwen3-Omni-30B-A3B-Thinking and predicts BOLD fMRI responses for 1,000 cortical parcels, in windows of 100 TRs.

Model Description

Video / Audio / Transcript
  -> Qwen3-Omni-30B-A3B-Thinking hidden states
  -> per-modality layer pooler (24 learned queries)
  -> linear projectors
  -> temporal Transformer (8 layers, 8 heads, hidden dim 3072)
  -> subject_linear readout
  -> 100 TRs x 1,000 parcels
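
As a rough sanity check on the dimensions above, the following sketch estimates the weight count of the temporal Transformer and the subject readout. It assumes a standard Transformer layer (four d x d attention projections and a feed-forward block with a 4x expansion), ignores biases, norms, poolers, and projectors, and assumes the readout maps the 3072-dimensional hidden state directly to 1,000 parcels for each of the four subjects. None of these details are confirmed by the model card; FFN_MULT and the helper name are hypothetical.

```python
HIDDEN_DIM = 3072   # from the pipeline above
N_LAYERS = 8        # from the pipeline above
N_PARCELS = 1000    # from the pipeline above
N_SUBJECTS = 4      # sub-01, sub-02, sub-03, sub-05
FFN_MULT = 4        # ASSUMPTION: feed-forward width is 4x the hidden dim


def transformer_layer_params(d: int, ffn_mult: int) -> int:
    """Weight count of one standard Transformer layer (biases and norms ignored)."""
    attention = 4 * d * d                 # Q, K, V and output projections
    feed_forward = 2 * d * (ffn_mult * d) # up- and down-projection
    return attention + feed_forward


transformer_params = N_LAYERS * transformer_layer_params(HIDDEN_DIM, FFN_MULT)
# ASSUMPTION: one linear map from hidden state to parcels per subject.
readout_params = N_SUBJECTS * HIDDEN_DIM * N_PARCELS
total_params = transformer_params + readout_params

print(f"temporal Transformer: {transformer_params:,}")
print(f"subject readout:      {readout_params:,}")
print(f"total (partial):      {total_params:,}")
```

Under these assumptions the estimate lands at roughly 0.9B weights, which is in line with the "1B params" model size reported for the checkpoint.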

Training

Hyperparameter    Value
Dataset           Algonauts 2025 (Friends TV + Movie10)
Subjects          sub-01, sub-02, sub-03, sub-05
Val split         Friends season 6 hold-out
Epochs            15
Batch size        16
Optimizer         AdamW (lr=0.0001)
LR schedule       OneCycleLR
Mixed precision   16-mixed
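
To illustrate the shape of a one-cycle schedule, here is a minimal pure-Python sketch of a cosine one-cycle policy. It assumes the listed lr of 0.0001 is the peak learning rate and uses the default values of PyTorch's OneCycleLR (pct_start=0.3, div_factor=25, final_div_factor=1e4, cosine annealing); whether MIRAGE used these defaults is not stated, and total_steps below is a hypothetical number chosen only for the example.

```python
import math


def one_cycle_lr(step, total_steps, max_lr=1e-4,
                 pct_start=0.3, div_factor=25.0, final_div_factor=1e4):
    """Learning rate at `step` (0-based) under a two-phase cosine one-cycle policy."""
    initial_lr = max_lr / div_factor
    min_lr = initial_lr / final_div_factor
    warmup_end = pct_start * total_steps - 1  # step at which the peak is reached
    last_step = total_steps - 1

    def cos_anneal(start, end, pct):
        # Moves smoothly from `start` (pct=0) to `end` (pct=1).
        return end + (start - end) / 2.0 * (math.cos(math.pi * pct) + 1.0)

    if step <= warmup_end:
        return cos_anneal(initial_lr, max_lr, step / warmup_end)
    return cos_anneal(max_lr, min_lr, (step - warmup_end) / (last_step - warmup_end))


TOTAL_STEPS = 1000  # hypothetical; the real value depends on dataset size and epochs
for s in (0, 299, 650, 999):
    print(s, one_cycle_lr(s, TOTAL_STEPS))
```

The rate starts at max_lr / 25, rises to the peak after 30% of the steps, and then decays to a value far below the starting rate by the final step.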

Evaluation

MIRAGE results on the Algonauts 2025 CNeuroMod splits. Values are mean Pearson r across the four trained subjects. Friends s06 is the held-out validation split used during development; Friends s07 is the held-out in-distribution benchmark; OOD is the held-out movie benchmark.

Model                      Friends s06 eval  Friends s07 held-out in-dist eval  OOD eval  Notes
MIRAGE single model        0.319             0.310                              0.217     Hugging Face checkpoint
MIRAGE 15-member ensemble  0.335             0.323                              0.227     Algonauts 2025 final submission ensemble
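
The scores above are Pearson correlations between predicted and measured parcel time series. A minimal sketch of such a metric follows, assuming r is computed per parcel across TRs and then averaged over parcels. The aggregation order and the function names are illustrative assumptions, not the challenge's scoring code.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # constant series: correlation undefined, treated as 0 here
    return cov / math.sqrt(var_x * var_y)


def mean_parcel_r(pred, target):
    """pred, target: lists of TR rows, each row a list of per-parcel values."""
    n_parcels = len(pred[0])
    scores = [
        pearson_r([row[p] for row in pred], [row[p] for row in target])
        for p in range(n_parcels)
    ]
    return sum(scores) / n_parcels


# Toy data: 3 TRs x 2 parcels.
pred = [[1.0, 1.0], [2.0, 2.0], [3.0, 3.0]]
target = [[2.0, 3.0], [4.0, 2.0], [6.0, 1.0]]
print(mean_parcel_r(pred, target))
```

In the toy data the first parcel is perfectly correlated and the second perfectly anti-correlated, so the two per-parcel scores cancel in the mean.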

Per-subject Pearson r on the OOD test set:

Subject  Pearson r
sub-01   0.244
sub-02   0.210
sub-03   0.235
sub-05   0.179
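
A quick arithmetic check shows how the per-subject values relate to the summary table: their unweighted mean is 0.217, which matches the single-model OOD score. This suggests, though the card does not state it, that the per-subject numbers refer to the single model rather than the ensemble.

```python
# Per-subject OOD Pearson r, copied from the table above.
per_subject_r = {
    "sub-01": 0.244,
    "sub-02": 0.210,
    "sub-03": 0.235,
    "sub-05": 0.179,
}

mean_r = sum(per_subject_r.values()) / len(per_subject_r)
print(f"{mean_r:.3f}")  # 0.217
```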

Usage

git clone https://github.com/epflneuroailab/mirage
cd mirage
pip install -e .

python -m brain_enc.cli.infer_fmri \
  --video /path/to/video.mp4 \
  --transcript /path/to/transcript.json \
  --run-dir /path/to/downloaded/hf/files \
  --subject-idx 0 \
  --output fmri_predictions.npy
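
To produce predictions for every trained subject, the command above can be assembled once per subject index. The sketch below only builds and prints the command lines. It uses the flags shown above and nothing else; the helper build_infer_command is hypothetical, and the mapping of indices 0 to 3 onto sub-01, sub-02, sub-03, sub-05 in that order is an assumption that should be checked against the repository.

```python
import shlex

# ASSUMPTION: --subject-idx i selects the i-th subject in this order.
SUBJECTS = ["sub-01", "sub-02", "sub-03", "sub-05"]


def build_infer_command(video, transcript, run_dir, subject_idx, output):
    """Return the inference command as an argument list (hypothetical helper)."""
    if not 0 <= subject_idx < len(SUBJECTS):
        raise ValueError(f"subject_idx must be in 0..{len(SUBJECTS) - 1}")
    return [
        "python", "-m", "brain_enc.cli.infer_fmri",
        "--video", video,
        "--transcript", transcript,
        "--run-dir", run_dir,
        "--subject-idx", str(subject_idx),
        "--output", output,
    ]


commands = [
    build_infer_command(
        "/path/to/video.mp4",
        "/path/to/transcript.json",
        "/path/to/downloaded/hf/files",
        idx,
        f"fmri_predictions_{name}.npy",
    )
    for idx, name in enumerate(SUBJECTS)
]

for cmd in commands:
    print(shlex.join(cmd))
```

Each printed line can be run in a shell, or the argument lists can be passed to subprocess.run directly.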

For direct loading, download model.safetensors and config.yaml from epfl-neuroai/mirage, build the configured brain_enc model, and load weights with load_model_state(model, "model.safetensors").

Limitations

  • Predictions are conditioned on one of the four trained Algonauts 2025 subjects.
  • Performance is expected to be strongest on Friends-style narrative video; the single model scores 0.310 on held-out Friends s07 versus 0.217 on the OOD movies.
  • Full raw-video extraction requires the Qwen3-Omni feature backbone and a large GPU.

Citation

@misc{gokce2026mirageadaptivemultimodalgating,
  title = {MIRAGE: Adaptive Multimodal Gating for Whole-Brain fMRI Encoding},
  author = {Gokce, Abdulkadir and AlKhamissi, Badr and Schrimpf, Martin},
  year = {2026},
  eprint = {2605.29850},
  archivePrefix = {arXiv},
  primaryClass = {cs.LG},
  url = {https://arxiv.org/abs/2605.29850}
}

Model size: 1B params (Safetensors, tensor type F32)