MIRAGE: Adaptive Multimodal Gating for Whole-Brain fMRI Encoding

MIRAGE (Multimodal Integration with Representation-Adaptive Gated Encoding) is a brain encoder trained on the Algonauts 2025 challenge dataset. It takes multimodal hidden-state features extracted from Qwen3-Omni-30B-A3B-Thinking and predicts BOLD fMRI responses for 1,000 cortical parcels over windows of 100 TRs.

Model Description

Video / Audio / Transcript
  -> Qwen3-Omni-30B-A3B-Thinking hidden states
  -> per-modality layer pooler (24 learned queries)
  -> linear projectors
  -> temporal Transformer (8 layers, 8 heads, hidden dim 3072)
  -> subject_linear readout
  -> 100 TRs x 1,000 parcels
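
To get a feel for where the capacity sits in the pipeline above, the sketch below does back-of-envelope arithmetic from the stated dimensions. It assumes a standard Transformer encoder layer (four d x d attention projections plus a feed-forward block of width 4d, ignoring biases and norms) and a dense per-subject linear readout with bias. Neither detail is confirmed here, and `EncoderDims` is a hypothetical name used only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EncoderDims:
    """Dimensions quoted in the model description (hypothetical container)."""

    n_layers: int = 8
    n_heads: int = 8
    hidden_dim: int = 3072
    n_parcels: int = 1000
    n_subjects: int = 4
    n_trs: int = 100

    @property
    def head_dim(self) -> int:
        # Width of each attention head when the hidden dim is split evenly.
        return self.hidden_dim // self.n_heads

    def transformer_params(self, ffn_mult: int = 4) -> int:
        # ASSUMPTION: standard encoder layer, weights only (no biases or norms).
        d = self.hidden_dim
        attention = 4 * d * d               # Q, K, V and output projections
        feed_forward = 2 * ffn_mult * d * d  # up- and down-projection
        return self.n_layers * (attention + feed_forward)

    def readout_params(self) -> int:
        # ASSUMPTION: one dense hidden_dim -> n_parcels layer (with bias) per subject.
        per_subject = self.hidden_dim * self.n_parcels + self.n_parcels
        return self.n_subjects * per_subject


dims = EncoderDims()
print("per-head dim:", dims.head_dim)
print("temporal Transformer weights (approx.):", dims.transformer_params())
print("subject readout weights (approx.):", dims.readout_params())
print("predicted values per window:", dims.n_trs * dims.n_parcels)
```

Under these assumptions the temporal Transformer holds roughly 0.9B weights and the four subject readouts add about 12M, so nearly all of the estimated capacity sits in the temporal Transformer.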

Training

Hyperparameter	Value
Dataset	Algonauts 2025 (Friends TV + Movie10)
Subjects	sub-01, sub-02, sub-03, sub-05
Val split	Friends season 6 hold-out
Epochs	15
Batch size	16
Optimizer	AdamW (lr=0.0001)
LR schedule	OneCycleLR
Mixed precision	16-mixed
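
A one-cycle schedule warms the learning rate up to a peak and then anneals it far below the starting value. The sketch below reproduces that shape in plain Python with cosine annealing. It assumes that the 0.0001 in the table is the peak rate and borrows the default shape parameters of PyTorch's `OneCycleLR` (`pct_start=0.3`, `div_factor=25`, `final_div_factor=1e4`); the settings actually used in training are not stated here, and `one_cycle_lr` is a hypothetical helper.

```python
import math


def one_cycle_lr(step, total_steps, max_lr=1e-4,
                 pct_start=0.3, div_factor=25.0, final_div_factor=1e4):
    """Learning rate at `step` (0-based) for a cosine one-cycle schedule.

    ASSUMPTION: shape parameters are PyTorch's OneCycleLR defaults,
    not values confirmed for MIRAGE.
    """
    initial_lr = max_lr / div_factor
    min_lr = initial_lr / final_div_factor
    warmup_end = pct_start * total_steps - 1  # last step of the warm-up phase
    last_step = total_steps - 1

    def cos_anneal(start, end, pct):
        # Moves smoothly from `start` (pct=0) to `end` (pct=1).
        return end + (start - end) / 2.0 * (math.cos(math.pi * pct) + 1.0)

    if step <= warmup_end:
        return cos_anneal(initial_lr, max_lr, step / warmup_end)
    return cos_anneal(max_lr, min_lr,
                      (step - warmup_end) / (last_step - warmup_end))


total_steps = 100  # illustrative; the real value is epochs x batches per epoch
schedule = [one_cycle_lr(s, total_steps) for s in range(total_steps)]
print(f"start={schedule[0]:.2e} peak={max(schedule):.2e} end={schedule[-1]:.2e}")
```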

Evaluation

MIRAGE results on the Algonauts 2025 CNeuroMod splits. Values are mean Pearson r across the four trained subjects. Friends s06 is the held-out validation split used during development; Friends s07 is the held-out in-distribution benchmark; OOD is the held-out movie benchmark.

Model	Friends s06 eval	Friends s07 held-out in-dist eval	OOD eval	Notes
MIRAGE single model	0.319	0.310	0.217	Hugging Face checkpoint
MIRAGE 15-member ensemble	0.335	0.323	0.227	Algonauts 2025 final submission ensemble
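
For readers unfamiliar with the metric, the sketch below computes Pearson r between predicted and measured time series for each parcel and averages over parcels. The per-parcel-then-average order is an assumption made for illustration; the text above only states that the reported values are means across subjects. The function names are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def mean_parcel_r(pred, true):
    """Average per-parcel correlation.

    `pred` and `true` are lists of time series, one per parcel.
    ASSUMPTION: parcels are scored independently and then averaged.
    """
    scores = [pearson_r(p, t) for p, t in zip(pred, true)]
    return sum(scores) / len(scores)


# Toy data: two parcels, three TRs each.
predicted = [[1.0, 2.0, 3.0], [1.0, 2.0, 3.0]]
measured = [[2.0, 4.0, 6.0], [3.0, 2.0, 1.0]]
print(mean_parcel_r(predicted, measured))
```

In the toy data the first parcel is perfectly correlated and the second perfectly anti-correlated, so the two scores cancel in the average.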

Per-subject Pearson r on the OOD test set:

Subject	Pearson r
sub-01	0.244
sub-02	0.210
sub-03	0.235
sub-05	0.179
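
As a quick consistency check on the numbers above, the following snippet averages the per-subject OOD scores and computes the ensemble's gain over the single model on each split. The per-subject mean equals the single-model OOD score, which suggests (but does not confirm) that the per-subject table reports the single model.

```python
from statistics import mean

# Values copied from the tables above.
ood_per_subject = {"sub-01": 0.244, "sub-02": 0.210, "sub-03": 0.235, "sub-05": 0.179}
single = {"Friends s06": 0.319, "Friends s07": 0.310, "OOD": 0.217}
ensemble = {"Friends s06": 0.335, "Friends s07": 0.323, "OOD": 0.227}

# Mean over the four subjects, rounded to the precision of the table.
ood_mean = round(mean(list(ood_per_subject.values())), 3)

# Absolute improvement of the 15-member ensemble on each split.
gains = {split: round(ensemble[split] - single[split], 3) for split in single}

print("mean per-subject OOD r:", ood_mean)
print("ensemble gain per split:", gains)
```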

Usage

git clone https://github.com/epflneuroailab/mirage
cd mirage
pip install -e .

python -m brain_enc.cli.infer_fmri \
  --video /path/to/video.mp4 \
  --transcript /path/to/transcript.json \
  --run-dir /path/to/downloaded/hf/files \
  --subject-idx 0 \
  --output fmri_predictions.npy
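
To predict for every trained subject, the command above can be assembled programmatically. The sketch below builds the argument list for each subject index using only the flags shown above. It assumes that indices 0 to 3 follow the order in which the subjects are listed in the training table, which is not confirmed here; `build_infer_command` and the per-subject output names are hypothetical.

```python
import shlex

# ASSUMPTION: --subject-idx follows the order of the training table.
SUBJECTS = ["sub-01", "sub-02", "sub-03", "sub-05"]


def build_infer_command(subject_idx, video, transcript, run_dir):
    """Return the inference command for one subject as an argument list."""
    if not 0 <= subject_idx < len(SUBJECTS):
        raise ValueError(f"subject_idx must be in 0..{len(SUBJECTS) - 1}")
    output = f"fmri_predictions_{SUBJECTS[subject_idx]}.npy"  # hypothetical naming
    return [
        "python", "-m", "brain_enc.cli.infer_fmri",
        "--video", video,
        "--transcript", transcript,
        "--run-dir", run_dir,
        "--subject-idx", str(subject_idx),
        "--output", output,
    ]


for idx in range(len(SUBJECTS)):
    cmd = build_infer_command(
        idx,
        "/path/to/video.mp4",
        "/path/to/transcript.json",
        "/path/to/downloaded/hf/files",
    )
    print(shlex.join(cmd))
```

Each list can be passed to `subprocess.run(cmd, check=True)` to execute it; the loop here only prints the commands.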

For direct loading, download model.safetensors and config.yaml from epfl-neuroai/mirage, build the configured brain_enc model, and load weights with load_model_state(model, "model.safetensors").
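
The download step for direct loading could look like the sketch below, which fetches the two files with `hf_hub_download` from the `huggingface_hub` package. It is a minimal sketch, not the repository's own loader: `download_checkpoint` is a hypothetical helper, and building the model and calling `load_model_state` are left as comments because their import paths inside `brain_enc` are not given here.

```python
REPO_ID = "epfl-neuroai/mirage"
FILENAMES = ("model.safetensors", "config.yaml")


def download_checkpoint(repo_id=REPO_ID, filenames=FILENAMES):
    """Fetch the checkpoint files and return {filename: local_path}.

    The import is done lazily so this module can be imported without
    huggingface_hub installed (install with `pip install huggingface_hub`).
    """
    from huggingface_hub import hf_hub_download

    return {
        name: hf_hub_download(repo_id=repo_id, filename=name)
        for name in filenames
    }


# Typical use (requires network access):
#   paths = download_checkpoint()
#   ... build the configured brain_enc model from paths["config.yaml"] ...
#   load_model_state(model, paths["model.safetensors"])
```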

Limitations

Predictions are conditioned on one of the four trained Algonauts 2025 subjects.
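Performance is expected to be strongest on Friends-style narrative video; in the evaluation above, the single model drops from 0.310 on Friends s07 to 0.217 on the OOD movies.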
Full raw-video extraction requires the Qwen3-Omni feature backbone and a large GPU.

Citation

@misc{gokce2026mirage,
  title = {MIRAGE: Adaptive Multimodal Gating for Whole-Brain fMRI Encoding},
  author = {Gokce, Abdulkadir and AlKhamissi, Badr and Schrimpf, Martin},
  year = {2026},
  eprint = {2605.29850},
  archivePrefix = {arXiv},
  primaryClass = {cs.LG},
  url = {https://arxiv.org/abs/2605.29850}
}

Model size

1B params, stored as F32 tensors in Safetensors format.

Paper: arXiv 2605.29850, published May 28.