arxiv:2606.19651

BrainG3N: A Dual-Purpose Tokenizer for Controllable 3D Brain MRI Generation

Published on Jun 17 · Submitted by Max Van Puyvelde on Jun 22

Abstract

A 3D brain MRI generative model uses a masked-autoencoder tokenizer to create clinically informative embeddings that support both medical task performance and controlled image generation.

Three-dimensional (3D) brain MRI is central to clinical neurology and neuro-oncology, where generative models could augment under-represented cohorts, simulate disease trajectories, and support privacy-preserving data sharing. Latent diffusion has been the go-to solution for modeling imaging data, but it places two competing demands on the tokenizer: encoder embeddings must retain the clinical information that downstream tasks act on, and the decoder must reconstruct anatomically faithful volumes. Existing reconstruction-driven tokenizers achieve the second at the expense of the first. To address this, we introduce a fully volumetric masked-autoencoder (MAE) based tokenizer for 3D brain MRI latent diffusion, decoupling encoder and decoder: a frozen 3D MAE encoder produces clinically informative embeddings, while a dedicated CNN decoder reconstructs voxels from a linear projection of those embeddings. We pretrain the encoder on 35,309 volumes from 18 public cohorts spanning four modalities, ten disease categories, and 200+ acquisition sites, and demonstrate its dual utility in two settings. First, on a 23-task linear-probing benchmark, the encoder outperforms or matches SOTA models (i.e., BrainIAC, BrainSegFounder, and MedicalNet) on 21 of 23 tasks. Second, a conditional diffusion transformer (DiT) trained on these clinically informative embeddings supports both conditional generation across six variables and patient-specific longitudinal forecasting. Together these results establish a single 3D brain-MRI embedding space capable of both downstream clinical tasks and controllable generation.
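To make the "fully volumetric masked-autoencoder" idea concrete, here is a minimal sketch of how a 3D volume could be split into non-overlapping patch tokens and randomly masked during MAE pretraining. The volume size (96³), patch size (16³), and mask ratio (0.75) are illustrative assumptions, not values stated in the abstract, and the helper names `patch_grid` and `random_mask` are hypothetical.

```python
import random

def patch_grid(volume_shape, patch_size):
    """Number of non-overlapping cubic patches along each axis of a 3D volume."""
    if any(dim % patch_size != 0 for dim in volume_shape):
        raise ValueError("each volume dimension must be divisible by patch_size")
    return tuple(dim // patch_size for dim in volume_shape)

def random_mask(num_patches, mask_ratio, seed=0):
    """Split patch indices into (visible, masked) lists, MAE-style."""
    rng = random.Random(seed)
    indices = list(range(num_patches))
    rng.shuffle(indices)
    num_masked = int(num_patches * mask_ratio)
    masked = sorted(indices[:num_masked])
    visible = sorted(indices[num_masked:])
    return visible, masked

# Assumed settings for illustration only (not taken from the paper).
grid = patch_grid((96, 96, 96), 16)
num_patches = grid[0] * grid[1] * grid[2]
visible, masked = random_mask(num_patches, mask_ratio=0.75)
```

In an MAE, only the visible patches are passed through the encoder during pretraining, and a lightweight decoder is asked to reconstruct the masked ones. According to the abstract, the pretrained encoder is then frozen and a separate CNN decoder reconstructs voxels from a linear projection of its embeddings.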

Community

BrainG3N is a controllable generator for 3D brain MRI built on top of a strong self-supervised foundation encoder. A frozen 3D MAE encoder (pretrained on 35,309 volumes across 18 cohorts, 4 modalities, 200+ sites) produces clinically informative embeddings; a conditional flow-matching DiT then generates new scans directly in that space, and a fine-tuned CNN decoder maps them back to voxels.
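The flow-matching step can be illustrated with a toy example. The sketch below assumes a straight-line (rectified-flow style) path between noise and a target embedding, which is a common choice but is not confirmed by the post. It also replaces the conditional DiT with an oracle velocity function and omits conditioning variables entirely. The 4-dimensional "embedding" is made up.

```python
import random

def interpolate(x0, x1, t):
    """Point on the straight path from noise x0 (t=0) to data x1 (t=1)."""
    return [(1.0 - t) * a + t * b for a, b in zip(x0, x1)]

def target_velocity(x0, x1):
    """Regression target for the velocity network on a linear path."""
    return [b - a for a, b in zip(x0, x1)]

def euler_sample(x0, velocity_fn, num_steps):
    """Integrate dx/dt = velocity_fn(x, t) from t=0 to t=1 with Euler steps."""
    x = list(x0)
    dt = 1.0 / num_steps
    for step in range(num_steps):
        t = step * dt
        v = velocity_fn(x, t)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x

rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in range(4)]
embedding = [0.5, -1.0, 2.0, 0.0]  # made-up stand-in for an encoder embedding

def oracle_velocity(x, t):
    # Stand-in for a trained network: the exact velocity for this single pair.
    return target_velocity(noise, embedding)

sample = euler_sample(noise, oracle_velocity, num_steps=10)
```

During training, a network would be fit to predict `target_velocity(x0, x1)` at the point `interpolate(x0, x1, t)` for random `t`, given the conditioning variables. At sampling time, the learned velocity replaces the oracle, and the resulting embedding is handed to the decoder.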

Because generation happens in a clinically grounded latent space, the synthetic scans actually carry the conditioned attributes:

  • Controllable along disease, age, sex, modality, site, and IDH1 status, plus patient-specific longitudinal forecasting.
  • Clinical probes trained on real data recover the requested attribute from generated scans (age: Pearson r=0.93).
  • The same frozen encoder is a strong foundation model in its own right: with no fine-tuning, it outperforms or matches BrainIAC, BrainSegFounder, and MedicalNet on 21 of 23 linear-probing tasks (IDH1 AUC 0.937, brain-age MAE 4.43 years).
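The two regression metrics quoted above, Pearson r and mean absolute error (MAE), can be computed as in the following sketch. The age values are invented purely to exercise the functions and are not results from the paper. The names `requested_age` and `predicted_age` are hypothetical stand-ins for the conditioning value given to the generator and the value a probe predicts from the generated scan.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def mean_absolute_error(targets, predictions):
    """Average absolute difference between targets and predictions."""
    return sum(abs(t - p) for t, p in zip(targets, predictions)) / len(targets)

# Invented example values in years (not data from the paper).
requested_age = [20.0, 35.0, 50.0, 65.0, 80.0]
predicted_age = [24.0, 33.0, 52.0, 61.0, 79.0]

r = pearson_r(requested_age, predicted_age)
mae = mean_absolute_error(requested_age, predicted_age)
```

A high r indicates that generated scans vary with the requested attribute in the expected direction, while MAE reports the typical size of the error in the attribute's own units.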

Useful for augmenting under-represented cohorts, simulating counterfactual disease trajectories ("what would this patient look like under disease X"), and sharing privacy-preserving synthetic data.

📄 https://arxiv.org/abs/2606.19651 — model, code, and synthetic dataset coming soon.

Feel free to reach out! 🤗
