Brain LDM autoencoder (3D KL-VAE; latent-diffusion backbone) -- Brain LDM autoencoder v1 (autoencoder.pt)

Description

The image-latent compressor sub-network of the MONAI Brain LDM release (Pinaya et al., MICCAI Workshop 2022). It is a 3D KL-regularized VAE that maps T1w brain MRI volumes into a 3-channel latent space at 8x spatial downsampling per axis. This is the latent space in which the conditional diffusion U-Net of the composite Brain LDM operates.

Architecture (MONAI-canonical): spatial_dims=3, in/out channels=1, latent_channels=3, channel ladder (64, 128, 128, 128), 2 residual blocks per level, GroupNorm with eps=1e-6, no attention (pure ResNet).

Trained on 31,740 UK Biobank T1w MRIs at 1 mm isotropic resolution, intensity-normalised to [0, 1]. Shipped as its own bundle so that future 3D latent-diffusion models for medical imaging can reuse the same VAE backbone.
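To make the geometry above concrete, here is a minimal sketch that derives the latent shape and the compression ratio from the quoted hyperparameters. It does not touch the model itself. The names `VAEGeometry`, `latent_shape` and `compression_ratio` are hypothetical helpers, the example volume size is an illustrative assumption, and the sketch assumes one 2x downsampling step between consecutive levels of the channel ladder (which reproduces the stated 8x).

```python
from dataclasses import dataclass
from math import prod


@dataclass(frozen=True)
class VAEGeometry:
    """Shape-relevant hyperparameters quoted in the description (hypothetical helper)."""
    in_channels: int = 1
    latent_channels: int = 3
    channels: tuple = (64, 128, 128, 128)  # channel ladder, one entry per level

    @property
    def downsample_factor(self) -> int:
        # Assumption: one 2x downsampling between consecutive levels,
        # so 4 levels give 2**3 = 8, matching the stated 8x.
        return 2 ** (len(self.channels) - 1)


def latent_shape(spatial_shape, geom=VAEGeometry()):
    """Return (C, D, H, W) of the latent for a given input spatial shape."""
    f = geom.downsample_factor
    if any(s % f for s in spatial_shape):
        raise ValueError(f"spatial dims {spatial_shape} must be multiples of {f}")
    return (geom.latent_channels, *(s // f for s in spatial_shape))


def compression_ratio(spatial_shape, geom=VAEGeometry()):
    """Input voxel count divided by latent element count."""
    n_in = geom.in_channels * prod(spatial_shape)
    n_latent = prod(latent_shape(spatial_shape, geom))
    return n_in / n_latent


# Illustrative volume size (an assumption, not a value stated in this card).
shape = (160, 224, 160)
print(latent_shape(shape))                 # (3, 20, 28, 20)
print(round(compression_ratio(shape), 1))  # 170.7
```

With a single input channel and three latent channels, the ratio is 8^3 / 3 for any valid input size, so the latent holds roughly 170 times fewer values than the input volume.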

Intended use

Research tool / prototype. Encode a 3D T1-weighted brain MRI volume (1 mm isotropic, intensities in [0, 1], each spatial dimension a multiple of 8) into a 3-channel latent representation, or decode a latent back to image space. Also intended as a standalone, reusable VAE backbone for future 3D latent-diffusion ports.
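The input constraints above (spatial dimensions a multiple of 8, intensities in [0, 1]) can be checked before encoding. The following is a minimal pure-Python sketch, not part of ilex or MONAI; `pad_widths_to_multiple`, `padded_shape` and `check_intensity_range` are hypothetical helpers. Symmetric zero-padding is an assumption here; cropping may suit some data better.

```python
def pad_widths_to_multiple(spatial_shape, multiple=8):
    """Per-axis (before, after) padding that rounds each dim up to a multiple."""
    widths = []
    for size in spatial_shape:
        extra = (-size) % multiple   # voxels missing to reach the next multiple
        before = extra // 2          # split the padding as evenly as possible
        widths.append((before, extra - before))
    return widths


def padded_shape(spatial_shape, multiple=8):
    """Spatial shape after applying the padding computed above."""
    widths = pad_widths_to_multiple(spatial_shape, multiple)
    return tuple(s + b + a for s, (b, a) in zip(spatial_shape, widths))


def check_intensity_range(min_value, max_value, lo=0.0, hi=1.0):
    """Return True if the observed intensity range lies inside [lo, hi]."""
    return lo <= min_value and max_value <= hi


# Illustrative shape (an assumption): a 182 x 218 x 182 volume.
print(pad_widths_to_multiple((182, 218, 182)))  # [(1, 1), (3, 3), (1, 1)]
print(padded_shape((182, 218, 182)))            # (184, 224, 184)
print(check_intensity_range(0.0, 1.0))          # True
```

The list of (before, after) pairs follows the `pad_width` convention of `numpy.pad` and `jax.numpy.pad`, so it can be passed to either after prepending `(0, 0)` entries for any batch or channel axes.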

Usage

# Load the JAX / Equinox port together with its converted pretrained weights.
from ilex.models.brain_ldm_vae import BrainLDMVAE
model = BrainLDMVAE.from_pretrained('ilex-hub/brain_ldm.vae.1')

Authors

Walter H. L. Pinaya, Petru-Daniel Tudosiu, Jessica Dafflon, Pedro F. Da Costa, Virginia Fernandez, Parashkev Nachev, Sebastien Ourselin, M. Jorge Cardoso

Citation

Pinaya W. H. L., Tudosiu P.-D., Dafflon J., Da Costa P. F., Fernandez V., Nachev P., Ourselin S., Cardoso M. J. (2022). Brain imaging generation with latent diffusion models. MICCAI Workshop on Deep Generative Models, Springer, pp. 117-126.

References

Pinaya W. H. L., Tudosiu P.-D., Dafflon J., Da Costa P. F., Fernandez V., Nachev P., Ourselin S., Cardoso M. J. (2022). Brain imaging generation with latent diffusion models. MICCAI Workshop on Deep Generative Models, Springer, pp. 117-126.
Upstream bundle: huggingface.co/MONAI/brain_image_synthesis_latent_diffusion_model (autoencoder.pt; ~13.77M params).
Architecture: monai.networks.nets.AutoencoderKL (MONAI 1.4+).

License

HF Hub license tag: apache-2.0

Upstream license reference: https://www.apache.org/licenses/LICENSE-2.0

Copyright

Network architecture and pretrained weights -- copyright (c) MONAI Consortium, released under the Apache License 2.0. JAX / Equinox port code -- copyright (c) the ilex authors, released under the Apache-2.0 / GPL-3.0 dual license used by ilex itself.

Upstream source

Original weights / reference implementation: https://huggingface.co/MONAI/brain_image_synthesis_latent_diffusion_model

Provenance

This artefact was produced by ilex's save/load pipeline. The architecture is implemented in ilex.models.brain_ldm_vae.BrainLDMVAE and the weights have been converted from their upstream format. See the upstream source above for the canonical reference.


Safetensors

Model size: 13.8M params. Tensor type: F32.
