Brain LDM autoencoder (3D KL-VAE; latent-diffusion backbone) -- Brain LDM autoencoder v1 (autoencoder.pt)
Description
The image-to-latent compression sub-network of the MONAI Brain LDM release (Pinaya et al., MICCAI Workshop 2022). It is a 3D KL-regularized VAE that maps T1w brain MRI volumes into a 3-channel latent space at 8x spatial downsampling; the conditional diffusion U-Net in the composite Brain LDM operates on this latent representation. Architecture (MONAI-canonical): spatial_dims=3, in_channels=out_channels=1, latent_channels=3, channel ladder (64, 128, 128, 128), 2 residual blocks per level, GroupNorm with eps=1e-6, no attention (pure ResNet). Trained on 31,740 UK Biobank T1w MRIs at 1 mm isotropic resolution, intensity-normalised to [0, 1]. Shipped as its own bundle so that future 3D latent-diffusion models for medical imaging can reuse the same VAE backbone.
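The 8x downsampling and 3-channel latent described above imply a fixed relation between input and latent shapes. The following is a minimal sketch of that arithmetic, assuming one 2x downsampling step between consecutive levels of the channel ladder (which is consistent with the stated 8x factor for four levels). The helper name latent_shape and the example volume size are illustrative and not part of ilex or MONAI.

```python
from math import prod


def latent_shape(spatial_shape, channels=(64, 128, 128, 128), latent_channels=3):
    """Hypothetical helper: latent shape (C, D, H, W) for a given input volume.

    Assumes one 2x downsampling between consecutive ladder levels, so four
    levels give a total factor of 2 ** 3 = 8.
    """
    factor = 2 ** (len(channels) - 1)
    if any(s % factor for s in spatial_shape):
        raise ValueError(f"spatial dims must be multiples of {factor}: {spatial_shape}")
    return (latent_channels, *(s // factor for s in spatial_shape))


# Illustrative input size (an assumption, not taken from this card).
volume_shape = (160, 224, 160)
z_shape = latent_shape(volume_shape)
print(z_shape)  # (3, 20, 28, 20)

# Ratio of input voxels (1 channel) to latent values (3 channels): 8**3 / 3.
compression = prod(volume_shape) / prod(z_shape)
```

Under these assumptions each latent value summarises roughly 171 input voxels, which is why the diffusion model can be trained on the latent rather than on the full-resolution volume.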
Intended use
Research tool / prototype. Encode a 3D T1-weighted brain MRI volume (1 mm isotropic, intensity in [0, 1], each spatial dimension a multiple of 8) into a 3-channel latent representation, or decode a latent back to image space. Also intended as a standalone, reusable VAE backbone for future 3D latent-diffusion ports.
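The input requirements above (intensity in [0, 1], spatial dimensions a multiple of 8) can be met with a small preprocessing step. This is a minimal NumPy sketch, not the upstream preprocessing: prepare_volume is a hypothetical helper, min-max scaling is an assumed way of reaching [0, 1], and zero-padding at the end of each axis is an assumed padding scheme. Resampling to 1 mm isotropic is not covered here.

```python
import numpy as np


def prepare_volume(volume, multiple=8):
    """Hypothetical helper: scale to [0, 1] and zero-pad to a multiple of `multiple`.

    Min-max scaling and end-of-axis zero padding are assumptions made for
    illustration; the upstream pipeline may normalise and crop/pad differently.
    """
    vol = np.asarray(volume, dtype=np.float32)
    lo, hi = float(vol.min()), float(vol.max())
    if hi > lo:
        vol = (vol - lo) / (hi - lo)
    else:
        # Constant volume: nothing to scale, return zeros of the same shape.
        vol = np.zeros_like(vol)
    # (-s) % multiple is the number of voxels needed to reach the next multiple.
    pad = [(0, (-s) % multiple) for s in vol.shape]
    return np.pad(vol, pad, mode="constant", constant_values=0.0)


rng = np.random.default_rng(0)
raw = rng.uniform(0.0, 4095.0, size=(157, 189, 156))  # synthetic stand-in for a T1w volume
prepared = prepare_volume(raw)
print(prepared.shape)  # (160, 192, 160)
```

A channel axis (and a batch axis, if the model expects one) would still need to be added before encoding; the exact layout depends on the model's call signature, which this card does not specify.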
Usage
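# Import the JAX / Equinox port of the Brain LDM VAE.
from ilex.models.brain_ldm_vae import BrainLDMVAE
# Build the model and load the converted pretrained weights.
model = BrainLDMVAE.from_pretrained('ilex-hub/brain_ldm.vae.1')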
Authors
Walter H. L. Pinaya, Petru-Daniel Tudosiu, Jessica Dafflon, Pedro F. Da Costa, Virginia Fernandez, Parashkev Nachev, Sebastien Ourselin, M. Jorge Cardoso
Citation
Pinaya W. H. L., Tudosiu P.-D., Dafflon J., Da Costa P. F., Fernandez V., Nachev P., Ourselin S., Cardoso M. J. (2022). Brain imaging generation with latent diffusion models. MICCAI Workshop on Deep Generative Models, Springer, pp. 117-126.
References
- Pinaya W. H. L., Tudosiu P.-D., Dafflon J., Da Costa P. F., Fernandez V., Nachev P., Ourselin S., Cardoso M. J. (2022). Brain imaging generation with latent diffusion models. MICCAI Workshop on Deep Generative Models, Springer, pp. 117-126.
- Upstream bundle: huggingface.co/MONAI/brain_image_synthesis_latent_diffusion_model (autoencoder.pt; ~13.77M params).
- Architecture: monai.networks.nets.AutoencoderKL (MONAI 1.4+).
License
HF Hub license tag: apache-2.0
Upstream license reference: https://www.apache.org/licenses/LICENSE-2.0
Copyright
Network architecture and pretrained weights -- copyright (c) MONAI Consortium, released under the Apache License 2.0. JAX / Equinox port code -- copyright (c) the ilex authors, released under the Apache-2.0 / GPL-3.0 dual license used by ilex itself.
Upstream source
Original weights / reference implementation: https://huggingface.co/MONAI/brain_image_synthesis_latent_diffusion_model
Provenance
This artefact was produced by ilex's save/load pipeline. The architecture is implemented in ilex.models.brain_ldm_vae.BrainLDMVAE and the weights have been converted from their upstream format. See the upstream source above for the canonical reference.