---
license: mit
language:
- en
base_model:
- stable-diffusion-v1-5/stable-diffusion-v1-5
pipeline_tag: image-to-image
tags:
- medical
- CT
- MRI
- autoencoders
- VAE
- generative-ai
---

# OpenVAE

OpenVAE is a family of medical-image VAEs for CT and MRI. It provides pretrained latent backbones for diffusion models with higher anatomical fidelity than general-image VAEs.



## Contribution to Generative AI

OpenVAE brings domain-specific latent modeling to medical generative AI, making medical diffusion pipelines more reliable, reproducible, and easier to build.

## Quick Start

```python
import torch
from diffusers import AutoencoderKL

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Load OpenVAE and freeze it for inference
vae = AutoencoderKL.from_pretrained("SMILE-project/OpenVAE", subfolder="vae").to(device)
vae.requires_grad_(False)
vae.eval()

# Dummy input; replace with a preprocessed CT/MRI batch scaled to [-1, 1]
img = torch.randn(1, 3, 512, 512, device=device)

with torch.no_grad():
    # Encode to latent space
    latent = vae.encode(img).latent_dist.sample()
    # (For diffusion training, latents are typically scaled by vae.config.scaling_factor.)

    # Decode back to image space
    reconstruction = vae.decode(latent).sample
```
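
When wiring the VAE into a diffusion pipeline, it helps to know the latent geometry up front. Assuming the `4x`/`8x` in the model names denotes the spatial downsampling factor, and that the latent has 4 channels as in the Stable Diffusion KL-VAE (both assumptions; check the model config for actual values), the shape arithmetic is:

```python
def latent_shape(height, width, factor=4, latent_channels=4):
    """Latent tensor shape for a VAE that downsamples each spatial axis
    by `factor`. Defaults are assumptions modeled on the SD KL-VAE."""
    assert height % factor == 0 and width % factor == 0
    return (latent_channels, height // factor, width // factor)

print(latent_shape(512, 512))            # (4, 128, 128)
print(latent_shape(512, 512, factor=8))  # (4, 64, 64)
```

So a 512×512 input to a 4x model yields a 128×128 latent, a 16x reduction in spatial elements per channel.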


## Models

| Name | VAE Type | # Patients |
|------|----------|------------|
| stable-diffusion-v1-5 | KL-VAE | 0 |
| stable-diffusion-3.5-large | KL-VAE | 0 |
| OpenVAE-2D-4x-20K | KL-VAE | 20K |
| OpenVAE-2D-4x-100K | KL-VAE | 100K |
| OpenVAE-2D-4x-300K | KL-VAE | 300K |
| OpenVAE-2D-4x-PCCT_Enhanced | KL-VAE | 300K |
| OpenVAE-3D-4x-20K | KL-VAE | 20K |
| OpenVAE-3D-4x-100K | KL-VAE | 100K |
| OpenVAE-3D-4x-1M | KL-VAE | 1M |
| OpenVAE-3D-4x-100K-VQ | VQ-VAE | 100K |
| OpenVAE-3D-8x-100K-VQ | VQ-VAE | 100K |
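
The `-VQ` variants use a discrete codebook rather than a KL-regularized Gaussian latent: encoding snaps each latent vector to its nearest codebook entry. A minimal sketch of that quantization step (the codebook values here are made up for illustration):

```python
def vq_lookup(vec, codebook):
    """Return (index, entry) of the codebook vector closest to `vec`
    in squared L2 distance -- the quantization step of a VQ-VAE."""
    def sqdist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    idx = min(range(len(codebook)), key=lambda i: sqdist(vec, codebook[i]))
    return idx, codebook[idx]

codebook = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]  # toy 3-entry codebook
print(vq_lookup((0.9, 0.1), codebook))  # (1, (1.0, 0.0))
```

The decoder then sees only codebook entries, so the latent space is a grid of discrete indices rather than continuous vectors.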

## Benchmarks

| Name | LPIPS ↓ | SSIM ↑ | PSNR ↑ | DSC ↑ |
|------|---------|--------|--------|-------|
| stable-diffusion-v1-5 | - | - | - | - |
| stable-diffusion-3.5-large | - | - | - | - |
| OpenVAE-2D-4x-20K | - | - | - | - |
| OpenVAE-2D-4x-100K | - | - | - | - |
| OpenVAE-2D-4x-300K | - | - | - | - |
| OpenVAE-2D-4x-PCCT_Enhanced | - | - | - | - |
| OpenVAE-3D-4x-20K | - | - | - | - |
| OpenVAE-3D-4x-100K | - | - | - | - |
| OpenVAE-3D-4x-1M | - | - | - | - |
| OpenVAE-3D-4x-100K-VQ | - | - | - | - |
| OpenVAE-3D-8x-100K-VQ | - | - | - | - |
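
Of these metrics, LPIPS and SSIM are standard perceptual and structural similarity measures, DSC is presumably the Dice similarity coefficient, and PSNR follows directly from the reconstruction MSE. As a quick reference for interpreting the PSNR column:

```python
import math

def psnr(mse, max_val=1.0):
    """Peak signal-to-noise ratio in dB for a given mean squared error,
    with pixel intensities assumed to lie in [0, max_val]."""
    return 10.0 * math.log10(max_val ** 2 / mse)

print(round(psnr(1e-4), 1))  # 40.0
```

Higher PSNR means lower reconstruction error; an MSE of 1e-4 on [0, 1] images corresponds to 40 dB.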


## Citation

```bibtex
@article{liu2025see,
  title={See More, Change Less: Anatomy-Aware Diffusion for Contrast Enhancement},
  author={Liu, Junqi and Wu, Zejun and Bassi, Pedro RAS and Zhou, Xinze and Li, Wenxuan and Hamamci, Ibrahim E and Er, Sezgin and Lin, Tianyu and Luo, Yi and Płotka, Szymon and others},
  journal={arXiv preprint arXiv:2512.07251},
  year={2025},
  url={https://github.com/MrGiovanni/SMILE}
}
```