---
license: mit
language:
- en
base_model:
- stable-diffusion-v1-5/stable-diffusion-v1-5
pipeline_tag: image-to-image
tags:
- medical
- CT
- MRI
- autoencoders
- VAE
- generative-ai
---

# OpenVAE

OpenVAE is a family of medical-image VAEs for CT and MRI. It provides pretrained latent backbones for diffusion models with higher anatomical fidelity than general-image VAEs.



## Contribution to Generative AI

OpenVAE brings domain-specific latent modeling to medical generative AI, making medical diffusion pipelines more reliable, reproducible, and easier to build.

## Quick Start

```python
import torch
from diffusers import AutoencoderKL

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Load OpenVAE and freeze it for inference
vae = AutoencoderKL.from_pretrained("SMILE-project/OpenVAE", subfolder="vae").to(device)
vae.requires_grad_(False)
vae.eval()

# Dummy input; replace with a preprocessed CT/MRI batch scaled to [-1, 1]
img = torch.randn(1, 3, 512, 512, device=device)

with torch.no_grad():
    # Encode to latent space
    latent = vae.encode(img).latent_dist.sample()
    # (For diffusion training, latents are typically scaled by vae.config.scaling_factor.)

    # Decode back to image space
    reconstruction = vae.decode(latent).sample
```
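
When wiring the VAE into a diffusion pipeline, it helps to know the latent geometry up front. Assuming the `4x`/`8x` in the model names denotes the spatial downsampling factor, and that the latent has 4 channels as in the Stable Diffusion KL-VAE (both assumptions; check the model config for actual values), the shape arithmetic is:

```python
def latent_shape(height, width, factor=4, latent_channels=4):
    """Latent tensor shape for a VAE that downsamples each spatial axis
    by `factor`. Defaults are assumptions modeled on the SD KL-VAE."""
    assert height % factor == 0 and width % factor == 0
    return (latent_channels, height // factor, width // factor)

print(latent_shape(512, 512))            # (4, 128, 128)
print(latent_shape(512, 512, factor=8))  # (4, 64, 64)
```

So a 512×512 input to a 4x model yields a 128×128 latent, a 16x reduction in spatial elements per channel.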


## Models

| Name | VAE Type | # Patients |
|------|----------|------------|
| stable-diffusion-v1-5 | KL-VAE | 0 |
| stable-diffusion-3.5-large | KL-VAE | 0 |
| OpenVAE-2D-4x-20K | KL-VAE | 20K |
| OpenVAE-2D-4x-100K | KL-VAE | 100K |
| OpenVAE-2D-4x-300K | KL-VAE | 300K |
| OpenVAE-2D-4x-PCCT_Enhanced | KL-VAE | 300K |
| OpenVAE-3D-4x-20K | KL-VAE | 20K |
| OpenVAE-3D-4x-100K | KL-VAE | 100K |
| OpenVAE-3D-4x-1M | KL-VAE | 1M |
| OpenVAE-3D-4x-100K-VQ | VQ-VAE | 100K |
| OpenVAE-3D-8x-100K-VQ | VQ-VAE | 100K |
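
The `-VQ` variants use a discrete codebook rather than a KL-regularized Gaussian latent: encoding snaps each latent vector to its nearest codebook entry. A minimal sketch of that quantization step (the codebook values here are made up for illustration):

```python
def vq_lookup(vec, codebook):
    """Return (index, entry) of the codebook vector closest to `vec`
    in squared L2 distance -- the quantization step of a VQ-VAE."""
    def sqdist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    idx = min(range(len(codebook)), key=lambda i: sqdist(vec, codebook[i]))
    return idx, codebook[idx]

codebook = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]  # toy 3-entry codebook
print(vq_lookup((0.9, 0.1), codebook))  # (1, (1.0, 0.0))
```

The decoder then sees only codebook entries, so the latent space is a grid of discrete indices rather than continuous vectors.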

## Benchmarks

| Name | LPIPS ↓ | SSIM ↑ | PSNR ↑ | DSC ↑ |
|------|---------|--------|--------|-------|
| stable-diffusion-v1-5 | - | - | - | - |
| stable-diffusion-3.5-large | - | - | - | - |
| OpenVAE-2D-4x-20K | - | - | - | - |
| OpenVAE-2D-4x-100K | - | - | - | - |
| OpenVAE-2D-4x-300K | - | - | - | - |
| OpenVAE-2D-4x-PCCT_Enhanced | - | - | - | - |
| OpenVAE-3D-4x-20K | - | - | - | - |
| OpenVAE-3D-4x-100K | - | - | - | - |
| OpenVAE-3D-4x-1M | - | - | - | - |
| OpenVAE-3D-4x-100K-VQ | - | - | - | - |
| OpenVAE-3D-8x-100K-VQ | - | - | - | - |
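
Of these metrics, LPIPS and SSIM are standard perceptual and structural similarity measures, DSC is presumably the Dice similarity coefficient, and PSNR follows directly from the reconstruction MSE. As a quick reference for interpreting the PSNR column:

```python
import math

def psnr(mse, max_val=1.0):
    """Peak signal-to-noise ratio in dB for a given mean squared error,
    with pixel intensities assumed to lie in [0, max_val]."""
    return 10.0 * math.log10(max_val ** 2 / mse)

print(round(psnr(1e-4), 1))  # 40.0
```

Higher PSNR means lower reconstruction error; an MSE of 1e-4 on [0, 1] images corresponds to 40 dB.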


## Citation

```bibtex
@article{liu2025see,
  title={See More, Change Less: Anatomy-Aware Diffusion for Contrast Enhancement},
  author={Liu, Junqi and Wu, Zejun and Bassi, Pedro RAS and Zhou, Xinze and Li, Wenxuan and Hamamci, Ibrahim E and Er, Sezgin and Lin, Tianyu and Luo, Yi and Płotka, Szymon and others},
  journal={arXiv preprint arXiv:2512.07251},
  year={2025},
  url={https://github.com/MrGiovanni/SMILE}
}
```