# DC-AE-Lite

\[[github](https://github.com/dc-ai-projects/DC-Gen/tree/main)\]
Decoding is often the speed bottleneck in few-step latent diffusion models. We release DC-AE-Lite to address this problem. It shares the encoder of DC-AE-f32c32-SANA-1.0 while using a much smaller decoder. Without any training, it can be applied to diffusion models trained with DC-AE-f32c32-SANA-1.0.
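Because the latent space is unchanged, DC-AE-Lite can act as a drop-in decoder for an existing pipeline. Below is a minimal sketch of that swap, assuming a diffusers `SanaPipeline` whose `vae` is an `AutoencoderDC`; the SANA checkpoint name is an assumption for illustration, so substitute the pipeline your model actually uses.

```python
import torch
from diffusers import AutoencoderDC, SanaPipeline

# Assumed SANA checkpoint trained with DC-AE-f32c32-SANA-1.0; substitute your own.
pipe = SanaPipeline.from_pretrained(
    "Efficient-Large-Model/Sana_1600M_1024px_diffusers",
    torch_dtype=torch.float32,
).to("cuda")

# Swap in DC-AE-Lite. The encoder (and thus the latent space) matches
# DC-AE-f32c32-SANA-1.0, so only decoding changes; no retraining is required.
pipe.vae = AutoencoderDC.from_pretrained(
    "dc-ai/dc-ae-lite-f32c32-diffusers",
    torch_dtype=torch.float32,
).to("cuda")

image = pipe(prompt="a photo of a cat").images[0]
image.save("sana_with_dc_ae_lite.png")
```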
## Demo

<p align="center">
  <img src="./assets/combined.gif"><br>
  <b> DC-AE-Lite vs DC-AE reconstruction visual quality </b>
</p>

<p align="center">
  <img src="./assets/dc-ae-lite.jpg"><br>
  <b> DC-AE-Lite achieves 1.8× faster decoding than DC-AE with similar reconstruction quality </b>
</p>
## Usage
```python
from diffusers import AutoencoderDC
from PIL import Image
import torch
import torchvision.transforms as transforms
from torchvision.utils import save_image

device = torch.device("cuda")

# Load DC-AE-Lite: same encoder as DC-AE-f32c32-SANA-1.0, much smaller decoder.
dc_ae_lite = AutoencoderDC.from_pretrained("dc-ai/dc-ae-lite-f32c32-diffusers").to(device).eval()

# Map the input image to a [-1, 1] tensor, as the autoencoder expects.
transform = transforms.Compose([
    transforms.CenterCrop((1024, 1024)),
    transforms.ToTensor(),
    transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5)),
])
image = Image.open("assets/fig/girl.png").convert("RGB")
x = transform(image)[None].to(device)

with torch.no_grad():
    # f32c32: 32x spatial downsampling with 32 latent channels,
    # so a 1024x1024 input yields a 1x32x32x32 latent.
    latent = dc_ae_lite.encode(x).latent
    print(f"latent shape: {latent.shape}")
    y = dc_ae_lite.decode(latent).sample

# Undo the [-1, 1] normalization before saving.
save_image(y * 0.5 + 0.5, "demo_dc_ae_lite.png")
```
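To check the 1.8× decoding speedup on your own hardware, the sketch below times the two decoders side by side. The DC-AE checkpoint name `mit-han-lab/dc-ae-f32c32-sana-1.0-diffusers` is an assumption; substitute the autoencoder your model was actually trained with.

```python
import time
import torch
from diffusers import AutoencoderDC

device = torch.device("cuda")

def decode_time(ae, latent, warmup=3, iters=20):
    """Average decode latency in milliseconds over `iters` runs."""
    with torch.no_grad():
        for _ in range(warmup):
            ae.decode(latent)
        torch.cuda.synchronize()
        start = time.perf_counter()
        for _ in range(iters):
            ae.decode(latent)
        torch.cuda.synchronize()
    return (time.perf_counter() - start) / iters * 1e3

# Original DC-AE decoder; this checkpoint name is an assumption.
dc_ae = AutoencoderDC.from_pretrained("mit-han-lab/dc-ae-f32c32-sana-1.0-diffusers").to(device).eval()
dc_ae_lite = AutoencoderDC.from_pretrained("dc-ai/dc-ae-lite-f32c32-diffusers").to(device).eval()

# f32c32 latent for a 1024x1024 image.
latent = torch.randn(1, 32, 32, 32, device=device)
print(f"DC-AE:      {decode_time(dc_ae, latent):.1f} ms")
print(f"DC-AE-Lite: {decode_time(dc_ae_lite, latent):.1f} ms")
```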