---
license: apache-2.0
tags:
- vae
- autoencoder
- image
- stable-diffusion
- sdxl
- flux
- sana
- qwen
pipeline_tag: image-to-image
library_name: diffusers
language:
- en
---

# VAEs for Image Generation

This repository hosts a curated collection of VAE checkpoints used by diffusion and transformer-based image generation pipelines.

## Available VAEs

### AutoencoderKL

| Model | Source | Latent Channels |
|-------|--------|-----------------|
| SD21-VAE | Stable Diffusion 2.1 | 4 |
| SDXL-VAE | Stable Diffusion XL | 4 |
| SD35-VAE | Stable Diffusion 3.5 | 16 |
| FLUX1-VAE | FLUX.1 | 16 |
| FLUX2-VAE | FLUX.2 | 32 |
| SANA-VAE | SANA (DC-AE) | 32 |
| Qwen-VAE | Qwen-Image | 16 |

### VQModel

| Model | Source | latent_channels | num_vq_embeddings | vq_embed_dim | sample_size |
|-------|--------|-----------------|-------------------|--------------|-------------|
| VQDIFFUSION-VQVAE | VQ-Diffusion (microsoft/vq-diffusion-ithq) | 256 | 4096 | 128 | 32 |
| IBQ-VQVAE-1024 | IBQ (TencentARC/SEED) | 256 | 1024 | 256 | 32 |
| IBQ-VQVAE-8192 | IBQ (TencentARC/SEED) | 256 | 8192 | 256 | 32 |
| IBQ-VQVAE-16384 | IBQ (TencentARC/SEED) | 256 | 16384 | 256 | 32 |
| IBQ-VQVAE-262144 | IBQ (TencentARC/SEED) | 256 | 262144 | 256 | 32 |
| MOVQGAN-67M | MOVQGAN | 4 | 16384 | 4 | 256 |
| MOVQGAN-102M | MOVQGAN | 4 | 16384 | 4 | 256 |
| MOVQGAN-270M | MOVQGAN | 4 | 16384 | 4 | 256 |

## Diffusers usage

**AutoencoderKL** (SD, FLUX, SANA, Qwen, etc.):

```python
from diffusers import AutoencoderKL

vae = AutoencoderKL.from_pretrained(
    "BiliSakura/VAEs",
    subfolder="SDXL-VAE",
)
```

**VQModel** (VQ-Diffusion, IBQ, MOVQGAN):

```python
from diffusers import VQModel

vae = VQModel.from_pretrained(
    "BiliSakura/VAEs",
    subfolder="VQDIFFUSION-VQVAE",
)
```

## Notes

- All models are VAE checkpoints intended for inference use in their corresponding pipelines.
- Latent channel count is listed to help match with the correct backbone.
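
The channel-matching note can be turned into a quick sanity check before wiring a VAE into a pipeline. The sketch below is only an illustration: `LATENT_CHANNELS` and `latent_channels_match` are hypothetical names that are not part of this repository or of diffusers, the values are copied from the AutoencoderKL table, and the latent is assumed to use a `(batch, channels, height, width)` layout.

```python
# Latent channel counts copied from the AutoencoderKL table in this README.
# LATENT_CHANNELS and latent_channels_match are hypothetical helpers, not a diffusers API.
LATENT_CHANNELS = {
    "SD21-VAE": 4,
    "SDXL-VAE": 4,
    "SD35-VAE": 16,
    "FLUX1-VAE": 16,
    "FLUX2-VAE": 32,
    "SANA-VAE": 32,
    "Qwen-VAE": 16,
}


def latent_channels_match(subfolder: str, latent_shape: tuple) -> bool:
    """Return True if the channel dimension of a latent shape matches the table.

    Assumes a (batch, channels, height, width) layout for latent_shape.
    Raises KeyError for a subfolder that is not listed in the table.
    """
    expected = LATENT_CHANNELS[subfolder]
    return latent_shape[1] == expected


# Usage: a 4-channel latent fits SDXL-VAE but not FLUX1-VAE.
print(latent_channels_match("SDXL-VAE", (1, 4, 128, 128)))  # True
print(latent_channels_match("FLUX1-VAE", (1, 4, 128, 128)))  # False
```

After loading a checkpoint, the same comparison could be made against `vae.config.latent_channels`, assuming the checkpoint's config exposes that field. Some backbones pack or patchify latents before consuming them, so a mismatch against a backbone's input width is a prompt to check that pipeline's documentation, not proof of an incompatible VAE.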