---
license: apache-2.0
tags:
- vae
- autoencoder
- image
- stable-diffusion
- sdxl
- flux
- sana
- qwen
pipeline_tag: image-to-image
library_name: diffusers
language:
- en
---

# VAEs for Image Generation

This repository hosts a curated collection of VAE checkpoints used by diffusion- and transformer-based image generation pipelines.

## Available VAEs

### AutoencoderKL

| Model | Source | Latent Channels |
|-------|--------|-----------------|
| SD21-VAE | Stable Diffusion 2.1 | 4 |
| SDXL-VAE | Stable Diffusion XL | 4 |
| SD35-VAE | Stable Diffusion 3.5 | 16 |
| FLUX1-VAE | FLUX.1 | 16 |
| FLUX2-VAE | FLUX.2 | 32 |
| SANA-VAE | SANA (DC-AE) | 32 |
| Qwen-VAE | Qwen-Image | 16 |

### VQModel

| Model | Source | latent_channels | num_vq_embeddings | vq_embed_dim | sample_size |
|-------|--------|-----------------|-------------------|--------------|-------------|
| VQDIFFUSION-VQVAE | VQ-Diffusion (microsoft/vq-diffusion-ithq) | 256 | 4096 | 128 | 32 |
| IBQ-VQVAE-1024 | IBQ (TencentARC/SEED) | 256 | 1024 | 256 | 32 |
| IBQ-VQVAE-8192 | IBQ (TencentARC/SEED) | 256 | 8192 | 256 | 32 |
| IBQ-VQVAE-16384 | IBQ (TencentARC/SEED) | 256 | 16384 | 256 | 32 |
| IBQ-VQVAE-262144 | IBQ (TencentARC/SEED) | 256 | 262144 | 256 | 32 |
| MOVQGAN-67M | MOVQGAN | 4 | 16384 | 4 | 256 |
| MOVQGAN-102M | MOVQGAN | 4 | 16384 | 4 | 256 |
| MOVQGAN-270M | MOVQGAN | 4 | 16384 | 4 | 256 |

## Diffusers usage

**AutoencoderKL** (SD, FLUX, SANA, Qwen, etc.):

```python
from diffusers import AutoencoderKL

vae = AutoencoderKL.from_pretrained(
    "BiliSakura/VAEs",
    subfolder="SDXL-VAE",
)
```

**VQModel** (VQ-Diffusion, IBQ, MOVQGAN):

```python
from diffusers import VQModel

vae = VQModel.from_pretrained(
    "BiliSakura/VAEs",
    subfolder="VQDIFFUSION-VQVAE",
)
```

## Notes

- All models are VAE checkpoints intended for inference use in their corresponding pipelines.
- The latent channel count is listed to help match each VAE to the correct backbone.
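
To make that matching concrete, here is a small sketch of the latent-shape arithmetic. It assumes the commonly cited spatial downsampling factors (8x for the SD-family KL VAEs, 32x for DC-AE); `latent_shape` is a hypothetical helper, not part of this repo or of diffusers:

```python
def latent_shape(height, width, latent_channels, downsample_factor):
    """Latent tensor shape (C, H, W) a matching backbone must accept."""
    assert height % downsample_factor == 0 and width % downsample_factor == 0
    return (latent_channels, height // downsample_factor, width // downsample_factor)

# SDXL-VAE: 4 latent channels, assumed 8x downsampling
print(latent_shape(1024, 1024, 4, 8))    # (4, 128, 128)

# SANA-VAE (DC-AE): 32 latent channels, assumed 32x downsampling
print(latent_shape(1024, 1024, 32, 32))  # (32, 32, 32)
```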