---
license: apache-2.0
tags:
  - vae
  - autoencoder
  - image
  - stable-diffusion
  - sdxl
  - flux
  - sana
  - qwen
pipeline_tag: image-to-image
library_name: diffusers
language:
  - en
---

# VAEs for Image Generation

This repository hosts a curated collection of VAE and VQ-VAE checkpoints used by
diffusion and transformer-based image generation pipelines. Each checkpoint
lives in its own subfolder and loads through `diffusers`.

## Available VAEs

### AutoencoderKL

| Model | Source | Latent Channels |
|-------|--------|-----------------|
| SD21-VAE | Stable Diffusion 2.1 | 4 |
| SDXL-VAE | Stable Diffusion XL | 4 |
| SD35-VAE | Stable Diffusion 3.5 | 16 |
| FLUX1-VAE | FLUX.1 | 16 |
| FLUX2-VAE | FLUX.2 | 32 |
| SANA-VAE | SANA (DC-AE) | 32 |
| Qwen-VAE | Qwen-Image | 16 |

### VQModel

| Model | Source | latent_channels | num_vq_embeddings | vq_embed_dim | sample_size |
|-------|--------|-----------------|-------------------|--------------|-------------|
| VQDIFFUSION-VQVAE | VQ-Diffusion (microsoft/vq-diffusion-ithq) | 256 | 4096 | 128 | 32 |
| IBQ-VQVAE-1024 | IBQ (TencentARC/SEED) | 256 | 1024 | 256 | 32 |
| IBQ-VQVAE-8192 | IBQ (TencentARC/SEED) | 256 | 8192 | 256 | 32 |
| IBQ-VQVAE-16384 | IBQ (TencentARC/SEED) | 256 | 16384 | 256 | 32 |
| IBQ-VQVAE-262144 | IBQ (TencentARC/SEED) | 256 | 262144 | 256 | 32 |
| MOVQGAN-67M | MOVQGAN | 4 | 16384 | 4 | 256 |
| MOVQGAN-102M | MOVQGAN | 4 | 16384 | 4 | 256 |
| MOVQGAN-270M | MOVQGAN | 4 | 16384 | 4 | 256 |

## Diffusers usage

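**AutoencoderKL** (SD, FLUX, SANA, Qwen, etc.). If a checkpoint fails to load
with this class, check `_class_name` in that subfolder's `config.json`; upstream
`diffusers` ships some of these architectures under dedicated classes (for
example `AutoencoderDC` for DC-AE):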

```python
from diffusers import AutoencoderKL

vae = AutoencoderKL.from_pretrained(
    "BiliSakura/VAEs",
    subfolder="SDXL-VAE",
)
```

**VQModel** (VQ-Diffusion, IBQ, MOVQGAN):

```python
from diffusers import VQModel

vae = VQModel.from_pretrained(
    "BiliSakura/VAEs",
    subfolder="VQDIFFUSION-VQVAE",
)
```

## Notes

- All checkpoints are intended for inference in their corresponding pipelines.
- The latent channel count is listed so you can pair each VAE with a backbone
  that expects the same number of latent channels.
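
The channel-matching note can be turned into a small lookup. This is a sketch
only: the values are copied from the AutoencoderKL table above, and the
function names `compatible_vaes` and `check_vae` are hypothetical. A matching
channel count is a necessary check, not a guarantee that a VAE and backbone
share the same latent space.

```python
# Latent channel counts copied from the AutoencoderKL table in this README.
LATENT_CHANNELS = {
    "SD21-VAE": 4,
    "SDXL-VAE": 4,
    "SD35-VAE": 16,
    "FLUX1-VAE": 16,
    "FLUX2-VAE": 32,
    "SANA-VAE": 32,
    "Qwen-VAE": 16,
}


def compatible_vaes(backbone_latent_channels: int) -> list[str]:
    """Return subfolder names whose latent channel count matches the backbone."""
    return [
        name
        for name, channels in LATENT_CHANNELS.items()
        if channels == backbone_latent_channels
    ]


def check_vae(subfolder: str, backbone_latent_channels: int) -> None:
    """Raise ValueError if the VAE's latent channels differ from the backbone's."""
    channels = LATENT_CHANNELS[subfolder]
    if channels != backbone_latent_channels:
        raise ValueError(
            f"{subfolder} has {channels} latent channels, "
            f"but the backbone expects {backbone_latent_channels}"
        )
```

For example, `compatible_vaes(16)` lists the 16-channel entries from the table,
and `check_vae("SDXL-VAE", 16)` raises because SDXL-VAE has 4 latent channels.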