EQ-VAE: Equivariance Regularized Latent Space for Improved Generative Image Modeling
Arxiv: https://arxiv.org/abs/2502.09509
EQ-VAE regularizes the latent space of pretrained autoencoders by enforcing equivariance under scaling and rotation transformations.
This model is a regularized version of SD-VAE, fine-tuned with EQ-VAE regularization for 5 epochs on OpenImages.
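The idea above can be sketched in a few lines: if the latent is transformed (e.g. rotated), decoding it should yield the correspondingly transformed image, i.e. D(τ(E(x))) ≈ τ(x). The toy code below is a hypothetical illustration of such an equivariance loss, not the authors' implementation; the identity encoder/decoder stand in for a real VAE.

```python
import torch
import torch.nn.functional as F

def equivariance_loss(encoder, decoder, x):
    # Encode the image, apply a transformation (here a 90-degree rotation;
    # EQ-VAE also uses scaling) directly in latent space, and require the
    # decoded result to match the same transformation of the input image.
    z = encoder(x)                                  # latent tensor
    z_rot = torch.rot90(z, k=1, dims=(-2, -1))      # rotate the latent
    x_rot = torch.rot90(x, k=1, dims=(-2, -1))      # rotate the image
    return F.mse_loss(decoder(z_rot), x_rot)

# Minimal demo with identity encoder/decoder: equivariance holds exactly,
# so the loss is zero by construction.
x = torch.randn(1, 3, 8, 8)
loss = equivariance_loss(lambda t: t, lambda t: t, x)
print(loss.item())
```

In training, this term is added to the usual reconstruction objective, pushing the latent space to respect spatial transformations.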
# AutoencoderKL lives in diffusers, not transformers
from diffusers import AutoencoderKL
model = AutoencoderKL.from_pretrained("zelaki/eq-vae")
Reconstruction performance of eq-vae-ema on the ImageNet validation set:
| Metric | Score |
|---|---|
| FID | 0.82 |
| PSNR | 25.95 |
| LPIPS | 0.141 |
| SSIM | 0.72 |