	---
	license: mit
	library_name: diffusers
	pipeline_tag: image-to-image
	---

	## EQ-VAE: Equivariance Regularized Latent Space for Improved Generative Image Modeling
	arXiv: [https://arxiv.org/abs/2502.09509](https://arxiv.org/abs/2502.09509)
	Project Page: [https://eq-vae.github.io/](https://eq-vae.github.io/)
	Code: [https://github.com/zelaki/eqvae](https://github.com/zelaki/eqvae)

	EQ-VAE regularizes the latent space of pretrained autoencoders by enforcing equivariance under scaling and rotation transformations.
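
To make the notion of equivariance concrete: an encoder `E` is equivariant to a transformation `t` when encoding the transformed image gives the transformed latent, that is, `E(t(x))` equals `t(E(x))`. The sketch below is a toy illustration in plain Python and is not the EQ-VAE training code. All names are hypothetical: `avg_pool2` and `subsample2` are stand-in "encoders", `rot90` is a 90° rotation, and `equivariance_gap` measures the largest elementwise difference between the two sides.

```python
def rot90(grid):
    """Rotate a 2D list of numbers 90 degrees clockwise."""
    return [list(row) for row in zip(*grid[::-1])]


def avg_pool2(grid):
    """Toy encoder: 2x2 average pooling (assumes even height and width)."""
    h, w = len(grid), len(grid[0])
    return [
        [
            (grid[i][j] + grid[i][j + 1] + grid[i + 1][j] + grid[i + 1][j + 1]) / 4
            for j in range(0, w, 2)
        ]
        for i in range(0, h, 2)
    ]


def subsample2(grid):
    """Toy encoder: keep the top-left pixel of every 2x2 block."""
    return [row[::2] for row in grid[::2]]


def equivariance_gap(encode, transform, image):
    """Largest absolute difference between E(t(x)) and t(E(x))."""
    lhs = encode(transform(image))
    rhs = transform(encode(image))
    return max(abs(a - b) for ra, rb in zip(lhs, rhs) for a, b in zip(ra, rb))


# A 4x4 test image with values 0..15.
image = [[float(4 * r + c) for c in range(4)] for r in range(4)]

gap_pool = equivariance_gap(avg_pool2, rot90, image)        # pooling commutes with rotation
gap_subsample = equivariance_gap(subsample2, rot90, image)  # subsampling does not
print(gap_pool, gap_subsample)
```

Average pooling commutes with the rotation, so its gap is zero, while strided subsampling picks a different pixel from each block after rotation and leaves a nonzero gap. A regularizer of the kind described above would push a learned encoder toward the first behaviour.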

	---
	#### Model Description
	This model is a regularized version of [SD-VAE](https://github.com/CompVis/latent-diffusion). We fine-tuned it with EQ-VAE regularization for 5 epochs on OpenImages.

	## Model Usage
	These weights are intended to be used with the [EQ-VAE codebase](https://github.com/zelaki/eqvae) or the [CompVis Stable Diffusion codebase](https://github.com/CompVis/stable-diffusion).
	If you want to use the model with the 🧨 diffusers library instead, see [zelaki/eq-vae](https://huggingface.co/zelaki/eq-vae).

	### Quick Start with 🧨 Diffusers
	If you just want to use EQ-VAE to speed up 🚀 the training of your diffusion model, you can use our Hugging Face checkpoints 🤗. We provide two models: [eq-vae](https://huggingface.co/zelaki/eq-vae) and [eq-vae-ema](https://huggingface.co/zelaki/eq-vae-ema).

	```python
	from diffusers import AutoencoderKL
	eqvae = AutoencoderKL.from_pretrained("zelaki/eq-vae")
	```
	If you are looking for the weights in the original LDM format, you can find them here: [eq-vae-ldm](https://huggingface.co/zelaki/eq-vae-ldm) and [eq-vae-ema-ldm](https://huggingface.co/zelaki/eq-vae-ema-ldm).
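
As a rough sketch of how the diffusers checkpoint might be used once loaded, the snippet below encodes a batch of images into latents and decodes them back. It rests on several assumptions that are not stated in this card: inputs are float tensors of shape `(B, 3, H, W)` scaled to `[-1, 1]`, and the latent layout follows the usual SD-VAE convention of 8x spatial downsampling with 4 channels. The helper names `expected_latent_shape` and `roundtrip` are hypothetical.

```python
def expected_latent_shape(batch, height, width, downsample=8, latent_channels=4):
    """Latent shape for an image batch, assuming an SD-VAE-style f8 / 4-channel layout."""
    if height % downsample or width % downsample:
        raise ValueError("height and width should be multiples of the downsampling factor")
    return (batch, latent_channels, height // downsample, width // downsample)


def roundtrip(images, repo_id="zelaki/eq-vae"):
    """Encode and decode a batch of images (assumed scaled to [-1, 1])."""
    # Heavy dependencies are imported lazily so the shape helper works without them.
    import torch
    from diffusers import AutoencoderKL

    vae = AutoencoderKL.from_pretrained(repo_id).eval()
    with torch.no_grad():
        # encode() returns a distribution over latents; sample one latent per image.
        latents = vae.encode(images).latent_dist.sample()
        # decode() maps latents back to image space.
        reconstruction = vae.decode(latents).sample
    return latents, reconstruction
```

For example, `roundtrip(torch.rand(1, 3, 256, 256) * 2 - 1)` would download the checkpoint on first use, and the latent shape can be compared against `expected_latent_shape(1, 256, 256)`. When training a latent diffusion model, latents are often multiplied by `vae.config.scaling_factor` before being passed to the denoiser; whether that applies depends on your training setup.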

	#### Metrics
	Reconstruction performance of eq-vae-ema on the ImageNet validation set.

	| Metric | Score |
	|--------|-------|
	| FID    | 0.82  |
	| PSNR   | 25.95 |
	| LPIPS  | 0.141 |
	| SSIM   | 0.72  |
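
For readers unfamiliar with the PSNR row, here is a minimal sketch of the standard definition, `10 * log10(MAX^2 / MSE)`. The exact evaluation protocol behind the table is not described in this card, so the choices here (pixel values as a flat list in `[0, 1]`, a hypothetical `psnr` helper) are assumptions for illustration only.

```python
import math


def psnr(reference, reconstruction, max_value=1.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel sequences."""
    if not reference or len(reference) != len(reconstruction):
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error over all pixels.
    mse = sum((a - b) ** 2 for a, b in zip(reference, reconstruction)) / len(reference)
    if mse == 0:
        return math.inf  # identical inputs
    return 10 * math.log10(max_value ** 2 / mse)


# Every pixel is off by 0.1, so the MSE is about 0.01 and the PSNR about 20 dB.
reference = [0.0, 0.5, 1.0, 0.25]
reconstruction = [0.1, 0.4, 0.9, 0.35]
score = psnr(reference, reconstruction)
print(round(score, 2))
```

Higher PSNR means the reconstruction is closer to the original in a per-pixel sense; FID, LPIPS, and SSIM capture distributional, perceptual, and structural similarity respectively and need dedicated libraries to compute.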
	---

	## Acknowledgement
	This code is mainly built upon [LDM](https://github.com/CompVis/latent-diffusion) and [fastDiT](https://github.com/chuanyangjin/fast-DiT).

	## Citation
	```bibtex
	@inproceedings{
	kouzelis2025eqvae,
	title={{EQ}-{VAE}: Equivariance Regularized Latent Space for Improved Generative Image Modeling},
	author={Theodoros Kouzelis and Ioannis Kakogeorgiou and Spyros Gidaris and Nikos Komodakis},
	booktitle={Forty-second International Conference on Machine Learning},
	year={2025},
	url={https://openreview.net/forum?id=UWhW5YYLo6}
	}
	```