---
license: cc-by-nc-4.0
tags:
- latent-diffusion
- vae
- imagenet
- CVPR2026
paper: https://huggingface.co/papers/2510.18457
---
🧩 VFM-VAE
Pretrained checkpoints, features, and samples for VFM-VAE, introduced in the paper:
Tianci Bi et al., "VFM-VAE: Vision Foundation Models Can Be Good Tokenizers for Latent Diffusion Models", CVPR 2026 · arXiv:2510.18457
🎉 Accepted to CVPR 2026.
- 💻 Code: github.com/tianciB/VFM-VAE
- 📦 Includes: alignment data, ImageNet-256 & ImageNet-512 checkpoints, and diffusion samples
- 🪪 License: CC-BY-NC-4.0 © 2025 Tianci Bi, Xi'an Jiaotong University
📝 Citation
@inproceedings{bi2026vfmvae,
  title     = {Vision Foundation Models Can Be Good Tokenizers for Latent Diffusion Models},
  author    = {Bi, Tianci and Zhang, Xiaoyi and Lu, Yan and Zheng, Nanning},
  booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  year      = {2026}
}