---
license: apple-ascl
tags:
- mdm
---
# Matryoshka Diffusion Models
Matryoshka Diffusion Models were introduced in [the paper of the same name](https://huggingface.co/papers/2310.15111) by Jiatao Gu, Shuangfei Zhai, Yizhe Zhang, Josh Susskind, and Navdeep Jaitly.
This repository contains the **Flickr 64** checkpoint.

### Highlights
* This checkpoint was trained on a dataset of 50M text-image pairs collected from Flickr.
* This model was trained using a single UNet (not nested), and generates images with a resolution of 64 × 64.
* Despite being trained on relatively small datasets, MDMs show strong zero-shot capability in generating high-resolution images and videos.
## Checkpoints
| Model | Dataset | Resolution | Nested UNets |
|---------------------------------------------------------|------------|-------------|--------------|
| [mdm-flickr-64](https://hf.co/pcuenq/mdm-flickr-64) | Flickr 50M | 64 × 64 | ❎ |
| [mdm-flickr-256](https://hf.co/pcuenq/mdm-flickr-256) | Flickr 50M | 256 × 256 | ✅ |
| [mdm-flickr-1024](https://hf.co/pcuenq/mdm-flickr-1024) | Flickr 50M | 1024 × 1024 | ✅ |
## How to Use
Please refer to the [original repository](https://github.com/apple/ml-mdm) for training and inference instructions.
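
Before following those instructions, the checkpoint files need to be available locally. The sketch below is one possible way to fetch them, assuming the `huggingface_hub` package is installed; the repository IDs are taken from the links in the Checkpoints table, while the helper names `repo_id_for` and `download_checkpoint` are hypothetical and not part of `ml-mdm`. It makes no assumption about the file layout inside the repository or about how `ml-mdm` loads the weights.

```python
from pathlib import Path
from typing import Optional

# Repository IDs copied from the links in the Checkpoints table.
CHECKPOINTS = {
    64: "pcuenq/mdm-flickr-64",
    256: "pcuenq/mdm-flickr-256",
    1024: "pcuenq/mdm-flickr-1024",
}


def repo_id_for(resolution: int) -> str:
    """Return the Hub repository ID for a given output resolution (hypothetical helper)."""
    if resolution not in CHECKPOINTS:
        valid = ", ".join(str(r) for r in sorted(CHECKPOINTS))
        raise ValueError(f"No checkpoint for resolution {resolution}; choose one of: {valid}")
    return CHECKPOINTS[resolution]


def download_checkpoint(resolution: int = 64, local_dir: Optional[str] = None) -> Path:
    """Download every file of the selected checkpoint repository and return its local path."""
    # Imported lazily so that repo_id_for works even without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    return Path(snapshot_download(repo_id=repo_id_for(resolution), local_dir=local_dir))
```

Calling `download_checkpoint(64)` would download this repository's files into the local Hub cache (or into `local_dir` if given) and return the folder path, which can then be used with the inference scripts described in the original repository.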
## Citation
```
@misc{gu2023matryoshkadiffusionmodels,
title={Matryoshka Diffusion Models},
author={Jiatao Gu and Shuangfei Zhai and Yizhe Zhang and Josh Susskind and Navdeep Jaitly},
year={2023},
eprint={2310.15111},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2310.15111},
}
```