+---
+library_name: diffusers
+license: mit
+---
+# Autoregressive Image Generation without Vector Quantization
+## About
+This model (MAR) introduces a novel approach to autoregressive image generation by eliminating the need for vector quantization.
+Instead of relying on discrete tokens, the model operates in a continuous-valued space and uses a diffusion process to model the per-token probability distribution.
+It is trained with a Diffusion Loss function, which lets it produce high-quality images while keeping the speed advantages of autoregressive sequence modeling.
+Because no discrete tokenizer is required, the approach may also apply to other continuous-valued domains beyond image synthesis.
+It is based on the paper [Autoregressive Image Generation without Vector Quantization](https://arxiv.org/abs/2406.11838).
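+The Diffusion Loss mentioned above can be written out explicitly. The formulation below follows the paper rather than the code in this repository, so treat it as a sketch of the idea: $x$ is a ground-truth continuous token, $z$ is the conditioning vector that the autoregressive network produces for that token, $\varepsilon_\theta$ is a small noise-prediction network, and $\bar{\alpha}_t$ is the noise schedule.
+$$
+\mathcal{L}(z, x) = \mathbb{E}_{\varepsilon, t}\left[ \left\| \varepsilon - \varepsilon_\theta(x_t \mid t, z) \right\|^2 \right],
+\qquad x_t = \sqrt{\bar{\alpha}_t}\, x + \sqrt{1 - \bar{\alpha}_t}\, \varepsilon,
+\qquad \varepsilon \sim \mathcal{N}(0, I)
+$$
+Since $z$ is an output of the autoregressive network, the gradient of this loss with respect to $z$ also trains that network, so no discrete codebook or categorical cross-entropy is needed.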
+## Usage
+Load the model through the Hugging Face `DiffusionPipeline`. You can optionally customize parameters such as the model type, the number of autoregressive steps, and the class labels.
+```python
+from diffusers import DiffusionPipeline
+# load the pretrained model
+pipeline = DiffusionPipeline.from_pretrained("jadechoghari/mar", trust_remote_code=True, custom_pipeline="jadechoghari/mar")
+# generate an image with the model
+generated_image = pipeline(
+    model_type="mar_base",  # choose from 'mar_base', 'mar_large', or 'mar_huge'
+    seed=42,                # set a seed for reproducibility
+    num_ar_steps=64,        # number of autoregressive steps
+    class_labels=[207, 360, 388],  # provide valid ImageNet class labels
+    cfg_scale=4,            # classifier-free guidance scale
+    output_dir="./images",   # directory to save generated images
+)
+# display the generated image
+generated_image.show()
+```
+<p align="center">
+  <img src="https://github.com/LTH14/mar/raw/main/demo/visual.png" width="500">
+</p>
+This code loads the model, configures it for image generation, and saves the output to a specified directory.
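+The `cfg_scale` argument in the call above sets the classifier-free guidance scale. How this pipeline applies it internally is not shown here, so the following is an assumption based on the paper: at sampling time the noise prediction combines a class-conditional vector $z_c$ and an unconditional vector $z_u$, weighted by the guidance scale $\omega$.
+$$
+\varepsilon = \varepsilon_\theta(x_t \mid t, z_u) + \omega \cdot \left( \varepsilon_\theta(x_t \mid t, z_c) - \varepsilon_\theta(x_t \mid t, z_u) \right)
+$$
+With $\omega = 1$ this reduces to the purely conditional prediction. Larger values push samples harder toward the requested class, typically at some cost in diversity.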
+We offer three pre-trained MAR models in `safetensors` format:
+- `mar-base.safetensors`
+- `mar-large.safetensors`
+- `mar-huge.safetensors`
+This is a Hugging Face Diffusers/GPU implementation of the paper [Autoregressive Image Generation without Vector Quantization](https://arxiv.org/abs/2406.11838).
+The official PyTorch implementation is available in [this repository](https://github.com/LTH14/mar).
+```bibtex
+@article{li2024autoregressive,
+  title={Autoregressive Image Generation without Vector Quantization},
+  author={Li, Tianhong and Tian, Yonglong and Li, He and Deng, Mingyang and He, Kaiming},
+  journal={arXiv preprint arXiv:2406.11838},
+  year={2024}
+}
+```
+## Acknowledgements
+We thank Congyue Deng and Xinlei Chen for helpful discussions. We thank
+Google TPU Research Cloud (TRC) for granting us access to TPUs, and Google Cloud Platform for
+supporting GPU resources.
+A large portion of the code in this repo is based on [MAE](https://github.com/facebookresearch/mae), [MAGE](https://github.com/LTH14/mage) and [DiT](https://github.com/facebookresearch/DiT).
+## Contact
+If you have any questions, feel free to contact me through email (tianhong@mit.edu). Enjoy!