---
license: apache-2.0
---
# BitDance: Scaling Autoregressive Generative Models with Binary Tokens
<p align="center">
<a href="https://bitdance.csuhan.com/">
<img
src="https://img.shields.io/badge/Project-Page-0A66C2?logo=chromewebstore&logoColor=0A66C2"
alt="Project Page"
/>
</a>
<a href="https://arxiv.org/abs/2602.14041">
<img
src="https://img.shields.io/badge/arXiv paper-2602.14041-red?logo=arxiv&logoColor=red"
alt="BitDance Paper on arXiv"
/>
</a>
<a href="https://github.com/shallowdream204/BitDance">
<img
src="https://img.shields.io/badge/Github-Code-blue?logo=github&logoColor=white"
alt="BitDance GitHub"
/>
</a>
<a href="https://huggingface.co/collections/shallowdream204/bitdance">
<img
src="https://img.shields.io/badge/Weights-BitDance-yellow?logo=huggingface&logoColor=yellow"
alt="BitDance Model"
/>
</a>
<a href="https://huggingface.co/spaces/shallowdream204/BitDance-14B-64x">
<img
src="https://img.shields.io/badge/Play with BitDance!-Demo-orange?logo=huggingface&logoColor=yellow"
alt="BitDance Demo"
/>
</a>
</p>
<p align="center"><img src="https://github.com/shallowdream204/BitDance/raw/main/assets/speed.webp" width="90%"></p>

> [Yuang Ai*](https://shallowdream204.github.io/), [Jiaming Han*](https://csuhan.com/), [Shaobin Zhuang*](https://scholar.google.com/citations?user=PGaDirMAAAAJ), [Weijia Mao](https://scholar.google.com/citations?user=S7bGBmkyNtEC), [Xuefeng Hu](https://xuefenghu.me/), [Ziyan Yang](https://ziyanyang.github.io/), [Zhenheng Yang](https://zhenheny.github.io/), [Huaibo Huang†](https://hhb072.github.io/), [Xiangyu Yue†](https://xyue.io/), [Hao Chen*†‡](https://haochen-rye.github.io/)
>
> <sup>*</sup> Equal Contribution <sup>†</sup> Corresponding Author <sup>‡</sup> Project Lead
>
> For visual generation, discrete autoregressive models often struggle with poor tokenizer reconstruction, difficulty sampling from large vocabularies, and slow token-by-token generation. We present **BitDance**, which addresses these challenges via a large-vocabulary binary tokenizer, a binary diffusion head for sampling in a large discrete space, and a next-patch diffusion paradigm that enables efficient multi-token prediction. BitDance is an open-source discrete autoregressive foundation model with 14B parameters, trained on large-scale multimodal tokens. While maintaining the standard language modeling paradigm for text tokens, BitDance employs a next-patch diffusion paradigm for visual tokens to predict multiple tokens in parallel, up to 64 per step. This unified multimodal framework is simple, scalable, and capable of efficiently generating high-resolution, photorealistic images.

This repository hosts the **BitDance** model weights for class-conditional image generation on ImageNet. For setup and usage instructions, see our [GitHub repository](https://github.com/shallowdream204/BitDance).
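
To make the arithmetic behind "binary tokens" and "up to 64 per step" concrete, here is a minimal sketch. It is not taken from the BitDance codebase: the function names are hypothetical, and the bit width (16) and token count (1024) are illustrative assumptions rather than the released model's configuration. It only shows that a token made of `d` bits spans a vocabulary of `2**d` entries, and that predicting `k` tokens per step cuts the number of autoregressive steps to `ceil(n / k)`.

```python
import math
from typing import Sequence


def binary_vocab_size(bits_per_token: int) -> int:
    """Number of distinct tokens when each token is a vector of d bits."""
    return 2 ** bits_per_token


def bits_to_index(bits: Sequence[int]) -> int:
    """Pack a binary token (most significant bit first) into one integer id."""
    index = 0
    for b in bits:
        if b not in (0, 1):
            raise ValueError("each bit must be 0 or 1")
        index = (index << 1) | b
    return index


def decoding_steps(num_tokens: int, tokens_per_step: int) -> int:
    """Autoregressive steps needed when several tokens are predicted per step."""
    return math.ceil(num_tokens / tokens_per_step)


if __name__ == "__main__":
    # Illustrative values only; not the BitDance configuration.
    print(binary_vocab_size(16))     # 65536
    print(bits_to_index([1, 0, 1]))  # 5
    print(decoding_steps(1024, 1))   # 1024 (token-by-token)
    print(decoding_steps(1024, 64))  # 16 (64 tokens per step)
```

Under these assumed numbers, a 1024-token image needs 16 steps at 64 tokens per step instead of 1024 steps token by token. The actual tokenizer settings and sampling code are in the GitHub repository.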
## 🪪 License
BitDance is licensed under the Apache 2.0 license.
## 📖 Citation
If you find our work useful for your research, please consider citing our paper:
```bibtex
@article{ai2026bitdance,
title = {BitDance: Scaling Autoregressive Generative Models with Binary Tokens},
author = {Ai, Yuang and Han, Jiaming and Zhuang, Shaobin and Mao, Weijia and Hu, Xuefeng and Yang, Ziyan and Yang, Zhenheng and Huang, Huaibo and Yue, Xiangyu and Chen, Hao},
journal = {arXiv preprint arXiv:2602.14041},
year = {2026}
}
```