---
license: apache-2.0
---

# BitDance: Scaling Autoregressive Generative Models with Binary Tokens

<p align="center">
  <a href="https://bitdance.csuhan.com/">
    <img
      src="https://img.shields.io/badge/Project-Page-0A66C2?logo=chromewebstore&logoColor=0A66C2"
      alt="Project Page"
    />
  </a>
  <a href="https://arxiv.org/abs/2602.14041">
    <img
      src="https://img.shields.io/badge/arXiv paper-2602.14041-red?logo=arxiv&logoColor=red"
      alt="BitDance Paper on arXiv"
    />
  </a>
  <a href="https://github.com/shallowdream204/BitDance">
    <img
      src="https://img.shields.io/badge/Github-Code-blue?logo=github&logoColor=white"
      alt="BitDance GitHub"
    />
  </a>
  <a href="https://huggingface.co/collections/shallowdream204/bitdance">
    <img
      src="https://img.shields.io/badge/Weights-BitDance-yellow?logo=huggingface&logoColor=yellow"
      alt="BitDance Model"
    />
  </a>
  <a href="https://huggingface.co/spaces/shallowdream204/BitDance-14B-64x">
    <img
      src="https://img.shields.io/badge/Play with BitDance!-Demo-orange?logo=huggingface&logoColor=yellow"
      alt="BitDance Demo"
    />
  </a>
</p>

<p align="center"><img src="https://github.com/shallowdream204/BitDance/raw/main/assets/speed.webp" width="90%"></p>


> [Yuang Ai*](https://shallowdream204.github.io/), [Jiaming Han*](https://csuhan.com/), [Shaobin Zhuang*](https://scholar.google.com/citations?user=PGaDirMAAAAJ), [Weijia Mao](https://scholar.google.com/citations?user=S7bGBmkyNtEC), [Xuefeng Hu](https://xuefenghu.me/), [Ziyan Yang](https://ziyanyang.github.io/), [Zhenheng Yang](https://zhenheny.github.io/), [Huaibo Huang†](https://hhb072.github.io/), [Xiangyu Yue†](https://xyue.io/), [Hao Chen*†‡](https://haochen-rye.github.io/)
>
> <sup>*</sup> Equal Contribution <sup>†</sup> Corresponding Author <sup>‡</sup> Project Lead
>
> For visual generation, discrete autoregressive models often struggle with poor tokenizer reconstruction, difficulties in sampling from large vocabularies, and slow token-by-token generation speeds. We present **BitDance**, which addresses these challenges via a large-vocabulary binary tokenizer, a binary diffusion head for sampling in a large discrete space, and a next-patch diffusion paradigm that enables efficient multi-token prediction. BitDance is an open-source discrete autoregressive foundation model with 14B parameters, trained on large-scale multimodal tokens. While maintaining the standard language modeling paradigm for text tokens, BitDance employs a next-patch diffusion paradigm for visual tokens to predict multiple tokens in parallel, up to 64 per step. This unified multimodal framework is simple, scalable, and capable of efficiently generating high-resolution, photorealistic images.

This repository hosts the **BitDance** model weights for class-conditional image generation on ImageNet. For detailed instructions, please visit our [GitHub repository](https://github.com/shallowdream204/BitDance).
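
To make two ideas from the abstract concrete, here is a minimal Python sketch. It assumes each visual token is a fixed-width vector of bits, so a width of `d` bits implies a vocabulary of `2**d` entries, and it counts how many sequential steps are needed when up to 64 tokens are predicted per step. The helper names, the 32-bit width, and the 1024-token image are hypothetical values chosen for illustration; they are not taken from the BitDance code or paper.

```python
import math


def vocab_size(num_bits: int) -> int:
    """Number of distinct tokens a num_bits-wide binary code can represent."""
    return 2 ** num_bits


def bits_to_index(bits: list[int]) -> int:
    """Pack a binary token (most significant bit first) into its integer index."""
    index = 0
    for bit in bits:
        if bit not in (0, 1):
            raise ValueError("binary tokens may only contain 0 or 1")
        index = (index << 1) | bit
    return index


def index_to_bits(index: int, num_bits: int) -> list[int]:
    """Unpack an integer index back into num_bits binary digits (MSB first)."""
    return [(index >> shift) & 1 for shift in range(num_bits - 1, -1, -1)]


def decoding_steps(num_tokens: int, tokens_per_step: int) -> int:
    """Sequential steps needed when tokens_per_step tokens are emitted at once."""
    return math.ceil(num_tokens / tokens_per_step)


if __name__ == "__main__":
    NUM_BITS = 32            # hypothetical code width, not taken from the paper
    NUM_IMAGE_TOKENS = 1024  # hypothetical token count for one image

    # Implicit vocabulary size of a 32-bit binary token.
    print(vocab_size(NUM_BITS))

    # Round trip between a small binary token and its integer index.
    token = [1, 0, 1, 1]
    index = bits_to_index(token)
    print(index, index_to_bits(index, len(token)))

    # Token-by-token decoding versus 64 tokens per step.
    print(decoding_steps(NUM_IMAGE_TOKENS, 1), decoding_steps(NUM_IMAGE_TOKENS, 64))
```

Under these assumed numbers, the implicit vocabulary is far too large to enumerate with an ordinary softmax, and predicting 64 tokens per step cuts the number of sequential steps for one image by a factor of 64. The actual bit width, token counts, and sampling procedure are documented in the GitHub repository and paper.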


## 🪪 License

BitDance is licensed under the Apache 2.0 license.

## 📖 Citation
If you find our work useful for your research, please consider citing our paper:
```bibtex
@article{ai2026bitdance,
  title   = {BitDance: Scaling Autoregressive Generative Models with Binary Tokens},
  author  = {Ai, Yuang and Han, Jiaming and Zhuang, Shaobin and Mao, Weijia and Hu, Xuefeng and Yang, Ziyan and Yang, Zhenheng and Huang, Huaibo and Yue, Xiangyu and Chen, Hao},
  journal = {arXiv preprint arXiv:2602.14041},
  year    = {2026}
}
```