BitDance: Scaling Autoregressive Generative Models with Binary Tokens

Yuang Ai*, Jiaming Han*, Shaobin Zhuang*, Weijia Mao, Xuefeng Hu, Ziyan Yang, Zhenheng Yang, Huaibo Huang†, Xiangyu Yue†, Hao Chen*†‡

^* Equal Contribution ^† Corresponding Author ^‡ Project Lead

For visual generation, discrete autoregressive models often struggle with poor tokenizer reconstruction, difficulties in sampling from large vocabularies, and slow token-by-token generation speeds. We present BitDance, which addresses these challenges via a large-vocabulary binary tokenizer, a binary diffusion head for sampling in large discrete space, and a next-patch diffusion paradigm that enables efficient multitoken prediction. BitDance is an open-source discrete autoregressive foundation model with 14B parameters, trained on large-scale multimodal tokens. While maintaining the standard language modeling paradigm for text tokens, BitDance employs a next-patch diffusion paradigm for visual tokens to predict multiple tokens in parallel—up to 64 per step. This unified multimodal framework is simple, scalable, and capable of efficiently generating high-resolution, photorealistic images.

This repository hosts the BitDance model weights for class-conditional image generation on ImageNet. For detailed instructions, please visit our GitHub repository.

🪪 License

BitDance is licensed under the Apache 2.0 license.

📖 Citation

If you find our work useful for your research, please consider citing our paper:

@article{ai2026bitdance,
  title   = {BitDance: Scaling Autoregressive Generative Models with Binary Tokens},
  author  = {Ai, Yuang and Han, Jiaming and Zhuang, Shaobin and Hu, Xuefeng and Yang, Ziyan and Yang, Zhenheng and Huang, Huaibo and Yue, Xiangyu and Chen, Hao},
  journal = {arXiv preprint arXiv:2602.14041},
  year    = {2026}
}

Downloads last month: -; Downloads are not tracked for this model. How to track

Inference Providers NEW

This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for shallowdream204/BitDance-ImageNet

Finetunes

1 model

Collection including shallowdream204/BitDance-ImageNet

BitDance

Collection

BitDance: Open-source autoregressive model with binary visual tokens. A research project for building powerful multimodal autoregressive model. • 10 items • Updated Mar 2 • 11

Paper for shallowdream204/BitDance-ImageNet

BitDance: Scaling Autoregressive Generative Models with Binary Tokens

Paper • 2602.14041 • Published Feb 15 • 54