	---
	license: apache-2.0
	---

	# BitDance: Scaling Autoregressive Generative Models with Binary Tokens

	<p align="center">
	<a href="https://bitdance.csuhan.com/">
	<img
	src="https://img.shields.io/badge/Project-Page-0A66C2?logo=chromewebstore&logoColor=0A66C2"
	alt="Project Page"
	/>
	</a>
	<a href="https://arxiv.org/abs/2602.14041">
	<img
	src="https://img.shields.io/badge/arXiv paper-2602.14041-red?logo=arxiv&logoColor=red"
	alt="BitDance Paper on arXiv"
	/>
	</a>
	<a href="https://github.com/shallowdream204/BitDance">
	<img
	src="https://img.shields.io/badge/Github-Code-blue?logo=github&logoColor=white"
	alt="BitDance GitHub"
	/>
	</a>
	<a href="https://huggingface.co/collections/shallowdream204/bitdance">
	<img
	src="https://img.shields.io/badge/Weights-BitDance-yellow?logo=huggingface&logoColor=yellow"
	alt="BitDance Model"
	/>
	</a>
	<a href="https://huggingface.co/spaces/shallowdream204/BitDance-14B-64x">
	<img
	src="https://img.shields.io/badge/Play with BitDance!-Demo-orange?logo=huggingface&logoColor=yellow"
	alt="BitDance Demo"
	/>
	</a>
	</p>

	<p align="center"><img src="https://github.com/shallowdream204/BitDance/raw/main/assets/speed.webp" width="90%"></p>


	> [Yuang Ai](https://shallowdream204.github.io/), [Jiaming Han](https://csuhan.com/), [Shaobin Zhuang](https://scholar.google.com/citations?user=PGaDirMAAAAJ), [Weijia Mao](https://scholar.google.com/citations?user=S7bGBmkyNtEC), [Xuefeng Hu](https://xuefenghu.me/), [Ziyan Yang](https://ziyanyang.github.io/), [Zhenheng Yang](https://zhenheny.github.io/), [Huaibo Huang†](https://hhb072.github.io/), [Xiangyu Yue†](https://xyue.io/), [Hao Chen†‡](https://haochen-rye.github.io/)
	>
	> <sup>*</sup> Equal Contribution  <sup>†</sup> Corresponding Author  <sup>‡</sup> Project Lead
	>
	> For visual generation, discrete autoregressive models often struggle with poor tokenizer reconstruction, difficulties in sampling from large vocabularies, and slow token-by-token generation speeds. We present BitDance, which addresses these challenges via a large-vocabulary binary tokenizer, a binary diffusion head for sampling in large discrete space, and a next-patch diffusion paradigm that enables efficient multitoken prediction. BitDance is an open-source discrete autoregressive foundation model with 14B parameters, trained on large-scale multimodal tokens. While maintaining the standard language modeling paradigm for text tokens, BitDance employs a next-patch diffusion paradigm for visual tokens to predict multiple tokens in parallel—up to 64 per step. This unified multimodal framework is simple, scalable, and capable of efficiently generating high-resolution, photorealistic images.
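
To make the abstract's two scaling claims concrete, here is a small arithmetic sketch. It assumes a token is a fixed-width binary code, so the vocabulary grows as a power of two, and that each decoding step emits a fixed number of tokens. The bit width and image token count below are hypothetical values chosen for illustration; only the figure of 64 tokens per step comes from the abstract.

```python
import math


def vocab_size(bits_per_token: int) -> int:
    """Number of distinct tokens a binary code of the given width can express."""
    return 2 ** bits_per_token


def decoding_steps(num_tokens: int, tokens_per_step: int) -> int:
    """Autoregressive steps needed when each step emits up to `tokens_per_step` tokens."""
    return math.ceil(num_tokens / tokens_per_step)


# Hypothetical values for illustration only; they are not taken from the paper.
BITS_PER_TOKEN = 32
NUM_TOKENS = 1024

print(vocab_size(BITS_PER_TOKEN))      # 4294967296
print(decoding_steps(NUM_TOKENS, 1))   # 1024 (one token per step)
print(decoding_steps(NUM_TOKENS, 64))  # 16 (64 tokens per step)
```

Under these assumed numbers, predicting 64 tokens per step cuts the number of sequential steps by a factor of 64, which is the efficiency argument the abstract makes for next-patch prediction.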

	This repository hosts the BitDance model weights for class-conditional image generation on ImageNet. For detailed instructions, please visit our [GitHub repository](https://github.com/shallowdream204/BitDance).
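
As a minimal sketch of fetching these weights locally, the snippet below uses `snapshot_download` from the `huggingface_hub` package. The repository id is assumed from this model card's location on the Hub, and the local directory name is a hypothetical choice. The sketch only downloads files; loading the model and sampling images are covered by the GitHub instructions, not here.

```python
from pathlib import Path

# Assumed from this model card's location on the Hub; adjust if the repo moves.
REPO_ID = "shallowdream204/BitDance-ImageNet"


def download_weights(local_dir: str = "BitDance-ImageNet") -> Path:
    """Download every file in the model repository and return the local folder."""
    # Imported here so the module can be read without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    # snapshot_download returns the path of the folder holding the files.
    return Path(snapshot_download(repo_id=REPO_ID, local_dir=local_dir))
```

Calling `download_weights()` requires network access and `pip install huggingface_hub`; the returned path can then be passed to the scripts in the GitHub repository.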


	## 🪪 License

	BitDance is licensed under the Apache 2.0 license.

	## 📖 Citation
	If you find our work useful for your research, please consider citing our paper:
	```bibtex
	@article{ai2026bitdance,
	title = {BitDance: Scaling Autoregressive Generative Models with Binary Tokens},
	author = {Ai, Yuang and Han, Jiaming and Zhuang, Shaobin and Mao, Weijia and Hu, Xuefeng and Yang, Ziyan and Yang, Zhenheng and Huang, Huaibo and Yue, Xiangyu and Chen, Hao},
	journal = {arXiv preprint arXiv:2602.14041},
	year = {2026}
	}
	```