---
license: apache-2.0
pipeline_tag: image-feature-extraction
tags:
- image-generation
- autoregressive
- vision
---
BitDance: Scaling Autoregressive Generative Models with Binary Tokens

This repository hosts the binary visual tokenizer weights for BitDance, as introduced in the paper BitDance: Scaling Autoregressive Generative Models with Binary Tokens.
BitDance addresses challenges in discrete autoregressive modeling with three components: a large-vocabulary binary tokenizer, a binary diffusion head for sampling in a large discrete space, and a next-patch diffusion paradigm that enables efficient multi-token prediction.
Binary Visual Tokenizers
We release three binary tokenizers with different downsampling ratios and vocabulary sizes.
| Vocabulary Size | Down Ratio | IN-256 PSNR | IN-256 SSIM | Weight | Config |
|---|---|---|---|---|---|
| $2^{32}$ | 16 | 24.90 | 0.72 | ae_d16c32.safetensors | ae_d16c32_config.json |
| $2^{128}$ | 32 | 23.26 | 0.67 | ae_d32c128.safetensors | ae_d32c128_config.json |
| $2^{256}$ | 32 | 25.29 | 0.74 | ae_d32c256.safetensors | ae_d32c256_config.json |
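
To make the table easier to compare, the sketch below estimates how many tokens and how many bits each tokenizer spends on one 256×256 image. It is a back-of-the-envelope calculation, not code from the BitDance repository: it assumes one token is emitted per `down_ratio × down_ratio` pixel patch and that each token carries `log2(vocabulary size)` bits. The names `TokenizerSpec` and `token_budget` are hypothetical helpers introduced only for this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TokenizerSpec:
    """One row of the table above (hypothetical helper, not a BitDance class)."""
    name: str            # weight file stem
    down_ratio: int      # spatial downsampling factor
    bits_per_token: int  # log2 of the vocabulary size


SPECS = [
    TokenizerSpec("ae_d16c32", 16, 32),
    TokenizerSpec("ae_d32c128", 32, 128),
    TokenizerSpec("ae_d32c256", 32, 256),
]


def token_budget(spec, height=256, width=256):
    """Return (grid_h, grid_w, num_tokens, total_bits) for one image.

    Assumes one token per down_ratio x down_ratio patch, each token
    carrying spec.bits_per_token bits.
    """
    if height % spec.down_ratio or width % spec.down_ratio:
        raise ValueError("image size must be divisible by the down ratio")
    grid_h = height // spec.down_ratio
    grid_w = width // spec.down_ratio
    num_tokens = grid_h * grid_w
    return grid_h, grid_w, num_tokens, num_tokens * spec.bits_per_token


for spec in SPECS:
    grid_h, grid_w, num_tokens, total_bits = token_budget(spec)
    print(
        f"{spec.name}: {grid_h}x{grid_w} grid, {num_tokens} tokens, "
        f"{total_bits} bits, vocabulary 2^{spec.bits_per_token}"
    )
```

Under these assumptions, `ae_d16c32` and `ae_d32c128` both spend 8192 bits on a 256×256 image (256 tokens of 32 bits versus 64 tokens of 128 bits), while `ae_d32c256` spends 16384 bits over 64 tokens.
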
For detailed instructions and full generative model weights, please visit our GitHub repository.
License
BitDance is licensed under the Apache 2.0 license.
Citation
If you find our work useful for your research, please consider citing our paper:
@article{ai2026bitdance,
title = {BitDance: Scaling Autoregressive Generative Models with Binary Tokens},
author = {Ai, Yuang and Han, Jiaming and Zhuang, Shaobin and Mao, Weijia and Hu, Xuefeng and Yang, Ziyan and Yang, Zhenheng and Huang, Huaibo and Yue, Xiangyu and Chen, Hao},
journal = {arXiv preprint arXiv:2602.14041},
year = {2026}
}