metadata

license: apache-2.0
pipeline_tag: image-feature-extraction
tags:
  - image-generation
  - autoregressive
  - vision

BitDance: Scaling Autoregressive Generative Models with Binary Tokens

This repository hosts the binary visual tokenizer weights for BitDance, as introduced in the paper BitDance: Scaling Autoregressive Generative Models with Binary Tokens.

BitDance addresses challenges in discrete autoregressive modeling with three components: a large-vocabulary binary tokenizer, a binary diffusion head for sampling in a large discrete space, and a next-patch diffusion paradigm that enables efficient multi-token prediction.
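To make the "large-vocabulary binary" idea concrete, the sketch below treats one token as a vector of bits and shows that a token with `num_bits` binary channels corresponds to exactly one of `2**num_bits` vocabulary entries. The helper names `pack_bits` and `unpack_bits` and the most-significant-bit-first ordering are assumptions made for illustration; they are not taken from the BitDance code.

```python
def pack_bits(bits):
    """Pack a sequence of 0/1 values into one integer index (MSB first).

    Hypothetical helper: the bit ordering is an assumption, not BitDance's.
    """
    index = 0
    for bit in bits:
        if bit not in (0, 1):
            raise ValueError(f"expected 0 or 1, got {bit!r}")
        index = (index << 1) | bit
    return index


def unpack_bits(index, num_bits):
    """Inverse of pack_bits: recover the num_bits-long bit vector."""
    if not 0 <= index < (1 << num_bits):
        raise ValueError("index does not fit in num_bits bits")
    return [(index >> shift) & 1 for shift in range(num_bits - 1, -1, -1)]


if __name__ == "__main__":
    token = [1, 0, 1, 1] * 8          # a 32-bit token
    index = pack_bits(token)
    print(index, index < 2 ** 32)     # every 32-bit token maps into a 2^32 vocabulary
    print(unpack_bits(index, 32) == token)
```

Python integers have arbitrary precision, so the same round trip also works for 128-bit and 256-bit tokens, even though a vocabulary of that size could never be stored as an explicit table.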

🦄 Binary Visual Tokenizers

We release three binary tokenizers with different downsampling ratios and vocabulary sizes.

| Vocabulary Size | Downsampling Ratio | IN-256 PSNR | IN-256 SSIM | Weight | Config |
|---|---|---|---|---|---|
| $2^{32}$ | 16 | 24.90 | 0.72 | ae_d16c32.safetensors | ae_d16c32_config.json |
| $2^{128}$ | 32 | 23.26 | 0.67 | ae_d32c128.safetensors | ae_d32c128_config.json |
| $2^{256}$ | 32 | 25.29 | 0.74 | ae_d32c256.safetensors | ae_d32c256_config.json |
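The numbers in the table translate directly into a token budget per image. The following sketch works through that arithmetic for a 256×256 RGB input. It assumes that the downsampling ratio applies to both height and width, and that a vocabulary of size $2^c$ means $c$ bits per token; the `TokenizerSpec` class and `summarize` function are hypothetical names introduced for this example only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TokenizerSpec:
    """Hypothetical container for one row of the table above."""
    name: str
    down_ratio: int       # spatial downsampling factor (assumed per side)
    bits_per_token: int   # log2 of the vocabulary size

    def grid(self, height, width):
        """Token grid size for an image of the given resolution."""
        if height % self.down_ratio or width % self.down_ratio:
            raise ValueError("resolution must be divisible by down_ratio")
        return height // self.down_ratio, width // self.down_ratio

    def bits_per_image(self, height, width):
        rows, cols = self.grid(height, width)
        return rows * cols * self.bits_per_token


SPECS = [
    TokenizerSpec("ae_d16c32", down_ratio=16, bits_per_token=32),
    TokenizerSpec("ae_d32c128", down_ratio=32, bits_per_token=128),
    TokenizerSpec("ae_d32c256", down_ratio=32, bits_per_token=256),
]


def summarize(height=256, width=256):
    """Return (name, tokens, bits, compression vs. 8-bit RGB) per tokenizer."""
    raw_bits = height * width * 3 * 8
    result = []
    for spec in SPECS:
        rows, cols = spec.grid(height, width)
        bits = spec.bits_per_image(height, width)
        result.append((spec.name, rows * cols, bits, raw_bits // bits))
    return result


if __name__ == "__main__":
    for name, tokens, bits, ratio in summarize():
        print(f"{name}: {tokens} tokens, {bits} bits, {ratio}x smaller than raw RGB")
```

Under these assumptions, `ae_d16c32` and `ae_d32c128` spend the same 8192 bits on a 256×256 image (256 tokens of 32 bits versus 64 tokens of 128 bits), while `ae_d32c256` doubles the budget to 16384 bits with the same 64 tokens.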

For detailed instructions and full generative model weights, please visit our GitHub repository.

🪪 License

BitDance is licensed under the Apache 2.0 license.

📖 Citation

If you find our work useful for your research, please consider citing our paper:

@article{ai2026bitdance,
  title   = {BitDance: Scaling Autoregressive Generative Models with Binary Tokens},
  author  = {Ai, Yuang and Han, Jiaming and Zhuang, Shaobin and Hu, Xuefeng and Mao, Weijia and Yang, Ziyan and Yang, Zhenheng and Huang, Huaibo and Yue, Xiangyu and Chen, Hao},
  journal = {arXiv preprint arXiv:2602.14041},
  year    = {2026}
}