---
license: apache-2.0
pipeline_tag: image-feature-extraction
tags:
  - image-generation
  - autoregressive
  - vision
---

BitDance: Scaling Autoregressive Generative Models with Binary Tokens

This repository hosts the binary visual tokenizer weights for BitDance, as introduced in the paper BitDance: Scaling Autoregressive Generative Models with Binary Tokens.

BitDance addresses the challenges of discrete autoregressive modeling with three components: a large-vocabulary binary tokenizer, a binary diffusion head for sampling in a large discrete space, and a next-patch diffusion paradigm that enables efficient multi-token prediction.
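To make the idea of a "binary token" concrete, here is a minimal sketch in plain Python. It assumes that each token is a vector of bits obtained by thresholding latent channels at zero; that thresholding rule and the helper names (`binarize`, `bits_to_index`, `index_to_bits`) are illustrative assumptions, not the BitDance implementation. What it does show with certainty is the counting argument: a token made of `c` bits addresses a vocabulary of `2**c` entries.

```python
# Illustrative sketch only. Sign-based binarization is an ASSUMPTION used to
# demonstrate how c bits map to a vocabulary of 2**c entries; it is not taken
# from the BitDance code base.

def binarize(latent):
    """Map each real-valued channel to one bit: 1 if positive, else 0."""
    return [1 if x > 0 else 0 for x in latent]


def bits_to_index(bits):
    """Pack bits (most significant first) into a single integer token id."""
    index = 0
    for bit in bits:
        index = (index << 1) | bit
    return index


def index_to_bits(index, num_bits):
    """Unpack an integer token id back into num_bits bits (MSB first)."""
    return [(index >> shift) & 1 for shift in range(num_bits - 1, -1, -1)]


if __name__ == "__main__":
    latent = [0.7, -1.2, 0.1, -0.3]          # hypothetical 4-channel latent
    bits = binarize(latent)
    token_id = bits_to_index(bits)
    print(bits, token_id, 2 ** len(bits))    # bits, token id, vocabulary size
```

With 32 bits per token the vocabulary already has 2^32 entries, which is far too many to enumerate in a softmax layer. Under this reading, that is the "large discrete space" the binary diffusion head is meant to sample from.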

🦄 Binary Visual Tokenizers

We release three binary tokenizers with different downsampling ratios and vocabulary sizes.

| Vocabulary Size | Down Ratio | IN-256 PSNR | IN-256 SSIM | Weight | Config |
| --- | --- | --- | --- | --- | --- |
| $2^{32}$ | 16 | 24.90 | 0.72 | ae_d16c32.safetensors | ae_d16c32_config.json |
| $2^{128}$ | 32 | 23.26 | 0.67 | ae_d32c128.safetensors | ae_d32c128_config.json |
| $2^{256}$ | 32 | 25.29 | 0.74 | ae_d32c256.safetensors | ae_d32c256_config.json |
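The table's vocabulary sizes and down ratios can be turned into token counts with a little arithmetic. The sketch below assumes that "IN-256" means 256×256 inputs, that the down ratio applies to each spatial side, and that a vocabulary of 2^c corresponds to c bits per token. The `TokenizerSpec` class and `summarize` function are hypothetical helpers written for this illustration, not part of the released code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TokenizerSpec:
    """One table row. bits_per_token is log2 of the vocabulary size."""
    name: str
    bits_per_token: int
    down_ratio: int

    def tokens_per_image(self, image_size: int) -> int:
        # ASSUMPTION: the down ratio applies to each spatial side.
        if image_size % self.down_ratio != 0:
            raise ValueError("image_size must be divisible by down_ratio")
        side = image_size // self.down_ratio
        return side * side

    def bits_per_image(self, image_size: int) -> int:
        return self.tokens_per_image(image_size) * self.bits_per_token


# Values copied from the table above.
SPECS = [
    TokenizerSpec("ae_d16c32", bits_per_token=32, down_ratio=16),
    TokenizerSpec("ae_d32c128", bits_per_token=128, down_ratio=32),
    TokenizerSpec("ae_d32c256", bits_per_token=256, down_ratio=32),
]


def summarize(image_size: int = 256) -> dict:
    """Return {name: (tokens_per_image, bits_per_image)} for every spec."""
    return {
        spec.name: (spec.tokens_per_image(image_size),
                    spec.bits_per_image(image_size))
        for spec in SPECS
    }


if __name__ == "__main__":
    for name, (tokens, bits) in summarize(256).items():
        print(f"{name}: {tokens} tokens, {bits} bits per 256x256 image")
```

Under these assumptions, the two ratio-32 tokenizers represent a 256×256 image with 64 tokens instead of 256, and `ae_d32c256` carries twice as many bits per image as the other two, which is in line with it having the highest PSNR and SSIM in the table.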

For detailed instructions and full generative model weights, please visit our GitHub repository.

🪪 License

BitDance is licensed under the Apache 2.0 license.

📖 Citation

If you find our work useful for your research, please consider citing our paper:

@article{ai2026bitdance,
  title   = {BitDance: Scaling Autoregressive Generative Models with Binary Tokens},
  author  = {Ai, Yuang and Han, Jiaming and Zhuang, Shaobin and Hu, Xuefeng and {Mao, Weijia} and Hu, Xuefeng and Yang, Ziyan and Yang, Zhenheng and Huang, Huaibo and Yue, Xiangyu and Chen, Hao},
  journal = {arXiv preprint arXiv:2602.14041},
  year    = {2026}
}