Improve model card for BitDance: Add metadata and tokenizer details

#1
by nielsr HF Staff - opened
Files changed (1)
  1. README.md +22 -16
README.md CHANGED
@@ -1,5 +1,10 @@
  ---
  license: apache-2.0
  ---

  # BitDance: Scaling Autoregressive Generative Models with Binary Tokens
@@ -11,10 +16,10 @@ license: apache-2.0
  alt="Project Page"
  />
  </a>
- <a href="https://arxiv.org/abs/2602.14041">
  <img
- src="https://img.shields.io/badge/arXiv paper-2602.14041-red?logo=arxiv&logoColor=red"
- alt="BitDance Paper on arXiv"
  />
  </a>
  <a href="https://github.com/shallowdream204/BitDance">
@@ -29,36 +34,37 @@ license: apache-2.0
  alt="BitDance Model"
  />
  </a>
- <a href="https://huggingface.co/spaces/shallowdream204/BitDance-14B-64x">
- <img
- src="https://img.shields.io/badge/Play with BitDance!-Demo-orange?logo=huggingface&logoColor=yellow"
- alt="BitDance Demo"
- />
- </a>
  </p>

  <p align="center"><img src="https://github.com/shallowdream204/BitDance/raw/main/assets/speed.webp" width="90%"></p>


- > [Yuang Ai*](https://shallowdream204.github.io/), [Jiaming Han*](https://csuhan.com/), [Shaobin Zhuang*](https://scholar.google.com/citations?user=PGaDirMAAAAJ), [Weijia Mao](https://scholar.google.com/citations?user=S7bGBmkyNtEC), [Xuefeng Hu](https://xuefenghu.me/), [Ziyan Yang](https://ziyanyang.github.io/), [Zhenheng Yang](https://zhenheny.github.io/), [Huaibo Huang†](https://hhb072.github.io/), [Xiangyu Yue†](https://xyue.io/), [Hao Chen*†‡](https://haochen-rye.github.io/)
- >
- > <sup>*</sup> Equal Contribution&nbsp;&nbsp;<sup>†</sup> Corresponding Author&nbsp;&nbsp;<sup>‡</sup> Project Lead
- >
- > For visual generation, discrete autoregressive models often struggle with poor tokenizer reconstruction, difficulties in sampling from large vocabularies, and slow token-by-token generation speeds. We present **BitDance**, which addresses these challenges via a large-vocabulary binary tokenizer, a binary diffusion head for sampling in large discrete space, and a next-patch diffusion paradigm that enables efficient multitoken prediction. BitDance is an open-source discrete autoregressive foundation model with 14B parameters, trained on large-scale multimodal tokens. While maintaining the standard language modeling paradigm for text tokens, BitDance employs a next-patch diffusion paradigm for visual tokens to predict multiple tokens in parallel, up to 64 per step. This unified multimodal framework is simple, scalable, and capable of efficiently generating high-resolution, photorealistic images.

- This repository hosts the **BitDance** tokenizer weights. For detailed instructions, please visit our [GitHub repository](https://github.com/shallowdream204/BitDance).


  ## 🪪 License

  BitDance is licensed under the Apache 2.0 license.

  ## 📖 Citation
  If you find our work useful for your research, please consider citing our paper:
  ```bibtex
  @article{ai2026bitdance,
  title = {BitDance: Scaling Autoregressive Generative Models with Binary Tokens},
- author = {Ai, Yuang and Han, Jiaming and Zhuang, Shaobin and Hu, Xuefeng and Yang, Ziyan and Yang, Zhenheng and Huang, Huaibo and Yue, Xiangyu and Chen, Hao},
  journal = {arXiv preprint arXiv:2602.14041},
  year = {2026}
  }
 
  ---
  license: apache-2.0
+ pipeline_tag: image-feature-extraction
+ tags:
+ - image-generation
+ - autoregressive
+ - vision
  ---

  # BitDance: Scaling Autoregressive Generative Models with Binary Tokens
 
  alt="Project Page"
  />
  </a>
+ <a href="https://huggingface.co/papers/2602.14041">
  <img
+ src="https://img.shields.io/badge/Paper-arXiv-red?logo=arxiv&logoColor=red"
+ alt="BitDance Paper"
  />
  </a>
  <a href="https://github.com/shallowdream204/BitDance">
 
  alt="BitDance Model"
  />
  </a>
  </p>

  <p align="center"><img src="https://github.com/shallowdream204/BitDance/raw/main/assets/speed.webp" width="90%"></p>

+ This repository hosts the **binary visual tokenizer** weights for BitDance, as introduced in the paper [BitDance: Scaling Autoregressive Generative Models with Binary Tokens](https://huggingface.co/papers/2602.14041).

+ BitDance addresses three challenges of discrete autoregressive visual generation (poor tokenizer reconstruction, difficult sampling from large vocabularies, and slow token-by-token generation) via a large-vocabulary binary tokenizer, a binary diffusion head for sampling in a large discrete space, and a next-patch diffusion paradigm that enables efficient multi-token prediction.

+ ## 🦄 Binary Visual Tokenizers

+ We release three binary tokenizers with different downsampling ratios and vocabulary sizes.
+
+ | Vocabulary Size | Down Ratio | IN-256 PSNR | IN-256 SSIM | Weight | Config |
+ |:---:|:---:|:---:|:---:|:---:|:---:|
+ | $2^{32}$ | 16 | 24.90 | 0.72 | [ae_d16c32.safetensors](https://huggingface.co/shallowdream204/BitDance-Tokenizer/blob/main/ae_d16c32.safetensors) | [ae_d16c32_config.json](https://huggingface.co/shallowdream204/BitDance-Tokenizer/blob/main/ae_d16c32_config.json) |
+ | $2^{128}$ | 32 | 23.26 | 0.67 | [ae_d32c128.safetensors](https://huggingface.co/shallowdream204/BitDance-Tokenizer/blob/main/ae_d32c128.safetensors) | [ae_d32c128_config.json](https://huggingface.co/shallowdream204/BitDance-Tokenizer/blob/main/ae_d32c128_config.json) |
+ | $2^{256}$ | 32 | 25.29 | 0.74 | [ae_d32c256.safetensors](https://huggingface.co/shallowdream204/BitDance-Tokenizer/blob/main/ae_d32c256.safetensors) | [ae_d32c256_config.json](https://huggingface.co/shallowdream204/BitDance-Tokenizer/blob/main/ae_d32c256_config.json) |
+
+ For detailed instructions and full generative model weights, please visit our [GitHub repository](https://github.com/shallowdream204/BitDance).
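
To make the table rows easier to compare, the following minimal sketch works out how many tokens and how many bits each tokenizer would spend on a 256×256 image. It rests on two assumptions that are not stated in this README: that a vocabulary of size $2^c$ means each token is a $c$-bit binary code, and that the tokenizer emits one token per `down_ratio × down_ratio` patch. The `TokenizerSpec` class and its field names are hypothetical helpers for illustration only, not part of the BitDance code base.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class TokenizerSpec:
    """One row of the tokenizer table (illustrative helper, not a BitDance API)."""

    name: str        # weight file stem as listed in the table
    down_ratio: int  # spatial downsampling factor
    bits: int        # assumed bits per token, i.e. vocabulary size 2**bits

    def grid(self, height: int, width: int) -> Tuple[int, int]:
        # Assumption: image sides must be divisible by the downsampling ratio.
        if height % self.down_ratio or width % self.down_ratio:
            raise ValueError("image size must be a multiple of down_ratio")
        return height // self.down_ratio, width // self.down_ratio

    def num_tokens(self, height: int, width: int) -> int:
        rows, cols = self.grid(height, width)
        return rows * cols

    def total_bits(self, height: int, width: int) -> int:
        # Every token is assumed to carry exactly `bits` binary values.
        return self.num_tokens(height, width) * self.bits


# Values copied from the table above.
SPECS = [
    TokenizerSpec("ae_d16c32", down_ratio=16, bits=32),
    TokenizerSpec("ae_d32c128", down_ratio=32, bits=128),
    TokenizerSpec("ae_d32c256", down_ratio=32, bits=256),
]

if __name__ == "__main__":
    for spec in SPECS:
        tokens = spec.num_tokens(256, 256)
        bits = spec.total_bits(256, 256)
        print(f"{spec.name}: {tokens} tokens, {bits} bits per 256x256 image")
```

Under these assumptions, `ae_d16c32` uses 256 tokens and `ae_d32c128` uses 64 tokens, both with the same budget of 8192 bits, while `ae_d32c256` uses 64 tokens and 16384 bits.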

  ## 🪪 License

  BitDance is licensed under the Apache 2.0 license.

  ## 📖 Citation
+
  If you find our work useful for your research, please consider citing our paper:
  ```bibtex
  @article{ai2026bitdance,
  title = {BitDance: Scaling Autoregressive Generative Models with Binary Tokens},
+ author = {Ai, Yuang and Han, Jiaming and Zhuang, Shaobin and Mao, Weijia and Hu, Xuefeng and Yang, Ziyan and Yang, Zhenheng and Huang, Huaibo and Yue, Xiangyu and Chen, Hao},
  journal = {arXiv preprint arXiv:2602.14041},
  year = {2026}
  }