Update README.md
README.md
CHANGED
@@ -1,3 +1,4 @@
title: BitDance-14B-64x
emoji: π
colorFrom: red
@@ -10,4 +11,40 @@ license: apache-2.0
short_description: Open-source autoregressive model with binary visual tokens.
---
---
title: BitDance-14B-64x
emoji: π
colorFrom: red
short_description: Open-source autoregressive model with binary visual tokens.
---

# BitDance-14B-64x

BitDance is a scalable autoregressive (AR) foundation model with **14 billion parameters**. It generates images by predicting **binary visual tokens** instead of standard codebook indices.

## Key Features
- **Binary Visual Tokenizer:** Scales the token space to $2^{256}$ possible states (256 bits per token), providing a highly expressive yet compact discrete representation.
- **Binary Diffusion Head:** Replaces standard categorical classification with continuous-space diffusion for high-precision sampling in massive discrete spaces.
- **Next-Patch Diffusion:** A parallel decoding paradigm that predicts up to **64 tokens per step**, achieving a roughly 30x speedup over standard AR models at 1024x1024 resolution.
- **Multimodal Foundation:** Trained on large-scale multimodal data, excelling in prompt adherence, spatial reasoning, and high-fidelity photorealistic rendering.
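
The numbers in the feature list can be made concrete with a little arithmetic. The sketch below is illustrative only: the 256 bits per token and the 64 tokens per step come from the list above, while the total token count per image (`HYPOTHETICAL_NUM_TOKENS`) is an assumption made for the example and is not stated in this README.

```python
import math

BITS_PER_TOKEN = 256   # from the feature list: 2^256 states per token
TOKENS_PER_STEP = 64   # from the feature list: up to 64 tokens per step

# ASSUMPTION: a made-up token count for one image; the README does not give it.
HYPOTHETICAL_NUM_TOKENS = 4096


def vocab_size(bits: int) -> int:
    """Number of distinct states a token with `bits` binary channels can take."""
    return 2 ** bits


def decoding_steps(num_tokens: int, tokens_per_step: int) -> int:
    """Sequential steps needed when `tokens_per_step` tokens are emitted at once."""
    return math.ceil(num_tokens / tokens_per_step)


states = vocab_size(BITS_PER_TOKEN)
serial_steps = decoding_steps(HYPOTHETICAL_NUM_TOKENS, 1)
parallel_steps = decoding_steps(HYPOTHETICAL_NUM_TOKENS, TOKENS_PER_STEP)

print(f"states per token: roughly 10^{len(str(states)) - 1}")
print(f"steps with 1 token per step:   {serial_steps}")
print(f"steps with 64 tokens per step: {parallel_steps}")
```

A categorical softmax over that many states is not practical to materialize, which is consistent with the list's description of a diffusion head replacing categorical classification. The step count is only part of the picture: fewer steps does not translate one-to-one into wall-clock speedup, since each parallel step may cost more than a single-token step.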

## Performance
| Model | Tokens/Step | Speedup (vs. standard AR) | Target Resolution |
| :--- | :--- | :--- | :--- |
| BitDance-14B-16x | 16 | ~8x | 512px & 1024px |
| **BitDance-14B-64x** | **64** | **~30x** | **1024px** |
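
One way to read the table is to compare each reported speedup with the number of tokens predicted per step. The sketch below does that, under the simplifying assumption (not stated in the source) that the ideal speedup would equal the tokens-per-step count. The speedup values are the approximate figures from the table.

```python
# Figures copied from the table above; the speedups are approximate ("~").
variants = {
    "BitDance-14B-16x": {"tokens_per_step": 16, "speedup": 8.0},
    "BitDance-14B-64x": {"tokens_per_step": 64, "speedup": 30.0},
}


def parallel_efficiency(tokens_per_step: int, speedup: float) -> float:
    """Reported speedup as a fraction of the assumed ideal (linear) speedup."""
    return speedup / tokens_per_step


for name, row in variants.items():
    eff = parallel_efficiency(row["tokens_per_step"], row["speedup"])
    print(f"{name}: {eff:.2f} of an ideal {row['tokens_per_step']}x")
```

Under this reading, both variants reach about half of the linear ideal, so the reported speedup grows close to proportionally when moving from 16 to 64 tokens per step.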

## Quick Start (Local Setup)

To run the model locally with the `diffusers` library:

```python
import torch
from diffusers import DiffusionPipeline

# Load the custom BitDance pipeline from the Hub in bfloat16 and move it to the GPU.
pipe = DiffusionPipeline.from_pretrained(
    "shallowdream204/BitDance-14B-64x",
    custom_pipeline="shallowdream204/BitDance-14B-64x",
    torch_dtype=torch.bfloat16,
).to("cuda")

prompt = "A cinematic portrait of a futuristic explorer in a neon-lit cyberpunk city, ultra-detailed, 8k."
# Generate a single 1024x1024 image and save it to disk.
image = pipe(prompt=prompt, height=1024, width=1024).images[0]
image.save("output.png")