---
title: BitDance-14B-64x
emoji: π
colorFrom: red
colorTo: indigo
sdk: gradio
sdk_version: 6.5.1
app_file: app.py
pinned: false
license: apache-2.0
short_description: Open-source autoregressive model with binary visual tokens.
---
# BitDance-14B-64x
BitDance is a scalable autoregressive (AR) foundation model with **14 billion parameters**. It introduces a novel approach to image generation by predicting **binary visual tokens** instead of standard codebook indices.
## Key Features
- **Binary Visual Tokenizer:** Scales the per-token state space to $2^{256}$ possible states (up to 256 bits of entropy per token), providing a highly expressive yet compact discrete representation.
- **Binary Diffusion Head:** Replaces standard categorical classification with continuous-space diffusion for high-precision sampling in massive discrete spaces.
- **Next-Patch Diffusion:** A parallel decoding paradigm that predicts up to **64 tokens per step**, for a roughly 30x speedup over standard AR decoding at 1024x1024 resolution.
- **Multimodal Foundation:** Trained on large-scale multimodal data, excelling in prompt adherence, spatial reasoning, and high-fidelity photorealistic rendering.
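
To make the $2^{256}$ figure concrete, here is a minimal sketch that assumes each visual token is a vector of 256 independent bits. The helper names (`bits_to_index`, `index_to_bits`) are hypothetical and not part of the BitDance code. The sketch only illustrates why a categorical softmax over every state is impractical at this scale, which is the motivation the feature list gives for the binary diffusion head.

```python
import math

# Assumption: one token = 256 binary values, matching the 2^256 figure above.
BITS_PER_TOKEN = 256
NUM_STATES = 2 ** BITS_PER_TOKEN


def bits_to_index(bits):
    """Pack a sequence of 0/1 values (most significant bit first) into one integer."""
    index = 0
    for bit in bits:
        if bit not in (0, 1):
            raise ValueError("bits must contain only 0 or 1")
        index = (index << 1) | bit
    return index


def index_to_bits(index, width=BITS_PER_TOKEN):
    """Unpack an integer into a list of `width` bits (most significant bit first)."""
    return [(index >> shift) & 1 for shift in range(width - 1, -1, -1)]


if __name__ == "__main__":
    # A codebook-style classifier would need one logit per state.
    print(f"states per token: {NUM_STATES:.3e}")
    print(f"bits per token:   {math.log2(NUM_STATES):.0f}")

    # Round trip on a small example token.
    token = [1, 0, 1] + [0] * (BITS_PER_TOKEN - 3)
    assert index_to_bits(bits_to_index(token)) == token
```

Under this assumption, predicting the 256 bits directly keeps the output size fixed at 256 values per token, whereas an index-based head would need one output per state.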
## Performance
| Model | Tokens/Step | Speedup (vs. standard AR) | Target Resolution |
| :--- | :--- | :--- | :--- |
| BitDance-14B-16x | 16 | ~8x | 512px & 1024px |
| **BitDance-14B-64x** | **64** | **~30x** | **1024px** |
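
The relationship between tokens per step and the number of decoding steps can be sketched as follows. The token count `NUM_TOKENS` is a hypothetical placeholder chosen only for illustration; the actual number of tokens per image is not stated here. The function name `decoding_steps` is likewise an assumption.

```python
def decoding_steps(num_tokens, tokens_per_step):
    """Number of sequential forward steps needed to emit `num_tokens` tokens."""
    if num_tokens < 0 or tokens_per_step <= 0:
        raise ValueError("num_tokens must be >= 0 and tokens_per_step must be > 0")
    # Integer ceiling division: a final partial group still costs one step.
    return -(-num_tokens // tokens_per_step)


if __name__ == "__main__":
    NUM_TOKENS = 4096  # hypothetical token count for one image, not from the model card

    baseline = decoding_steps(NUM_TOKENS, 1)
    for name, tokens_per_step in [
        ("standard AR", 1),
        ("BitDance-14B-16x", 16),
        ("BitDance-14B-64x", 64),
    ]:
        steps = decoding_steps(NUM_TOKENS, tokens_per_step)
        print(f"{name:>18}: {steps:5d} steps ({baseline // steps}x fewer)")
```

The reduction in step count (16x and 64x) is an upper bound on wall-clock speedup. The ~8x and ~30x figures in the table are lower, which would be consistent with each parallel step costing more than a single-token step.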
## Quick Start (Local Setup)
To run the model locally with the `diffusers` library:
```python
import torch
from diffusers import DiffusionPipeline

# Load the checkpoint together with its custom pipeline code from the Hub.
pipe = DiffusionPipeline.from_pretrained(
    "shallowdream204/BitDance-14B-64x",
    custom_pipeline="shallowdream204/BitDance-14B-64x",
    torch_dtype=torch.bfloat16,
).to("cuda")

prompt = "A cinematic portrait of a futuristic explorer in a neon-lit cyberpunk city, ultra-detailed, 8k."

# Generate one 1024x1024 image and save it to disk.
image = pipe(prompt=prompt, height=1024, width=1024).images[0]
image.save("output.png")
```