charantejapolavarapu committed · Commit 3e5066d · verified · Parent(s): db7c5b0

Update README.md

Files changed (1): README.md (+38 −1)
---
title: BitDance-14B-64x
emoji: 🚀
colorFrom: red
license: apache-2.0
short_description: Open-source autoregressive model with binary visual tokens.
---
# 🚀 BitDance-14B-64x

BitDance is a scalable autoregressive (AR) foundation model with **14 billion parameters**. It introduces a novel approach to image generation by predicting **binary visual tokens** instead of standard codebook indices.

## 🌟 Key Features

- **Binary Visual Tokenizer:** Scales token entropy to $2^{256}$ states, providing a highly expressive yet compact discrete representation.
- **Binary Diffusion Head:** Replaces standard categorical classification with continuous-space diffusion for high-precision sampling in massive discrete spaces.
- **Next-Patch Diffusion:** A parallel decoding paradigm that predicts up to **64 tokens per step**, achieving a ~30x speedup over traditional AR models at 1024x1024 resolution.
- **Multimodal Foundation:** Trained on large-scale multimodal data, excelling in prompt adherence, spatial reasoning, and high-fidelity photorealistic rendering.
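To make the "binary visual token" idea concrete, here is a toy sketch: each token is a length-256 vector of bits, i.e. one of $2^{256}$ possible states, and packs losslessly into 32 bytes. This is an illustration only, with hypothetical `pack_token`/`unpack_token` helpers; it is not BitDance's actual tokenizer API.

```python
import numpy as np

# Illustration only: a "binary visual token" with 256 binary channels
# is one point in a 2**256-state discrete space.
BITS_PER_TOKEN = 256

def pack_token(bits: np.ndarray) -> bytes:
    """Pack a {0,1} vector of length 256 into 32 bytes."""
    assert bits.shape == (BITS_PER_TOKEN,)
    return np.packbits(bits.astype(np.uint8)).tobytes()

def unpack_token(data: bytes) -> np.ndarray:
    """Inverse of pack_token: 32 bytes back to a length-256 bit vector."""
    return np.unpackbits(np.frombuffer(data, dtype=np.uint8))

rng = np.random.default_rng(0)
token = rng.integers(0, 2, size=BITS_PER_TOKEN)
packed = pack_token(token)
assert len(packed) == 32                      # 256 bits -> 32 bytes
assert (unpack_token(packed) == token).all()  # lossless round trip
```

Compare this with a conventional VQ codebook of, say, 16,384 entries: a binary token carries vastly more states per position, which is why the model pairs it with a diffusion head instead of a softmax over indices.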

## 🛠️ Performance

| Model | Tokens/Step | Speedup (vs. standard AR) | Target Resolution |
| :--- | :--- | :--- | :--- |
| BitDance-14B-16x | 16 | ~8x | 512px & 1024px |
| **BitDance-14B-64x** | **64** | **~30x** | **1024px** |
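To see what the Tokens/Step column buys, here is a back-of-the-envelope sketch. It assumes, purely for illustration, that a 1024x1024 image maps to a 64x64 grid of 4,096 visual tokens; the real grid size depends on the tokenizer's downsampling factor, which is not stated here.

```python
import math

def decode_steps(total_tokens: int, tokens_per_step: int) -> int:
    """AR decoding steps when `tokens_per_step` tokens are emitted in parallel."""
    return math.ceil(total_tokens / tokens_per_step)

total = 64 * 64                 # assumed token count for a 1024px image
print(decode_steps(total, 1))   # standard next-token AR: 4096 steps
print(decode_steps(total, 16))  # BitDance-14B-16x: 256 steps
print(decode_steps(total, 64))  # BitDance-14B-64x: 64 steps
```

Note that the table's wall-clock speedups (~8x, ~30x) are sub-linear in tokens/step: predicting more tokens in parallel also changes the cost of each step, so step count alone overstates the gain.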

## 🚀 Quick Start (Local Setup)

To run the model locally with the `diffusers` library:

```python
import torch
from diffusers import DiffusionPipeline

# Load BitDance with its custom pipeline code in bfloat16 precision
pipe = DiffusionPipeline.from_pretrained(
    "shallowdream204/BitDance-14B-64x",
    custom_pipeline="shallowdream204/BitDance-14B-64x",
    torch_dtype=torch.bfloat16,
).to("cuda")

prompt = "A cinematic portrait of a futuristic explorer in a neon-lit cyberpunk city, ultra-detailed, 8k."
image = pipe(prompt=prompt, height=1024, width=1024).images[0]
image.save("output.png")
```