Imosu/My-Image-Space-storage/hf_models/BitDance-14B-16x-diffusers

67 GB

145 files

Updated 15 days ago

Name	Size	Uploaded	Xet hash
.cache		15 days ago
autoencoder		15 days ago
bitdance_diffusers		15 days ago
diffusion_head		15 days ago
projector		15 days ago
text_encoder		15 days ago
tokenizer		15 days ago
.gitattributes	1.58 kB xet	15 days ago	7e06b6bf
README.md	3.91 kB xet	15 days ago	d7c46f22
bitdance_14b_16x.png	1.38 MB xet	15 days ago	975f0438
model_index.json	511 Bytes xet	15 days ago	9c2a167a
pipeline.py	174 Bytes xet	15 days ago	a9548894
test_bitdance.py	1.02 kB xet	15 days ago	786e7f82

README.md

BitDance-14B-16x (Diffusers)

Diffusers-converted checkpoint for BitDance-14B-16x with bundled custom pipeline code (bitdance_diffusers) so it can be loaded directly with DiffusionPipeline.

Quickstart (native diffusers)

import torch
from diffusers import DiffusionPipeline

# Hub repo ID shown here; pointing model_path at a local copy of the repo
# is recommended instead (no trust_remote_code needed)
model_path = "BiliSakura/BitDance-14B-16x-diffusers"
pipe = DiffusionPipeline.from_pretrained(
    model_path,
    custom_pipeline=model_path,
    torch_dtype=torch.bfloat16,
).to("cuda")

result = pipe(
    prompt="A close-up portrait in a cinematic photography style, capturing a girl-next-door look on a sunny daytime urban street. She wears a khaki sweater, with long, flowing hair gently draped over her shoulders. Her head is turned slightly, revealing soft facial features illuminated by realistic, delicate sunlight coming from the left. The sunlight subtly highlights individual strands of her hair. The image has a Canon film-like color tone, evoking a warm nostalgic atmosphere.",
    height=1024,
    width=1024,
    num_inference_steps=50,
    guidance_scale=7.5,
    show_progress_bar=True,
)
result.images[0].save("bitdance_14b_16x.png")

Running Tests

Run tests from the model directory in your active Python environment:

python test_bitdance.py

VRAM Usage by Resolution

Measured on NVIDIA A100-SXM4-80GB using:

dtype=torch.bfloat16
num_inference_steps=30
guidance_scale=7.5
prompt: A cinematic landscape photo of snowy mountains at sunrise.

Resolution	Peak Allocated VRAM (GiB)	Peak Reserved VRAM (GiB)	Time (s)	Status
512x512	32.67	33.47	13.71	ok
1024x1024	35.51	38.76	54.47	ok
1280x768	35.28	38.34	50.97	ok
768x1280	35.28	38.34	51.22	ok
1536x640	35.28	38.34	51.29	ok
2048x512	35.51	38.76	54.61	ok
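
The rows above suggest that peak VRAM and runtime follow total pixel count rather than aspect ratio. A minimal sketch that checks this from the table's own numbers; the helper names (`pixels`, `group_by_pixels`, `seconds_per_megapixel`) are illustrative and not part of this repository:

```python
from collections import defaultdict

# Rows copied from the table above:
# (resolution, peak allocated GiB, peak reserved GiB, time in seconds)
ROWS = [
    ("512x512", 32.67, 33.47, 13.71),
    ("1024x1024", 35.51, 38.76, 54.47),
    ("1280x768", 35.28, 38.34, 50.97),
    ("768x1280", 35.28, 38.34, 51.22),
    ("1536x640", 35.28, 38.34, 51.29),
    ("2048x512", 35.51, 38.76, 54.61),
]


def pixels(resolution: str) -> int:
    """Total pixel count of an 'AxB' resolution string (order does not matter)."""
    a, b = (int(v) for v in resolution.split("x"))
    return a * b


def group_by_pixels(rows):
    """Map pixel count -> set of (allocated, reserved) VRAM pairs seen at that count."""
    groups = defaultdict(set)
    for resolution, allocated, reserved, _seconds in rows:
        groups[pixels(resolution)].add((allocated, reserved))
    return dict(groups)


def seconds_per_megapixel(rows):
    """Map resolution -> runtime divided by megapixels generated."""
    return {res: secs / (pixels(res) / 1e6) for res, _a, _r, secs in rows}


if __name__ == "__main__":
    for count, vram in sorted(group_by_pixels(ROWS).items()):
        print(count, sorted(vram))
    for res, rate in seconds_per_megapixel(ROWS).items():
        print(f"{res}: {rate:.1f} s/MP")
```

In these measurements, every resolution with the same pixel count reports identical peak VRAM, and runtime works out to roughly 52 s per megapixel at 30 steps.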

Model Metadata

Pipeline class: BitDanceDiffusionPipeline
Diffusers version in config: 0.36.0
Parallel prediction factor: 16
Text stack: Qwen3ForCausalLM + Qwen2TokenizerFast
Supported resolutions include 1024x1024, 1280x768, 768x1280, 2048x512, and more (see model_index.json)
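
To check these values against a downloaded copy, the metadata can be read directly from model_index.json. A minimal sketch, assuming the standard Diffusers keys `_class_name` and `_diffusers_version`; the helper names are hypothetical, and the key that stores the supported resolutions is not guessed here, so the remaining top-level keys are simply listed:

```python
import json
from pathlib import Path


def load_model_index(model_dir: str) -> dict:
    """Read model_index.json from a local copy of the repository."""
    path = Path(model_dir) / "model_index.json"
    return json.loads(path.read_text(encoding="utf-8"))


def summarize_model_index(index: dict) -> dict:
    """Extract the pipeline class, Diffusers version, and remaining top-level keys."""
    return {
        "pipeline_class": index.get("_class_name"),
        "diffusers_version": index.get("_diffusers_version"),
        # Everything not prefixed with "_" (components and any custom settings).
        "other_keys": sorted(k for k in index if not k.startswith("_")),
    }
```

Calling `summarize_model_index(load_model_index(model_dir))` on a local copy should report the pipeline class and Diffusers version listed above, if the file matches this README.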

Citation

If you use this model, please cite BitDance and Diffusers:

@article{ai2026bitdance,
  title   = {BitDance: Scaling Autoregressive Generative Models with Binary Tokens},
  author  = {Ai, Yuang and Han, Jiaming and Zhuang, Shaobin and Hu, Xuefeng and Yang, Ziyan and Yang, Zhenheng and Huang, Huaibo and Yue, Xiangyu and Chen, Hao},
  journal = {arXiv preprint arXiv:2602.14041},
  year    = {2026}
}

@inproceedings{von-platen-etal-2022-diffusers,
  title     = {Diffusers: State-of-the-art diffusion models},
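  author    = {Patrick von Platen and Suraj Patil and Anton Lozhkov and Pedro Cuenca and Nathan Lambert and Kashif Rasul and Mishig Davaadorj and Dhruv Nair and Sayak Paul and William Berman and Yiyi Xu and Steven Liu and Thomas Wolf},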
  booktitle = {GitHub repository},
  year      = {2022},
  url       = {https://github.com/huggingface/diffusers}
}

License

This repository is distributed under the Apache-2.0 license, consistent with the upstream BitDance release.

Last updated: May 13