Sana 0.6B: ONNX for In-Browser WebGPU Inference

Generate 1024x1024 to 4096x4096 images entirely in the browser using WebGPU.

Requirements

  • GPU with 4+ GB VRAM (NVIDIA, AMD, or Apple Silicon)
  • Chrome, Edge, or Firefox with WebGPU support
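
A page can check for WebGPU support before it starts any large download. The sketch below is a minimal illustration, not code from this repository: `checkWebGpu`, `GpuLike`, and `AdapterLike` are hypothetical names, and the small interfaces only mirror the parts of the standard `navigator.gpu` API that the check touches. WebGPU does not report VRAM size, so the 4+ GB requirement cannot be verified this way; the adapter limits are returned only as a rough hint.

```typescript
// Structural stand-ins for the browser's GPU and GPUAdapter objects, so the
// sketch compiles without the @webgpu/types package (hypothetical names).
interface AdapterLike {
  limits: { maxBufferSize: number; maxStorageBufferBindingSize: number };
}
interface GpuLike {
  requestAdapter(): Promise<AdapterLike | null>;
}

interface WebGpuCheck {
  ok: boolean;
  reason: string;
}

// Returns ok=false with a readable reason when WebGPU is unavailable.
async function checkWebGpu(gpu: GpuLike | undefined): Promise<WebGpuCheck> {
  if (!gpu) {
    return { ok: false, reason: "navigator.gpu is undefined in this browser" };
  }
  const adapter = await gpu.requestAdapter();
  if (!adapter) {
    return { ok: false, reason: "no suitable GPU adapter was returned" };
  }
  // VRAM is not exposed by WebGPU; buffer limits are only an indirect hint.
  const { maxBufferSize, maxStorageBufferBindingSize } = adapter.limits;
  return {
    ok: true,
    reason: `maxBufferSize=${maxBufferSize}, maxStorageBufferBindingSize=${maxStorageBufferBindingSize}`,
  };
}

// In a browser: checkWebGpu((navigator as any).gpu).then(r => console.log(r));
```

Passing the `gpu` object in as an argument, rather than reading `navigator` inside the function, keeps the check usable outside a browser, for example in unit tests.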

Models

Component          File                                         Size    Precision
CLIP text encoder  onnx-community/clip-vit-large-patch14-ONNX   432 MB  uint8
DiT 1024           1024/sana_dit_1024.onnx + .data              2.3 GB  float32
DiT 2048           2048/sana_dit_2048.onnx + .data              2.3 GB  float32
DiT 4096           4096/sana_dit_4096.onnx + .data              2.3 GB  float32
VAE 1024           1024/sana_vae_1024.onnx + .data              608 MB  float32
VAE 2048           2048/sana_vae_2048.onnx + .data              608 MB  float32
VAE 4096           4096/sana_vae_4096.onnx + .data              608 MB  float32
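
The file layout above follows a regular pattern per resolution, which a small helper can capture. This is a sketch under assumptions: `modelFilesFor` and its return shape are hypothetical, the download estimate treats 2.3 GB as 2300 MB, and the exact name of each external `.data` file is not spelled out in the table, so only the `.onnx` graph paths are built.

```typescript
type Resolution = 1024 | 2048 | 4096;

interface ModelFiles {
  clipRepo: string;         // separate repository hosting the text encoder
  dit: string;              // DiT graph file (its .data sidecar is not named here)
  vae: string;              // VAE graph file (its .data sidecar is not named here)
  approxDownloadMB: number; // rough total for one resolution
}

// Sizes from the table, in MB. Assumption: 2.3 GB is taken as 2300 MB.
const CLIP_MB = 432;
const DIT_MB = 2300;
const VAE_MB = 608;

// Builds the per-resolution paths listed in the table (hypothetical helper).
function modelFilesFor(res: Resolution): ModelFiles {
  return {
    clipRepo: "onnx-community/clip-vit-large-patch14-ONNX",
    dit: `${res}/sana_dit_${res}.onnx`,
    vae: `${res}/sana_vae_${res}.onnx`,
    approxDownloadMB: CLIP_MB + DIT_MB + VAE_MB,
  };
}

const files = modelFilesFor(2048);
console.log(files.dit, files.vae, `${files.approxDownloadMB} MB`);
```

By this estimate, one resolution needs roughly 3.3 GB of downloads in total (text encoder, DiT, and VAE), whichever resolution is chosen.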

Note: DiT must be float32; Sana's linear attention produces NaN in fp16.
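
One plausible way such NaNs can arise is range overflow. This mechanism is an assumption for illustration, not something this card states. Linear attention divides one accumulated sum by another, and fp16 cannot represent finite values above 65504. If both sums overflow to Infinity, their ratio is NaN. The toy sketch below models only the overflow, not fp16 rounding, and every name and input value in it is hypothetical.

```typescript
const F16_MAX = 65504; // largest finite half-precision value

// Crude fp16 model: anything beyond the finite range becomes +/-Infinity.
// Rounding of in-range values is deliberately ignored in this sketch.
function f16Overflow(x: number): number {
  if (Number.isNaN(x)) return x;
  return Math.abs(x) > F16_MAX ? Math.sign(x) * Infinity : x;
}

const identity = (x: number): number => x;

// Running sum in which every partial result passes through `clamp`.
function accumulate(values: number[], clamp: (x: number) => number): number {
  let acc = 0;
  for (const v of values) acc = clamp(acc + v);
  return acc;
}

// Toy one-channel normalized output: sum(score * value) / sum(score).
function normalizedOutput(
  scores: number[],
  values: number[],
  clamp: (x: number) => number,
): number {
  const numerator = accumulate(scores.map((s, i) => clamp(s * values[i])), clamp);
  const denominator = accumulate(scores, clamp);
  return clamp(numerator / denominator);
}

// Made-up inputs: 1024 tokens with score 100 each, so both sums reach 102400.
const scores = new Array<number>(1024).fill(100);
const values = new Array<number>(1024).fill(1);

const wide = normalizedOutput(scores, values, identity);    // sums stay finite
const half = normalizedOutput(scores, values, f16Overflow); // Infinity / Infinity
console.log(wide, half);
```

With these inputs the unclamped path returns 1, while the overflow-clamped path returns NaN, because both the numerator and the denominator saturate to Infinity before the division.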

Demo

bradAGI/web-stable-diffusion
