Sana 0.6B: ONNX for In-Browser WebGPU Inference
Generate 1024x1024 to 4096x4096 images entirely in the browser using WebGPU.
Requirements
- GPU with 4+ GB VRAM (NVIDIA, AMD, or Apple Silicon)
- Chrome, Edge, or Firefox with WebGPU support
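Before downloading any model files, a page can check whether the browser exposes WebGPU at all. The sketch below is one possible way to do that, not code from this repository: `webGpuExposed` and `hasWebGpuAdapter` are hypothetical helper names, and the small interfaces stand in for the official WebGPU type definitions. Only `navigator.gpu` and `requestAdapter()` come from the WebGPU API itself. Note that this check says nothing about available VRAM.

```typescript
// Minimal structural types so the sketch compiles without extra type packages.
interface GpuLike {
  requestAdapter(): Promise<object | null>;
}
interface NavLike {
  gpu?: GpuLike;
}

// Reads the global navigator if the runtime has one (hypothetical helper).
function globalNavigator(): NavLike | undefined {
  return (globalThis as unknown as { navigator?: NavLike }).navigator;
}

// Synchronous check: is the WebGPU entry point exposed at all?
// `nav` can be injected for testing; it defaults to the global navigator.
function webGpuExposed(nav?: NavLike): boolean {
  const n = nav ?? globalNavigator();
  return n !== undefined && n.gpu !== undefined;
}

// Asynchronous check: can an adapter actually be obtained?
// requestAdapter() resolves to null when no suitable GPU is available.
async function hasWebGpuAdapter(nav?: NavLike): Promise<boolean> {
  const n = nav ?? globalNavigator();
  if (n === undefined || n.gpu === undefined) return false;
  const adapter = await n.gpu.requestAdapter();
  return adapter !== null;
}
```

In a page, `await hasWebGpuAdapter()` could gate the model download and show a fallback message when it resolves to `false`.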
Models
| Component | File | Size | Precision |
|---|---|---|---|
| CLIP text encoder | onnx-community/clip-vit-large-patch14-ONNX | 432 MB | uint8 |
| DiT 1024 | 1024/sana_dit_1024.onnx + .data | 2.3 GB | float32 |
| DiT 2048 | 2048/sana_dit_2048.onnx + .data | 2.3 GB | float32 |
| DiT 4096 | 4096/sana_dit_4096.onnx + .data | 2.3 GB | float32 |
| VAE 1024 | 1024/sana_vae_1024.onnx + .data | 608 MB | float32 |
| VAE 2048 | 2048/sana_vae_2048.onnx + .data | 608 MB | float32 |
| VAE 4096 | 4096/sana_vae_4096.onnx + .data | 608 MB | float32 |
Note: The DiT must run in float32, because Sana's linear attention produces NaN values in fp16.
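Each resolution needs the shared CLIP text encoder plus one DiT and one VAE from the table above. The following sketch, with hypothetical helper names `modelPaths` and `downloadMb`, builds the `.onnx` paths for a chosen resolution and estimates the total download. It assumes the listed sizes already include the external `.data` files and treats 2.3 GB as 2300 MB; both are assumptions, so the result is only approximate.

```typescript
type Resolution = 1024 | 2048 | 4096;

// Sizes in MB as listed in the table; 2.3 GB is taken as 2300 MB (assumption).
const CLIP_MB = 432;
const DIT_MB = 2300;
const VAE_MB = 608;

// Paths of the .onnx graph files for one resolution, following the
// naming pattern in the table. The external .data files are not listed here.
function modelPaths(res: Resolution): { dit: string; vae: string } {
  return {
    dit: `${res}/sana_dit_${res}.onnx`,
    vae: `${res}/sana_vae_${res}.onnx`,
  };
}

// Approximate download for a single resolution:
// text encoder + one DiT + one VAE. The table lists the same sizes
// for every resolution, so the estimate does not depend on `res`.
function downloadMb(): number {
  return CLIP_MB + DIT_MB + VAE_MB;
}
```

Under these assumptions a single resolution comes to roughly 3.3 GB of downloads, which is worth surfacing to users before the fetch starts.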
Demo