---
title: StreamDiffusionV2 Realtime
emoji: 🌀
colorFrom: indigo
colorTo: pink
sdk: gradio
sdk_version: 6.10.0
app_file: app.py
python_version: "3.10"
short_description: Realtime webcam video diffusion (StreamDiffusionV2 / Wan2.1)
startup_duration_timeout: 1h
pinned: false
---

# StreamDiffusionV2 · Realtime Webcam Diffusion

Live ZeroGPU demo of [**StreamDiffusionV2**](https://streamdiffusionv2.github.io/)
(MLSys 2026 Best Paper) on **Wan2.1-T2V-1.3B**, with a custom `gradio.Server`
frontend.

Unlike a fixed-length generator, StreamDiffusionV2 is **designed for continuous
streaming**. It combines a causal Diffusion-Transformer with a **sink-token-guided
rolling KV cache**, a motion-aware noise controller, and StreamVAE. Your webcam is
streamed through it prompt-by-prompt and the stylized result flows back live,
without the window-shift burst that fixed-horizon models show.
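
To make the rolling-cache idea concrete, here is a minimal sketch of a cache that pins a few initial "sink" entries and rolls over the most recent ones. It is an illustration only, not StreamDiffusionV2's implementation: the class name `SinkRollingCache` and the parameters `num_sink` and `window` are hypothetical, and plain integers stand in for the per-layer key/value tensors a real model would store.

```python
from collections import deque
from typing import Deque, Generic, List, TypeVar

T = TypeVar("T")


class SinkRollingCache(Generic[T]):
    """Keep the first `num_sink` entries forever, plus the latest `window` entries.

    Hypothetical illustration: entries stand in for per-step key/value tensors.
    """

    def __init__(self, num_sink: int, window: int) -> None:
        if num_sink < 0 or window < 1:
            raise ValueError("num_sink must be >= 0 and window must be >= 1")
        self.num_sink = num_sink
        self.sink: List[T] = []                      # pinned, never evicted
        self.recent: Deque[T] = deque(maxlen=window)  # rolling window

    def append(self, entry: T) -> None:
        # Fill the sink slots first; afterwards every new entry enters the
        # rolling window, and the deque silently evicts the oldest one.
        if len(self.sink) < self.num_sink:
            self.sink.append(entry)
        else:
            self.recent.append(entry)

    def entries(self) -> List[T]:
        # What attention would see: sink entries followed by the recent window.
        return self.sink + list(self.recent)

    def __len__(self) -> int:
        return len(self.sink) + len(self.recent)


cache: SinkRollingCache[int] = SinkRollingCache(num_sink=2, window=3)
for step in range(8):
    cache.append(step)
print(cache.entries())  # [0, 1, 5, 6, 7]
```

The cache size stays bounded at `num_sink + window` no matter how long the stream runs, which is what allows generation to continue indefinitely instead of restarting at a fixed horizon.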

The browser captures the webcam and posts frames to a lightweight FastAPI route;
a held `@spaces.GPU` session runs StreamDiffusionV2's single-GPU streaming loop
(`start_stream_session` → `run_stream_batch`) and streams frames back over the
Gradio JS client, paced by a client jitter buffer.
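
The jitter buffer mentioned above can be sketched as follows. This is a hedged illustration of the pacing logic, not the Space's actual client code, which runs in the browser: the class name `JitterBuffer` and the parameters `target_depth` and `max_depth` are assumptions, and it is written in Python only to match the rest of the project.

```python
from collections import deque
from typing import Deque, Optional


class JitterBuffer:
    """Hold a few frames before playback so bursty arrivals display steadily.

    Hypothetical sketch: `target_depth` frames must be queued before playback
    starts (or restarts), and `max_depth` bounds the added latency.
    """

    def __init__(self, target_depth: int = 3, max_depth: int = 12) -> None:
        if not 1 <= target_depth <= max_depth:
            raise ValueError("require 1 <= target_depth <= max_depth")
        self.target_depth = target_depth
        self.max_depth = max_depth
        self._frames: Deque[bytes] = deque()
        self._playing = False

    def push(self, frame: bytes) -> None:
        """Called whenever a generated frame arrives from the server."""
        self._frames.append(frame)
        # If the producer runs ahead, drop the oldest frames to cap latency.
        while len(self._frames) > self.max_depth:
            self._frames.popleft()
        if len(self._frames) >= self.target_depth:
            self._playing = True

    def pop(self) -> Optional[bytes]:
        """Called once per display tick; returns None while buffering."""
        if not self._playing:
            return None
        if not self._frames:
            # Underrun: pause and wait until the buffer refills.
            self._playing = False
            return None
        return self._frames.popleft()


buf = JitterBuffer(target_depth=2, max_depth=3)
buf.push(b"frame-0")
print(buf.pop())  # None: still buffering
buf.push(b"frame-1")
print(buf.pop())  # b'frame-0'
```

A display loop would call `pop()` at a fixed frame interval and keep showing the previous frame whenever it returns `None`. A larger `target_depth` smooths more network jitter at the cost of extra latency.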

One of three rolling/streaming demos:

- StreamDiffusionV2 (this): video-to-video webcam.
- LongLive: interactive long text-to-video.
- Rolling Forcing: real-time multi-minute text-to-video.