title: StreamDiffusionV2 Realtime
emoji: 🌀
colorFrom: indigo
colorTo: pink
sdk: gradio
sdk_version: 6.10.0
app_file: app.py
python_version: '3.10'
short_description: Realtime webcam video diffusion (StreamDiffusionV2 / Wan2.1)
startup_duration_timeout: 1h
pinned: false
StreamDiffusionV2 · Realtime Webcam Diffusion
A live ZeroGPU demo of StreamDiffusionV2 (MLSys 2026 Best Paper) running on Wan2.1-T2V-1.3B, with a custom gradio.Server frontend.
Unlike a fixed-length generator, StreamDiffusionV2 is designed for continuous streaming. It combines a causal Diffusion Transformer with a sink-token-guided rolling KV cache, a motion-aware noise controller, and StreamVAE. Your webcam feed is streamed through the model prompt-by-prompt, and the stylized result flows back live, without the window-shift burst that fixed-horizon models show.
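To make the rolling-cache idea concrete, here is a minimal toy sketch in plain Python. It is not StreamDiffusionV2's code: the class name `SinkRollingCache`, the `num_sink` and `window` parameters, and the rule of pinning the first few entries as sinks are all assumptions made for illustration. The real model caches key/value tensors and may choose its sink tokens differently.

```python
from collections import deque


class SinkRollingCache:
    """Toy rolling cache (hypothetical): the first `num_sink` entries are
    pinned as sinks, everything after them lives in a fixed-size window."""

    def __init__(self, num_sink: int, window: int):
        if num_sink < 0 or window <= 0:
            raise ValueError("num_sink must be >= 0 and window must be > 0")
        self.num_sink = num_sink
        self.sink = []                      # pinned entries, never evicted
        self.recent = deque(maxlen=window)  # oldest entry drops automatically

    def append(self, entry):
        # Fill the sink slots first; later entries go to the rolling window.
        if len(self.sink) < self.num_sink:
            self.sink.append(entry)
        else:
            self.recent.append(entry)

    def entries(self):
        # What attention would see: sinks first, then the recent window.
        return self.sink + list(self.recent)


cache = SinkRollingCache(num_sink=2, window=3)
for t in range(8):
    cache.append(f"kv{t}")  # strings stand in for per-chunk key/value tensors
print(cache.entries())  # ['kv0', 'kv1', 'kv5', 'kv6', 'kv7']
```

The cache size stays bounded however long the stream runs, which is what lets a stream continue indefinitely instead of restarting at a fixed horizon.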
The browser captures the webcam and posts frames to a lightweight FastAPI route;
a held @spaces.GPU session runs StreamDiffusionV2's single-GPU streaming loop
(start_stream_session → run_stream_batch) and streams frames back over the
Gradio JS client, paced by a client jitter buffer.
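The pacing step can be pictured with a small jitter-buffer sketch. The demo's actual buffer runs in the browser and is not reproduced here; the Python below is only an illustration, and the class name `JitterBuffer`, the `min_depth` and `max_depth` parameters, and the repeat-last-frame behaviour on underrun are assumptions rather than the Space's real logic.

```python
from collections import deque


class JitterBuffer:
    """Toy playout buffer (hypothetical): wait until `min_depth` frames have
    arrived, then release one frame per display tick."""

    def __init__(self, min_depth: int = 3, max_depth: int = 12):
        self.min_depth = min_depth
        self.frames = deque(maxlen=max_depth)  # drop oldest if producer runs ahead
        self.playing = False
        self.last = None

    def push(self, frame):
        # Called whenever a frame arrives from the server.
        self.frames.append(frame)
        if len(self.frames) >= self.min_depth:
            self.playing = True

    def tick(self):
        # Called once per display interval; returns the frame to show.
        if self.playing and self.frames:
            self.last = self.frames.popleft()
        elif not self.frames:
            self.playing = False  # underrun: rebuffer and keep the last frame
        return self.last


buf = JitterBuffer(min_depth=2)
shown = []
buf.push("f0")
shown.append(buf.tick())  # still buffering, nothing to show yet
buf.push("f1")
shown.append(buf.tick())
shown.append(buf.tick())
shown.append(buf.tick())  # underrun, the last frame is repeated
print(shown)  # [None, 'f0', 'f1', 'f1']
```

Buffering a few frames before playback trades a small amount of latency for smoother output when frames arrive in bursts.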
This Space is one of three rolling/streaming demos:
- StreamDiffusionV2 (this) — video-to-video webcam.
- LongLive — interactive long text-to-video.
- Rolling Forcing — real-time multi-minute text-to-video.