---
title: StreamDiffusionV2 Realtime
emoji: 🌀
colorFrom: indigo
colorTo: pink
sdk: gradio
sdk_version: 6.10.0
app_file: app.py
python_version: "3.10"
short_description: Realtime webcam video diffusion (StreamDiffusionV2 / Wan2.1)
startup_duration_timeout: 1h
pinned: false
---

# StreamDiffusionV2 · Realtime Webcam Diffusion

Live ZeroGPU demo of [**StreamDiffusionV2**](https://streamdiffusionv2.github.io/)
(MLSys 2026 Best Paper) on **Wan2.1-T2V-1.3B**, with a custom `gradio.Server`
frontend.

Unlike a fixed-length generator, StreamDiffusionV2 is **designed for continuous
streaming**: a causal Diffusion-Transformer with a **sink-token-guided rolling KV
cache**, a motion-aware noise controller, and StreamVAE. Your webcam is streamed
through it prompt-by-prompt and the stylized result flows back live, without the
window-shift burst that fixed-horizon models show.
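
To make the rolling-cache idea concrete, here is a minimal, dependency-free sketch of a sink-guided rolling cache. It is an illustration of the general technique only, not StreamDiffusionV2's implementation: the class name `SinkRollingCache` and the parameters `num_sink` and `window` are hypothetical, and plain integers stand in for per-token key/value tensors.

```python
from collections import deque


class SinkRollingCache:
    """Toy rolling cache: the first `num_sink` entries are pinned, the rest roll.

    Hypothetical illustration; real KV caches store key/value tensors per layer.
    """

    def __init__(self, num_sink: int, window: int) -> None:
        self.num_sink = num_sink
        self.sink = []                      # pinned "sink" entries, never evicted
        self.recent = deque(maxlen=window)  # rolling window, oldest entry drops first

    def append(self, entry) -> None:
        # Fill the sink slots first; afterwards every new entry rolls the window.
        if len(self.sink) < self.num_sink:
            self.sink.append(entry)
        else:
            self.recent.append(entry)

    def view(self) -> list:
        # What attention would see: the sink entries plus the most recent window.
        return self.sink + list(self.recent)


cache = SinkRollingCache(num_sink=2, window=3)
for t in range(8):
    cache.append(t)
print(cache.view())  # [0, 1, 5, 6, 7]
```

The cache size stays bounded at `num_sink + window` however long the stream runs, and eviction happens one entry at a time rather than by resetting a whole window.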

The browser captures the webcam and posts frames to a lightweight FastAPI route;
a held `@spaces.GPU` session runs StreamDiffusionV2's single-GPU streaming loop
(`start_stream_session`, then `run_stream_batch`) and streams frames back over the
Gradio JS client, paced by a client jitter buffer.
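
The pacing role of a jitter buffer can be sketched as follows. This is a generic playout buffer written in Python for readability, not the demo's client code (which runs in the browser): `JitterBuffer`, `target_depth`, and `max_depth` are hypothetical names, and strings stand in for decoded frames.

```python
from collections import deque


class JitterBuffer:
    """Hypothetical playout buffer: prebuffer, then release one frame per tick."""

    def __init__(self, target_depth: int, max_depth: int) -> None:
        self.target_depth = target_depth
        self.frames = deque(maxlen=max_depth)  # on overflow the oldest frame is dropped
        self.playing = False

    def push(self, frame) -> None:
        # Called whenever a frame arrives from the server, at irregular intervals.
        self.frames.append(frame)
        if len(self.frames) >= self.target_depth:
            self.playing = True

    def pop(self):
        # Called once per display tick; returns a frame, or None while (re)buffering.
        if not self.playing:
            return None
        if not self.frames:
            self.playing = False  # underrun: wait until target_depth is reached again
            return None
        return self.frames.popleft()


buf = JitterBuffer(target_depth=2, max_depth=8)
played = []
buf.push("f0")
played.append(buf.pop())  # still prebuffering
buf.push("f1")
played.append(buf.pop())
played.append(buf.pop())
played.append(buf.pop())  # underrun, buffer goes back to prebuffering
print(played)  # [None, 'f0', 'f1', None]
```

A larger `target_depth` absorbs more variation in frame arrival times at the cost of added latency, which is the usual trade-off when tuning such a buffer.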

One of three rolling/streaming demos:
- StreamDiffusionV2 (this) — video-to-video webcam.
- LongLive — interactive long text-to-video.
- Rolling Forcing — real-time multi-minute text-to-video.