metadata
title: StreamDiffusionV2 Realtime
emoji: 🌀
colorFrom: indigo
colorTo: pink
sdk: gradio
sdk_version: 6.10.0
app_file: app.py
python_version: '3.10'
short_description: Realtime webcam video diffusion (StreamDiffusionV2 / Wan2.1)
startup_duration_timeout: 1h
pinned: false

StreamDiffusionV2 · Realtime Webcam Diffusion

Live ZeroGPU demo of StreamDiffusionV2 (MLSys 2026 Best Paper) on Wan2.1-T2V-1.3B, with a custom gradio.Server frontend.

Unlike a fixed-length generator, StreamDiffusionV2 is designed for continuous streaming. It combines a causal Diffusion Transformer with a sink-token-guided rolling KV cache, a motion-aware noise controller, and StreamVAE. Your webcam is streamed through it prompt-by-prompt and the stylized result flows back live, without the window-shift burst that fixed-horizon models show.
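To make the "sink-token-guided rolling KV cache" idea concrete, here is a minimal sketch of the general eviction pattern: the first few entries (the sinks) are kept forever, while everything after them lives in a fixed-size window that drops its oldest entry. This is an illustration under assumptions, not StreamDiffusionV2's actual cache: the class name `SinkRollingCache`, its parameters, and the use of plain integers in place of key/value tensors are all hypothetical.

```python
from collections import deque


class SinkRollingCache:
    """Toy cache: permanent sink entries plus a rolling window of recent ones.

    Hypothetical illustration only; real KV caches store per-layer tensors.
    """

    def __init__(self, num_sink: int, window: int):
        if num_sink < 0 or window <= 0:
            raise ValueError("num_sink must be >= 0 and window must be > 0")
        self.num_sink = num_sink
        self.sink = []                       # never evicted once filled
        self.recent = deque(maxlen=window)   # oldest entry drops automatically

    def append(self, entry):
        # The first `num_sink` entries become sinks; the rest roll.
        if len(self.sink) < self.num_sink:
            self.sink.append(entry)
        else:
            self.recent.append(entry)

    def entries(self):
        # What attention would see: sinks first, then the recent window.
        return self.sink + list(self.recent)


cache = SinkRollingCache(num_sink=2, window=3)
for t in range(8):
    cache.append(t)
print(cache.entries())  # [0, 1, 5, 6, 7]
```

Because the cache size is bounded by `num_sink + window`, memory stays constant no matter how long the stream runs, which is the property a continuous webcam stream needs.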

The browser captures the webcam and posts frames to a lightweight FastAPI route; a held @spaces.GPU session runs StreamDiffusionV2's single-GPU streaming loop (start_stream_session, then run_stream_batch) and streams frames back over the Gradio JS client, paced by a client jitter buffer.
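The pacing role of a jitter buffer can be sketched as follows. The demo's real buffer runs in the browser, so this Python version is only an illustration of the general technique; `JitterBuffer`, `target_depth`, and `max_depth` are hypothetical names, not taken from the Space's code. Frames arrive in bursts, playback waits until a small backlog exists, and the oldest frames are dropped if the backlog grows too large.

```python
from collections import deque


class JitterBuffer:
    """Toy jitter buffer: smooths bursty arrivals into steady playback.

    Hypothetical sketch; not the demo's actual client code.
    """

    def __init__(self, target_depth: int, max_depth: int):
        if not 1 <= target_depth <= max_depth:
            raise ValueError("need 1 <= target_depth <= max_depth")
        self.target_depth = target_depth  # backlog required before playback starts
        self.max_depth = max_depth        # cap on backlog, bounds added latency
        self.frames = deque()
        self.primed = False

    def push(self, frame):
        """Called whenever a frame arrives from the server."""
        self.frames.append(frame)
        while len(self.frames) > self.max_depth:
            self.frames.popleft()         # drop oldest to stay close to live
        if len(self.frames) >= self.target_depth:
            self.primed = True

    def pop(self):
        """Called once per display tick; returns a frame or None while buffering."""
        if not self.primed:
            return None
        if not self.frames:
            self.primed = False           # underrun: rebuffer before resuming
            return None
        return self.frames.popleft()


buf = JitterBuffer(target_depth=2, max_depth=4)
buf.push("f0")
print(buf.pop())  # None (still buffering)
buf.push("f1")
print(buf.pop())  # f0
```

The trade-off is between latency and smoothness: a larger `target_depth` absorbs more network jitter but delays the stylized output further behind the webcam.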

One of three rolling/streaming demos:

  • StreamDiffusionV2 (this) — video-to-video webcam.
  • LongLive — interactive long text-to-video.
  • Rolling Forcing — real-time multi-minute text-to-video.