---
arxiv: "2602.14381"
tags:
- video-generation
- vace
- real-time
- autoregressive
- diffusion
- wan
license: apache-2.0
---

# Adapting VACE for Real-Time Autoregressive Video Diffusion

This is the companion model card for the paper [Adapting VACE for Real-Time Autoregressive Video Diffusion](https://arxiv.org/abs/2602.14381).

## Overview

This work presents modifications to [VACE](https://github.com/ali-vilab/VACE) that enable real-time autoregressive generation. The original VACE system applies bidirectional attention across the full sequence, which is incompatible with streaming. The key change moves reference frames out of the diffusion latent space and into a parallel conditioning pathway, preserving the fixed chunk sizes and KV caching that autoregressive models require.
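The chunked setup described above can be illustrated with a toy sketch. Everything here (function names, the "reference encoder", the update rule, the shapes) is an illustrative stand-in, not the paper's actual implementation; the point is only that the denoised latents keep a fixed chunk shape because reference conditioning arrives through a separate pathway rather than being concatenated into the latents.

```python
import numpy as np

rng = np.random.default_rng(0)

CHUNK, DIM = 4, 8  # frames per chunk, feature dim (toy sizes)

def denoise_chunk(latents, cond, kv_cache):
    """Toy stand-in for denoising one autoregressive chunk.

    `latents` keeps a fixed (CHUNK, DIM) shape no matter how many
    reference frames exist, because the reference signal enters via
    `cond` (the parallel pathway) instead of enlarging the latents.
    """
    context = kv_cache[-1] if kv_cache else np.zeros(DIM)
    out = latents + 0.1 * cond + 0.01 * context  # placeholder update
    kv_cache.append(out.mean(axis=0))            # cache a summary "KV" state
    return out

def generate(num_chunks, reference):
    # Reference frames are encoded once into the conditioning pathway.
    cond = reference.mean(axis=0)                # toy "reference encoder"
    kv_cache, chunks = [], []
    for _ in range(num_chunks):
        noise = rng.standard_normal((CHUNK, DIM))
        chunks.append(denoise_chunk(noise, cond, kv_cache))
    return np.concatenate(chunks, axis=0)

video = generate(num_chunks=3, reference=rng.standard_normal((2, DIM)))
print(video.shape)  # (12, 8): every chunk kept the same fixed size
```

Because each chunk has the same shape regardless of conditioning, the per-chunk KV cache stays valid across the whole stream, which is what makes the autoregressive loop compatible with real-time use.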

The adaptation reuses the existing pretrained weights without retraining. Testing at the 1.3B and 14B model scales shows that structural control adds 20-30% latency overhead with minimal additional memory cost.

## Real-Time Demo

Resolume Arena as a live input into Scope via Spout:

<video src="https://huggingface.co/ryanontheinside/scope-vace/resolve/main/videos/resolume.mp4" controls autoplay loop muted></video>

## VACE Control Examples

These comparisons show the adapted VACE conditioning across different control modes (corresponding to figures in the paper):

| Control Mode | Video |
|---|---|
| Depth | <video src="https://huggingface.co/ryanontheinside/scope-vace/resolve/main/videos/depth_comparison.mp4" controls loop muted width="400"></video> |
| Scribble | <video src="https://huggingface.co/ryanontheinside/scope-vace/resolve/main/videos/scribble_comparison.mp4" controls loop muted width="400"></video> |
| Optical Flow | <video src="https://huggingface.co/ryanontheinside/scope-vace/resolve/main/videos/optical_flow_comparison.mp4" controls loop muted width="400"></video> |
| Image-to-Video | <video src="https://huggingface.co/ryanontheinside/scope-vace/resolve/main/videos/i2v_comparison.mp4" controls loop muted width="400"></video> |
| Inpainting | <video src="https://huggingface.co/ryanontheinside/scope-vace/resolve/main/videos/inpainting_comparison.mp4" controls loop muted width="400"></video> |
| Outpainting | <video src="https://huggingface.co/ryanontheinside/scope-vace/resolve/main/videos/outpainting_comparison.mp4" controls loop muted width="400"></video> |
| Layout | <video src="https://huggingface.co/ryanontheinside/scope-vace/resolve/main/videos/layout_comparison.mp4" controls loop muted width="400"></video> |

## Reference Implementation

The reference implementation is available in [Daydream Scope](https://github.com/daydreamlive/scope), a tool for running real-time, interactive generative AI video pipelines.

## Author

[ryanontheinside.com](https://ryanontheinside.com)

## Citation

```bibtex
@article{fosdick2026adapting,
  title={Adapting VACE for Real-Time Autoregressive Video Diffusion},
  author={Fosdick, Ryan},
  journal={arXiv preprint arXiv:2602.14381},
  year={2026}
}
```