KlingTeam/ShotStream

Text-to-Video

English

Improve model card and add pipeline tag

by nielsr HF Staff - opened Mar 31

base: refs/heads/main ← from: refs/pr/1

README.md +48 -5

@@ -1,9 +1,52 @@
 # ShotStream: Streaming Multi-Shot Video Generation for Interactive Storytelling
-TL;DR: We propose ShotStream, a novel causal multi-shot architecture that enables interactive storytelling and efficient on-the-fly frame generation, achieving 16 FPS on a single NVIDIA GPU.
-Please refer to the [Github](https://github.com/KlingAIResearch/ShotStream/blob/main/README.md) README for usage.
-* Paper：[https://arxiv.org/abs/2603.25746](https://arxiv.org/abs/2603.25746)
-* Project Page：[https://luo0207.github.io/ShotStream/](https://luo0207.github.io/ShotStream/)
-* Code：[https://github.com/KlingAIResearch/ShotStream](https://github.com/KlingAIResearch/ShotStream)

+---
+pipeline_tag: text-to-video
+---
 # ShotStream: Streaming Multi-Shot Video Generation for Interactive Storytelling
+**ShotStream** is a novel causal multi-shot architecture that enables interactive storytelling and efficient on-the-fly frame generation. It achieves sub-second latency and 16 FPS on a single NVIDIA GPU by reformulating the task as next-shot generation conditioned on historical context.
+[**Project Page**](https://luo0207.github.io/ShotStream/) | [**Paper**](https://arxiv.org/abs/2603.25746) | [**Code**](https://github.com/KlingAIResearch/ShotStream)
+## Introduction
+Multi-shot video generation is crucial for long narrative storytelling. ShotStream allows users to dynamically instruct ongoing narratives via streaming prompts. It preserves visual coherence through a dual-cache memory mechanism and mitigates error accumulation using a two-stage distillation strategy (Distribution Matching Distillation).
+## Usage
+For the full implementation and training details, please refer to the [official GitHub repository](https://github.com/KlingAIResearch/ShotStream).
+### 1. Environment Setup
+```bash
+git clone https://github.com/KlingAIResearch/ShotStream.git
+cd ShotStream
+# Setup environment using the provided script
+bash tools/setup/env.sh
+```
+### 2. Download Checkpoints
+```bash
+# Download the checkpoints of Wan-T2V-1.3B and ShotStream
+bash tools/setup/download_ckpt.sh
+```
+### 3. Run Inference
+To run autoregressive, 4-step generation of long multi-shot videos:
+```bash
+bash tools/inference/causal_fewsteps.sh
+```
+## Citation
+If you find our work helpful, please cite our paper:
+```bibtex
+@article{luo2026shotstream,
+  title={ShotStream: Streaming Multi-Shot Video Generation for Interactive Storytelling},
+  author={Luo, Yawen and Shi, Xiaoyu and Zhuang, Junhao and Chen, Yutian and Liu, Quande and Wang, Xintao and Wan, Pengfei and Xue, Tianfan},
+  journal={arXiv preprint arXiv:2603.25746},
+  year={2026}
+}
+```
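
Reviewer note: the three Usage steps in the proposed README could be chained so that a missing script or a failed step stops the run early. The following is a minimal sketch, assuming it is run against a local clone of the repository. `run_steps` is a hypothetical helper and is not part of the ShotStream repo; only the three script paths come from the README above.

```bash
# Hypothetical helper (not part of the ShotStream repo): runs the three
# README steps in order and stops at the first missing or failing script.
# The function body is a subshell, so the `cd` does not affect the caller.
run_steps() (
  root="${1:-.}"          # path to the cloned ShotStream repository
  cd "$root" || exit 1
  # Script paths as listed in the README's Usage section.
  for step in \
    tools/setup/env.sh \
    tools/setup/download_ckpt.sh \
    tools/inference/causal_fewsteps.sh
  do
    if [ ! -f "$step" ]; then
      echo "missing: $step" >&2
      exit 1
    fi
    echo "running: $step"
    bash "$step" || exit 1
  done
)
```

After sourcing the snippet, `run_steps ./ShotStream` would run setup, checkpoint download, and inference in sequence; a non-zero return status indicates which step to investigate from the last `running:` or `missing:` line.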