Eddie0521 committed on
Commit 939aeb1 · verified · 1 Parent(s): 37235f7

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +102 -0
README.md ADDED
@@ -0,0 +1,102 @@
<p align="center" style="border-radius: 10px">
<img src="assets/icon+name.png" width="50%" alt="logo"/>
</p>

# <div align="center">Advancing Narrative Long Video Generation via Training-Free Identity-Aware Memory</div>

<div align="center">
<p>
<a href="https://eddie0521.github.io/">Jinzhuo Liu</a><sup>1</sup>,
<a href="https://zhangzjn.github.io">Jiangning Zhang</a><sup>1<a href="mailto:186368@zju.edu.cn">✉</a></sup>,
<a href="https://github.com/Rinke02">Wencan Jiang</a><sup>1</sup>,
<a href="https://scholar.google.com/citations?user=xiK4nFUAAAAJ&hl=zh-CN">Yabiao Wang</a><sup>2</sup>,
<a href="https://dk-liang.github.io/">Dingkang Liang</a><sup>3</sup>,
<a href="https://scholar.google.com/citations?user=m3KDreEAAAAJ&hl=en">Zhucun Xue</a><sup>1</sup>,
<a href="https://yiranran.github.io/">Ran Yi</a><sup>4</sup>,
<a href="https://person.zju.edu.cn/yongliu">Yong Liu</a><sup>1</sup>
</p>
<p>
<sup>1</sup>Zhejiang University, &nbsp;&nbsp;
<sup>2</sup>Tencent Youtu Lab, &nbsp;&nbsp;
<sup>3</sup>Huazhong University of Science and Technology,<br>
<sup>4</sup>Shanghai Jiao Tong University &nbsp;&nbsp;
<sup><a href="mailto:186368@zju.edu.cn">✉</a></sup>Corresponding author
</p>
</div>
<p align="center">
<a href="https://eddie0521.github.io/projects/iamflow/"><img src="https://img.shields.io/badge/Project-Page-Green"></a>
&nbsp;
<a href="https://arxiv.org/abs/2605.18733"><img src="https://img.shields.io/static/v1?label=arXiv&message=2605.18733&color=red&logo=arxiv"></a>
&nbsp;
<a href="https://huggingface.co/Eddie0521/IAMFlow-FP8"><img src="https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Model-orange"></a>
</p>

## 🔥 Updates

- __[2026.05.15]__: We release the [GitHub repo](https://github.com/Eddie0521/IAMFlow), the [project page](https://eddie0521.github.io/projects/iamflow/), the quantized [model checkpoints](https://huggingface.co/Eddie0521/IAMFlow-FP8), the [NarraStream-Bench](https://github.com/Eddie0521/NarraStream-Bench) benchmark, and the [paper](https://arxiv.org/abs/2605.18733).

## 📷 Introduction
💡**TL;DR:**
[IAMFlow](https://arxiv.org/abs/2605.18733) uses explicit identity-aware memory to keep identities consistent across evolving narrative prompts, achieving faster and stronger long video generation on [NarraStream-Bench](https://arxiv.org/abs/2605.18733).

## ✨ Highlights
1. We introduce [**IAMFlow**](https://arxiv.org/abs/2605.18733), a training-free identity-aware memory framework that explicitly organizes historical information around persistent entities and attributes, enabling reliable identity preservation across evolving prompt transitions.
2. We design a systematic inference acceleration pipeline to make the framework computationally practical, combining asynchronous visual verification, adaptive prompt transition, and model quantization to preserve long-term consistency without sacrificing generation speed.
3. We introduce [**NarraStream-Bench**](https://arxiv.org/abs/2605.18733), a modern benchmark suite for assessing long-term consistency in narrative streaming video generation. Extensive experiments and ablation studies demonstrate that IAMFlow achieves superior performance across various metrics while enabling more efficient inference.

## 🛠️ Installation
### 1. Install Requirements

```sh
git clone git@github.com:Eddie0521/IAMFlow.git
cd IAMFlow
conda create -n iamflow python=3.12 -y
conda activate iamflow

# Install PyTorch first according to your CUDA environment.
python -m pip install torch==2.9.1 torchvision==0.24.1
python -m pip install -r requirements.txt
python -m pip install flash-attn --no-build-isolation
```
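
After the install step, it can help to confirm that the key packages are visible in the `iamflow` environment. The snippet below is a minimal sketch and not part of the IAMFlow repository: the helper name `check_packages` is hypothetical, and `torch`, `torchvision`, and `flash_attn` are the assumed import names of the packages installed above. It only checks that each module can be located; it does not import them or validate CUDA.

```python
import importlib.util


def check_packages(names):
    """Return a dict mapping each top-level module name to True if it can be located."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    # Assumed import names for the packages installed in the step above.
    status = check_packages(["torch", "torchvision", "flash_attn"])
    for name, found in status.items():
        print(f"{name}: {'found' if found else 'MISSING'}")
```

Run it with the `iamflow` environment activated; any `MISSING` entry suggests that the corresponding install command did not complete.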

### 2. Download Checkpoints
Download the models with the `hf` CLI:
```sh
pip install "huggingface_hub[cli]"
hf download Wan-AI/Wan2.1-T2V-1.3B --local-dir pretrained/Wan2.1-T2V-1.3B
hf download Eddie0521/IAMFlow --local-dir pretrained/iamflow_models
hf download Qwen/Qwen3-VL-2B-Instruct --local-dir pretrained/Qwen3-VL-2B-Instruct
hf download Qwen/Qwen3-4B-Instruct-2507 --local-dir pretrained/Qwen3-4B-Instruct-2507
```
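
Because the downloads are large, a quick check that all four target directories exist and are non-empty can save a failed run later. The following is a sketch under stated assumptions: the directory names are taken from the `--local-dir` arguments above, while the helper `missing_checkpoints` and the constant `EXPECTED_DIRS` are hypothetical names, not part of the repository. It does not verify file contents or checksums.

```python
from pathlib import Path

# Directory names taken from the --local-dir arguments of the download commands.
EXPECTED_DIRS = [
    "Wan2.1-T2V-1.3B",
    "iamflow_models",
    "Qwen3-VL-2B-Instruct",
    "Qwen3-4B-Instruct-2507",
]


def missing_checkpoints(root="pretrained"):
    """Return the expected checkpoint directories that are absent or empty."""
    root = Path(root)
    missing = []
    for name in EXPECTED_DIRS:
        path = root / name
        # A directory counts as present only if it exists and holds at least one entry.
        if not path.is_dir() or not any(path.iterdir()):
            missing.append(name)
    return missing


if __name__ == "__main__":
    missing = missing_checkpoints()
    if missing:
        print("Missing or empty:", ", ".join(missing))
    else:
        print("All checkpoint directories are present.")
```

Run it from the repository root so that the relative `pretrained/` path resolves as in the download commands.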

## 🔑 Inference
Inference uses two GPUs: the DiT, text encoder, and LLM run on one GPU, while the VAE and VLM run on the other.

```sh
bash ./scripts/run_iamflow.sh
```
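
The two-GPU split can be pictured as a simple component-to-device map. This is an illustrative sketch only: the names `PLACEMENT`, `device_for`, and `components_on`, and the device strings `cuda:0` and `cuda:1`, are assumptions made for the example. The actual placement is determined by `scripts/run_iamflow.sh` and the repository code, which may assign devices differently.

```python
# Hypothetical component-to-device map mirroring the split described above.
# Which physical GPU hosts which group is an assumption.
PLACEMENT = {
    "dit": "cuda:0",
    "text_encoder": "cuda:0",
    "llm": "cuda:0",
    "vae": "cuda:1",
    "vlm": "cuda:1",
}


def device_for(component):
    """Look up the device string for a component name (case-insensitive)."""
    try:
        return PLACEMENT[component.lower()]
    except KeyError:
        raise ValueError(f"unknown component: {component!r}") from None


def components_on(device):
    """List the components assigned to a device, in sorted order."""
    return sorted(name for name, dev in PLACEMENT.items() if dev == device)


if __name__ == "__main__":
    for dev in ("cuda:0", "cuda:1"):
        print(dev, "->", ", ".join(components_on(dev)))
```

Grouping the generation path (DiT, text encoder, LLM) separately from the VAE and VLM matches the split stated above; the map itself is only a way to reason about memory budgets per GPU.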

## Evaluation & Benchmark
See the [NarraStream-Bench](https://github.com/Eddie0521/NarraStream-Bench) repository.

## 🤗 Acknowledgement
- [MemFlow](https://github.com/KlingAIResearch/MemFlow): the codebase we built upon. Thanks for their wonderful work.
- [Self-Forcing](https://github.com/guandeh17/Self-Forcing): the algorithm we built upon. Thanks for their wonderful work.
- [Wan](https://github.com/Wan-Video/Wan2.1): the base model we built upon. Thanks for their wonderful work.

## 🌟 Citation
Please leave us a star 🌟 and cite our paper if you find our work helpful.

```
@misc{liu2026advancingnarrativelongvideo,
  title={Advancing Narrative Long Video Generation via Training-Free Identity-Aware Memory},
  author={Jinzhuo Liu and Jiangning Zhang and Wencan Jiang and Yabiao Wang and Dingkang Liang and Zhucun Xue and Ran Yi and Yong Liu},
  year={2026},
  eprint={2605.18733},
  archivePrefix={arXiv},
  primaryClass={cs.CV},
  url={https://arxiv.org/abs/2605.18733},
}
```