---
license: apache-2.0
language:
- en
tags:
- video
- video-generation
- video-to-video
- diffusers
- wan2.2
---
# Wan2.2 Video Continuation (Demo)
#### *This project is still in development.*
This repo contains code for video-continuation inference with [Wan2.2](https://github.com/Wan-Video/Wan2.2).
The main idea comes from [LongCat-Video](https://huggingface.co/meituan-longcat/LongCat-Video).

Demo example (only the first 32 frames are original; the rest are generated):
<video controls autoplay src="https://cdn-uploads.huggingface.co/production/uploads/63fde49f6315a264aba6a7ed/fPm3hJ9SlZ-29ncWZHygW.mp4"></video>
## Description
This is a simple LoRA for the Wan2.2 TI2V transformer.
First test: rank = 64, alpha = 128.
It was trained on around 10k videos, with 16-64 input frames and 41-81 output frames per sample.
For this approach, the main change is in the attention processor.
See <a href="https://github.com/TheDenk/wan2.2-video-continuation">Github code</a>.
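A framework-free sketch of the conditioning idea (an assumption based on LongCat-Video's description, not this repo's exact code): clean latents for the input frames are kept fixed across denoising, only the latents of the frames to be generated are updated, and the update sees the concatenation of both so generated frames can attend to the context. The `denoise_step` helper below is hypothetical and uses plain numbers in place of latent tensors.

```python
# Hypothetical sketch: context frames stay fixed, only generated frames
# are updated, but the update function sees the full concatenated sequence.
def denoise_step(latents, num_context, update_fn):
    """Apply one denoising update, leaving the first num_context frames untouched."""
    context, generated = latents[:num_context], latents[num_context:]
    # update_fn sees all frames (context + generated); keep only the
    # updated values for the generated part.
    updated = update_fn(context + generated)[num_context:]
    return context + updated

# Toy example: 3 clean context "frames" and 2 noisy ones.
frames = [0.0, 0.0, 0.0, 5.0, 5.0]
result = denoise_step(frames, 3, lambda seq: [x + 1.0 for x in seq])
print(result)  # context unchanged, generated frames updated
```

In the real pipeline the "frames" are latent tensors and `update_fn` is the LoRA-adapted transformer, but the masking pattern is the same.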
### Models
| Model | Best input frames count | Best output frames count | Resolution | Huggingface Link |
|-------|:-----------:|:------------------:|:------------------:|:------------------:|
| TI2V-5B | 24-32-40 | 49-61-81 | 704x1280| [Link](https://huggingface.co/TheDenk/wan2.2-video-continuation) |
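The recommended output frame counts above (49, 61, 81) all have the form 4k + 1. Assuming the usual 4x temporal compression of Wan-style VAEs (an assumption, not something this repo states), such counts map cleanly onto latent frames; the helper below is a hypothetical sanity check, not part of the repo's API.

```python
# Hypothetical check: with 4x temporal compression, a video of
# 4k + 1 frames encodes to k + 1 latent frames.
def latent_frames(num_frames: int) -> int:
    """Return the latent frame count for a 4k + 1 frame video."""
    if (num_frames - 1) % 4 != 0:
        raise ValueError(f"{num_frames} is not of the form 4k + 1")
    return (num_frames - 1) // 4 + 1

for n in (49, 61, 81):
    print(n, "->", latent_frames(n), "latent frames")
```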
### How to
Clone the repo
```bash
git clone https://github.com/TheDenk/wan2.2-video-continuation
cd wan2.2-video-continuation
```
Create a venv
```bash
python -m venv venv
source venv/bin/activate
```
Install the requirements
```bash
pip install git+https://github.com/huggingface/diffusers.git
pip install -r requirements.txt
```
### Inference examples
#### Gradio inference
```bash
python -m inference.gradio_web_demo \
--base_model_path Wan-AI/Wan2.2-TI2V-5B-Diffusers \
--lora_path TheDenk/wan2.2-video-continuation
```
#### Simple inference with CLI
```bash
python -m inference.cli_demo \
--video_path "resources/ship.mp4" \
--num_input_frames 24 \
--num_output_frames 81 \
--prompt "Watercolor style, the wet suminagashi inks slowly spread into the shape of an island on the paper, with the edges continuously blending into delicate textural variations. A tiny paper boat floats in the direction of the water flow towards the still-wet areas, creating subtle ripples around it. Centered composition with soft natural light pouring in from the side, revealing subtle color gradations and a sense of movement." \
--base_model_path Wan-AI/Wan2.2-TI2V-5B-Diffusers \
--lora_path TheDenk/wan2.2-video-continuation
```
#### Detailed Inference
```bash
python -m inference.cli_demo \
--video_path "resources/ship.mp4" \
--num_input_frames 24 \
--num_output_frames 81 \
--prompt "Watercolor style, the wet suminagashi inks slowly spread into the shape of an island on the paper, with the edges continuously blending into delicate textural variations. A tiny paper boat floats in the direction of the water flow towards the still-wet areas, creating subtle ripples around it. Centered composition with soft natural light pouring in from the side, revealing subtle color gradations and a sense of movement." \
--base_model_path Wan-AI/Wan2.2-TI2V-5B-Diffusers \
--lora_path TheDenk/wan2.2-video-continuation \
--num_inference_steps 50 \
--guidance_scale 5.0 \
--video_height 480 \
--video_width 832 \
--negative_prompt "bad quality, low quality" \
--seed 42 \
--out_fps 24 \
--output_path "result.mp4" \
--teacache_treshold 0.5
```
#### Minimal code example
```python
import os
os.environ["CUDA_VISIBLE_DEVICES"] = "0"
os.environ["TOKENIZERS_PARALLELISM"] = "false"

import torch
from diffusers.utils import load_video, export_to_video
from diffusers import AutoencoderKLWan, UniPCMultistepScheduler
from wan_continuous_transformer import WanTransformer3DModel
from wan_continuous_pipeline import WanContinuousVideoPipeline

base_model_path = "Wan-AI/Wan2.2-TI2V-5B-Diffusers"
lora_path = "TheDenk/wan2.2-video-continuation"

# Load the VAE in fp32 and the transformer in bf16.
vae = AutoencoderKLWan.from_pretrained(base_model_path, subfolder="vae", torch_dtype=torch.float32)
transformer = WanTransformer3DModel.from_pretrained(base_model_path, subfolder="transformer", torch_dtype=torch.bfloat16)

pipe = WanContinuousVideoPipeline.from_pretrained(
    pretrained_model_name_or_path=base_model_path,
    transformer=transformer,
    vae=vae,
    torch_dtype=torch.bfloat16,
)
pipe.enable_model_cpu_offload()

# Attach the video-continuation LoRA.
pipe.transformer.load_lora_adapter(
    lora_path,
    weight_name="pytorch_lora_weights.safetensors",
    adapter_name="video_continuation",
    prefix=None,
)
pipe.set_adapters("video_continuation", adapter_weights=1.0)

img_h = 480             # 704 512 480
img_w = 832             # 1280 832 768
num_input_frames = 24   # 16 24 32
num_output_frames = 81  # 81 49

# Condition on the last `num_input_frames` frames of the source video.
video_path = "ship.mp4"
previous_video = load_video(video_path)[-num_input_frames:]

prompt = "Watercolor style, the wet suminagashi inks slowly spread into the shape of an island on the paper, with the edges continuously blending into delicate textural variations. A tiny paper boat floats in the direction of the water flow towards the still-wet areas, creating subtle ripples around it. Centered composition with soft natural light pouring in from the side, revealing subtle color gradations and a sense of movement."
negative_prompt = "bad quality, low quality"

output = pipe(
    previous_video=previous_video,
    prompt=prompt,
    negative_prompt=negative_prompt,
    height=img_h,
    width=img_w,
    num_frames=num_output_frames,
    guidance_scale=5,
    generator=torch.Generator(device="cuda").manual_seed(42),
    output_type="pil",
    teacache_treshold=0.4,
).frames[0]

export_to_video(output, "output.mp4", fps=16)
```
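The pipeline can presumably be called repeatedly to grow a clip beyond a single generation: take the last `num_input_frames` frames of the current result as the new context and generate again. This loop is a sketch of that usage, not an official API; the `generate` callable stands in for the `pipe(...)` call above, and it is assumed to return only newly generated frames.

```python
# Hypothetical autoregressive extension loop. `generate(context)` stands in
# for a call like pipe(previous_video=context, ...).frames[0], assumed to
# return only the newly generated frames.
def extend_video(frames, generate, num_input_frames=24, rounds=2):
    """Extend `frames` by repeatedly conditioning on the last context window."""
    for _ in range(rounds):
        context = frames[-num_input_frames:]
        frames = frames + generate(context)  # append the new frames
    return frames

# Toy usage with a stub generator that "produces" 10 frames per round.
clip = extend_video(["frame"] * 24, lambda ctx: ["new"] * 10)
print(len(clip))  # 24 + 2 * 10 = 44
```

If the pipeline's output repeats the context frames, slice them off (`new_frames[len(context):]`) before appending.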
## Acknowledgements
Original code and models: [Wan2.2](https://github.com/Wan-Video/Wan2.2).
Video continuation approach: [LongCat-Video](https://huggingface.co/meituan-longcat/LongCat-Video).
Inference speed-up: [TeaCache](https://github.com/ali-vilab/TeaCache).
## Citations
```
@misc{TheDenk,
title={Wan2.2 Video Continuation},
author={Karachev Denis},
url={https://github.com/TheDenk/wan2.2-video-continuation},
publisher={Github},
year={2025}
}
```
## Contacts
<p>Issues should be raised directly in the repository. For professional support and recommendations, please contact <a href="mailto:welcomedenk@gmail.com">welcomedenk@gmail.com</a>.</p>