## Sci-Fi: Symmetric Constraint for Frame Inbetweening
<h5>Liuhan Chen<sup>1</sup>, <a href='https://vinthony.github.io'>Xiaodong Cun</a><sup>2,*</sup>, <a href='https://xiaoyu258.github.io/'>Xiaoyu Li</a><sup>3</sup>, Xianyi He<sup>1,4</sup>, Shenghai Yuan<sup>1,4</sup>, Jie Chen<sup>1</sup>, Ying Shan<sup>3</sup>, Li Yuan<sup>1,*</sup></h5>
<sup>1</sup>Shenzhen Graduate School, Peking University <sup>2</sup><a href='https://gvclab.github.io'>GVC Lab, Great Bay University</a>
<sup>3</sup>ARC Lab, Tencent PCG <sup>4</sup>Rabbitpre Intelligence
<table class="center">
<tr style="font-weight: bolder;text-align:center;">
<td>Start frame</td>
<td>End frame</td>
<td>Generated video</td>
</tr>
<tr>
<td>
<img src=example_input_pairs/input_pair1/start.jpg width="250">
</td>
<td>
<img src=example_input_pairs/input_pair1/end.jpg width="250">
</td>
<td>
<img src=example_output_gifs/input_pair1.gif width="250">
</td>
</tr>
<tr>
<td>
<img src=example_input_pairs/input_pair2/start.jpg width="250">
</td>
<td>
<img src=example_input_pairs/input_pair2/end.jpg width="250">
</td>
<td>
<img src=example_output_gifs/input_pair2.gif width="250">
</td>
</tr>
<tr>
<td>
<img src=example_input_pairs/input_pair3/start.jpg width="250">
</td>
<td>
<img src=example_input_pairs/input_pair3/end.jpg width="250">
</td>
<td>
<img src=example_output_gifs/input_pair3.gif width="250">
</td>
</tr>
<tr>
<td>
<img src=example_input_pairs/input_pair4/start.jpg width="250">
</td>
<td>
<img src=example_input_pairs/input_pair4/end.jpg width="250">
</td>
<td>
<img src=example_output_gifs/input_pair4.gif width="250">
</td>
</tr>
</table>
## Deployment for Frame Inbetweening
### 1. Setup repository and environment
```shell
git clone https://github.com/GVCLab/Sci-Fi.git
cd Sci-Fi
conda create -n Sci-Fi python=3.12
conda activate Sci-Fi
pip install -r requirements.txt
```
### 2. Download checkpoint
Download the fine-tuned CogVideoX-5B-I2V model (its transformer denoiser weights differ from the original release because of fine-tuning) and EF-Net.
The weights are available on [🤗HuggingFace](https://huggingface.co/LiuhanChen/Sci-Fi) and [🤖ModelScope](https://www.modelscope.cn/models/clhxclh/Sci-Fi).
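If you prefer to fetch the weights programmatically rather than by hand, a minimal sketch with the `huggingface_hub` client looks like the following. The repo id comes from the link above; the `checkpoints/Sci-Fi` target directory is an assumption, so point it wherever the inference script expects the weights.

```python
# Hedged sketch: download the Sci-Fi weight repository with huggingface_hub.
# Assumption: the "checkpoints/Sci-Fi" local directory is illustrative only.
from huggingface_hub import snapshot_download


def fetch_sci_fi_weights(local_dir: str = "checkpoints/Sci-Fi") -> str:
    """Download the full Sci-Fi weight repo and return its local path."""
    return snapshot_download(repo_id="LiuhanChen/Sci-Fi", local_dir=local_dir)


if __name__ == "__main__":
    # Triggers the actual (multi-GB) download when run as a script.
    print(fetch_sci_fi_weights())
```

`snapshot_download` resumes interrupted downloads and caches files, so rerunning it is cheap.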
### 3. Launch the inference script!
The example input keyframe pairs are in the `examples/` folder, and
the corresponding generated videos (720x480, 49 frames) are placed in the `outputs/` folder.
To interpolate, run:
```shell
bash Sci_Fi_frame_inbetweening.sh
```