| ## Sci-Fi: Symmetric Constraint for Frame Inbetweening | |
| <h5>Liuhan Chen<sup>1</sup>, <a href='https://vinthony.github.io'>Xiaodong Cun</a><sup>2,*</sup>, <a href='https://xiaoyu258.github.io/'>Xiaoyu Li</a><sup>3</sup>, Xianyi He<sup>1,4</sup>, Shenghai Yuan<sup>1,4</sup>, Jie Chen<sup>1</sup>, Ying Shan<sup>3</sup>, Li Yuan<sup>1,*</sup></h5> | |
| <sup>1</sup>Shenzhen Graduate School, Peking University <sup>2</sup><a href='https://gvclab.github.io'>GVC Lab, Great Bay University</a> | |
| <sup>3</sup>ARC Lab, Tencent PCG <sup>4</sup>Rabbitpre Intelligence | |
| We have updated our paper with a new version and changed the name of our framework from Sci-Fi to EF-VI. | |
| **[arXiv](https://arxiv.org/abs/2505.21205) | [PDF](https://arxiv.org/pdf/2505.21205)** | |
| ## Video demos | |
| [Watch the video demos on YouTube](https://youtu.be/_YfFH-uNYQk) | |
| [or click here to download the compressed version](overview/video_demos.mp4) | |
| ## Method comparison | |
| <div align="center"> | |
| <img src="overview/comparison.png" width="720" alt="Comparison"> | |
| </div> | |
| <strong>(a)</strong> In current I2V-DM-based methods, the end-frame constraint is weaker than the start-frame constraint because it uses the same injection mechanism but is trained at a smaller scale, which distorts the predicted path and collapses content.<br><br> <strong>(b)</strong> Sci-Fi keeps the start-frame processing unchanged and strengthens end-frame constraint injection. This achieves symmetric start-end-frame constraints with a small amount of training, yielding a predicted path close to the real one and smoother inbetweening. | |
| ## Challenging examples of Sci-Fi for frame inbetweening | |
| <table class="center"> | |
| <tr style="font-weight: bolder;text-align:center;"> | |
| <td>Start frame</td> | |
| <td>End frame</td> | |
| <td>Generated video</td> | |
| </tr> | |
| <tr> | |
| <td> | |
| <img src=example_input_pairs/input_pair1/start.jpg width="250"> | |
| </td> | |
| <td> | |
| <img src=example_input_pairs/input_pair1/end.jpg width="250"> | |
| </td> | |
| <td> | |
| <img src=example_output_gifs/input_pair1.gif width="250"> | |
| </td> | |
| </tr> | |
| <tr> | |
| <td> | |
| <img src=example_input_pairs/input_pair2/start.jpg width="250"> | |
| </td> | |
| <td> | |
| <img src=example_input_pairs/input_pair2/end.jpg width="250"> | |
| </td> | |
| <td> | |
| <img src=example_output_gifs/input_pair2.gif width="250"> | |
| </td> | |
| </tr> | |
| <tr> | |
| <td> | |
| <img src=example_input_pairs/input_pair3/start.jpg width="250"> | |
| </td> | |
| <td> | |
| <img src=example_input_pairs/input_pair3/end.jpg width="250"> | |
| </td> | |
| <td> | |
| <img src=example_output_gifs/input_pair3.gif width="250"> | |
| </td> | |
| </tr> | |
| <tr> | |
| <td> | |
| <img src=example_input_pairs/input_pair4/start.jpg width="250"> | |
| </td> | |
| <td> | |
| <img src=example_input_pairs/input_pair4/end.jpg width="250"> | |
| </td> | |
| <td> | |
| <img src=example_output_gifs/input_pair4.gif width="250"> | |
| </td> | |
| </tr> | |
| </table> | |
| ## Deployment for personal use | |
| ### 1. Set up the repository and environment | |
| ``` | |
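| # Clone the repository and enter it | |
| git clone https://github.com/GVCLab/Sci-Fi.git | |
| cd Sci-Fi | |
| # Create and activate a dedicated conda environment | |
| conda create -n Sci-Fi python=3.12 | |
| conda activate Sci-Fi | |
| # Install the Python dependencies | |
| pip install -r requirements.txt | |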
| ``` | |
| ### 2. Download checkpoints | |
| Download the fine-tuned CogVideoX-5B-I2V model (because of fine-tuning, its transformer denoiser weights differ from the original release) and EF-Net. | |
| The weights are available at [🤗HuggingFace](https://huggingface.co/LiuhanChen/Sci-Fi) and [🤖ModelScope](https://www.modelscope.cn/models/clhxclh/Sci-Fi). | |
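As a minimal sketch of the download step, the snippet below wraps the Hugging Face CLI in a small shell function. The function name `download_sci_fi_weights`, the `DRY_RUN` switch, and the default target directory `checkpoints/Sci-Fi` are assumptions made for illustration. Check the inference script for the directory it actually expects. The sketch also assumes `huggingface-cli` (shipped with the `huggingface_hub` package) is installed.

```shell
# Hypothetical helper, not part of the repository.
# Usage: download_sci_fi_weights [target_dir]
download_sci_fi_weights() {
  # Assumed default location; adjust to the path the inference script reads.
  target_dir="${1:-checkpoints/Sci-Fi}"
  if [ "${DRY_RUN:-0}" = "1" ]; then
    # Print the command instead of running it.
    echo "huggingface-cli download LiuhanChen/Sci-Fi --local-dir $target_dir"
  else
    huggingface-cli download LiuhanChen/Sci-Fi --local-dir "$target_dir"
  fi
}
```

Running `DRY_RUN=1 download_sci_fi_weights` prints the command without downloading anything, which is a cheap way to confirm the target path first.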
| ### 3. Launch the inference script! | |
| The example input keyframe pairs are in the `examples/` folder, and | |
| the corresponding generated videos (720x480, 49 frames) are placed in the `outputs/` folder. | |
| <br> | |
| To interpolate, run: | |
| ``` | |
| bash Sci_Fi_frame_inbetweening.sh | |
| ``` | |
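Before launching the script on your own keyframes, it can help to confirm that every pair has both frames. The sketch below assumes the layout seen in the examples table (`<root>/<pair>/start.jpg` and `<root>/<pair>/end.jpg`). The function name `check_keyframe_pairs` is hypothetical, and whether the inference script reads this exact layout is not confirmed here.

```shell
# Hypothetical helper, not part of the repository.
# Usage: check_keyframe_pairs <root_dir>
# Returns 0 if every subfolder has start.jpg and end.jpg, 1 otherwise.
check_keyframe_pairs() {
  root="$1"
  missing=0
  for dir in "$root"/*/; do
    # Skip the unexpanded pattern when the root has no subfolders.
    [ -d "$dir" ] || continue
    for name in start.jpg end.jpg; do
      if [ ! -f "${dir}${name}" ]; then
        echo "missing: ${dir}${name}"
        missing=1
      fi
    done
  done
  return "$missing"
}
```

For example, `check_keyframe_pairs example_input_pairs && bash Sci_Fi_frame_inbetweening.sh` would start inference only when no frame is reported missing.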
| ## Citation | |
| 🌟 If you find our work helpful, please leave us a star and cite our paper. | |
| ``` | |
| @article{chen2025sci, | |
| title={Sci-Fi: Symmetric Constraint for Frame Inbetweening}, | |
| author={Chen, Liuhan and Cun, Xiaodong and Li, Xiaoyu and He, Xianyi and Yuan, Shenghai and Chen, Jie and Shan, Ying and Yuan, Li}, | |
| journal={arXiv preprint arXiv:2505.21205}, | |
| year={2025} | |
| } | |
| ``` | |