---
language:
- en
tags:
- pytorch_model_hub_mixin
- animation
- video-frame-interpolation
- uncertainty-estimation
license: mit
pipeline_tag: image-to-image
---

# 🤖 Multi‑Input ResShift Diffusion VFI

<div align="left" style="display: flex; flex-direction: row; gap: 15px">
<a href='https://arxiv.org/pdf/2504.05402'><img src='https://img.shields.io/badge/arXiv-2504.05402-b31b1b.svg'></a>
<a href='https://github.com/VicFonch/Multi-Input-Resshift-Diffusion-VFI'><img src='https://img.shields.io/badge/Repo-Code-blue'></a>
<a href='https://colab.research.google.com/drive/1MGYycbNMW6Mxu5MUqw_RW_xxiVeHK5Aa#scrollTo=EKaYCioiP3tQ'><img src='https://img.shields.io/badge/Colab-Demo-Green'></a>
<a href='https://huggingface.co/spaces/vfontech/Multi-Input-Res-Diffusion-VFI'><img src='https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face%20Space-Demo-g'></a>
</div>

## ⚙️ Setup

Start by cloning the source code from GitHub and moving into the repository directory.

```bash
git clone https://github.com/VicFonch/Multi-Input-Resshift-Diffusion-VFI.git
cd Multi-Input-Resshift-Diffusion-VFI
```

Create a conda environment and install the requirements:

```bash
conda create -n multi-input-resshift python=3.12
conda activate multi-input-resshift
pip install -r requirements.txt
```

**Note**: Make sure your system is compatible with **CUDA 12.4**. Otherwise, install the [CuPy](https://docs.cupy.dev/en/stable/install.html) build that matches your installed CUDA version.

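To choose a matching CuPy build, it helps to know which CUDA version your PyTorch install was built for. The snippet below is a minimal sketch, assuming PyTorch is already installed. `suggest_cupy_wheel` is a hypothetical helper and not part of this repository; the wheel names follow CuPy's `cupy-cuda11x` / `cupy-cuda12x` naming, and other CUDA major versions are not covered.

```python
from typing import Optional


def suggest_cupy_wheel(cuda_version: Optional[str]) -> Optional[str]:
    """Map a CUDA version string such as "12.4" to a CuPy wheel name.

    Hypothetical helper for illustration; it only covers CUDA 11.x and 12.x.
    """
    if not cuda_version:
        return None  # CPU-only PyTorch build, so there is no CUDA version to match
    major = cuda_version.split(".")[0]
    return {"11": "cupy-cuda11x", "12": "cupy-cuda12x"}.get(major)


if __name__ == "__main__":
    try:
        import torch  # assumed to be installed via requirements.txt
    except ImportError:
        print("PyTorch not found; install the requirements first.")
    else:
        cuda_version = torch.version.cuda  # e.g. "12.4", or None on CPU-only builds
        print("CUDA available:", torch.cuda.is_available())
        print("PyTorch built for CUDA:", cuda_version)
        print("Suggested CuPy wheel:", suggest_cupy_wheel(cuda_version))
```

Note that `torch.version.cuda` reports the CUDA version PyTorch was compiled against, which may differ from the toolkit installed system-wide, so treat the suggestion as a starting point and confirm it against the CuPy installation guide.
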
## 🚀 Inference Example

```python
import os

import matplotlib.pyplot as plt
from PIL import Image
from torchvision.transforms import Compose, Normalize, Resize, ToTensor

from model.hub import MultiInputResShiftHub
from utils.utils import denorm

# Load the pretrained weights from the Hugging Face Hub and freeze them for inference.
model = MultiInputResShiftHub.from_pretrained("vfontech/Multiple-Input-Resshift-VFI")
model.requires_grad_(False).cuda().eval()

# Build the paths with os.path.join so they work on Windows, Linux, and macOS.
img0_path = os.path.join("_data", "example_images", "frame1.png")
img2_path = os.path.join("_data", "example_images", "frame3.png")

# Resize to 256x448 and map pixel values from [0, 1] to [-1, 1].
mean = std = [0.5] * 3
transforms = Compose([
    Resize((256, 448)),
    ToTensor(),
    Normalize(mean=mean, std=std),
])

# Load both input frames as 1x3xHxW tensors on the GPU.
img0 = transforms(Image.open(img0_path).convert("RGB")).unsqueeze(0).cuda()
img2 = transforms(Image.open(img2_path).convert("RGB")).unsqueeze(0).cuda()
tau = 0.5  # interpolation time; 0.5 targets the temporal midpoint

# Generate the intermediate frame from the two input frames.
img1 = model.reverse_process([img0, img2], tau)

# Show the first input, the interpolated frame, and the second input side by side.
plt.figure(figsize=(10, 5))
for i, img in enumerate([img0, img1, img2], start=1):
    plt.subplot(1, 3, i)
    plt.imshow(denorm(img, mean=mean, std=std).squeeze().permute(1, 2, 0).cpu().numpy())
plt.show()
```
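
The example above synthesizes a single frame at `tau = 0.5`. To produce several in-between frames, one option is to sweep `tau` over evenly spaced values inside (0, 1). The following is a minimal sketch under the assumption that `reverse_process` accepts values of `tau` other than 0.5, which the example does not confirm. `tau_schedule` is a hypothetical helper and not part of this repository.

```python
from typing import List


def tau_schedule(num_frames: int) -> List[float]:
    """Return `num_frames` evenly spaced time steps strictly inside (0, 1)."""
    if num_frames < 1:
        raise ValueError("num_frames must be at least 1")
    return [i / (num_frames + 1) for i in range(1, num_frames + 1)]


print(tau_schedule(1))  # [0.5]
print(tau_schedule(3))  # [0.25, 0.5, 0.75]

# With `model`, `img0`, and `img2` prepared as in the inference example
# (assumption: `reverse_process` handles tau values other than 0.5):
# frames = [model.reverse_process([img0, img2], tau) for tau in tau_schedule(3)]
```

Keeping the schedule strictly inside the interval avoids asking the model to regenerate the input frames themselves.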