uva-cv-lab/FrameINO_CogVideoX_Stage1_Motion_v1.0

Diffusers

Safetensors


HikariDawn committed on Nov 2

Commit 7772e38 (verified) · 1 Parent(s): 394cc0a

Update README.md

Files changed (1): README.md +63 -1

@@ -1,3 +1,65 @@
 ---
 license: gpl-3.0
----
+---
+<div align="center">
+# Frame In-N-Out: Unbounded Controllable Image-to-Video Generation
+</div>
+<div align="center">
+  <a href=https://uva-computer-vision-lab.github.io/Frame-In-N-Out/ target="_blank"><img src=https://img.shields.io/badge/Project%20Page-333399.svg?logo=homepage height=22px></a>
+  <a href=https://huggingface.co/collections/uva-cv-lab/frame-in-n-out target="_blank"><img src=https://img.shields.io/badge/%F0%9F%A4%97%20Models-d96902.svg height=22px></a>
+  <a href=https://huggingface.co/datasets/uva-cv-lab/FrameINO_data  target="_blank"><img src=https://img.shields.io/badge/%F0%9F%A4%97%20Dataset-276cb4.svg height=22px></a>
+  <a href=https://github.com/UVA-Computer-Vision-Lab/FrameINO target="_blank"><img src= https://img.shields.io/badge/Code-black.svg?logo=github height=22px></a>
+  <a href=https://arxiv.org/abs/2505.21491 target="_blank"><img src=https://img.shields.io/badge/Arxiv-b5212f.svg?logo=arxiv height=22px></a>
+</div>
+<br>
+## Intro Video
+<p align="center">
+  <video
+    controls
+    autoplay
+    playsinline
+    muted
+    loop
+    src="https://github.com/user-attachments/assets/0fabd2a4-9d3b-4148-bc04-6fc03c53caca"
+    width="60%"
+  >
+  </video>
+  <br>
+  <em> Frame In-N-Out is a controllable image-to-video Diffusion Transformer in which objects can enter or exit the scene along user-specified motion trajectories, with a user-specified identity (ID) reference. Our method introduces a new dataset curation procedure based on pattern recognition, an evaluation protocol, and a <b>motion-controllable</b>, <b>identity-preserving</b>, <b>unbounded canvas</b> Video Diffusion Transformer to achieve Frame In and Frame Out in the cinematic domain. </em>
+</p>
+## Model Zoo 🤗
+| Model                                                          | Description                    | Hugging Face                                                                                    |
+|--------------------------------------------------------------- | -------------------------------| ------------------------------------------------------------------------------------------------|
+| CogVideoX-I2V-5B V1.0 (Stage 1 - Motion Control)               |  Paper Weight v1.0             |     [Download](https://huggingface.co/uva-cv-lab/FrameINO_CogVideoX_Stage1_Motion_v1.0)         |
+| CogVideoX-I2V-5B  (Stage 2 - Motion + In-N-Out Control)        |  Paper Weight v1.0             |     [Download](https://huggingface.co/uva-cv-lab/FrameINO_CogVideoX_Stage2_MotionINO_v1.0)      |
+| Wan2.2-TI2V-5B  (Stage 1 - Motion Control)                     |  New Weight v1.5 on 704P       |     [Download](https://huggingface.co/uva-cv-lab/FrameINO_Wan2.2_5B_Stage1_Motion_v1.5)         |
+| Wan2.2-TI2V-5B  (Stage 2 - Motion + In-N-Out Control)          |  New Weight v1.5 on 704P       |     [Download](https://huggingface.co/uva-cv-lab/FrameINO_Wan2.2_5B_Stage2_MotionINO_v1.5)      |
+## 📚 Citation
+```bibtex
+@article{wang2025frame,
+  title={Frame In-N-Out: Unbounded Controllable Image-to-Video Generation},
+  author={Wang, Boyang and Chen, Xuweiyi and Gadelha, Matheus and Cheng, Zezhou},
+  journal={arXiv preprint arXiv:2505.21491},
+  year={2025}
+}
+```
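
The Model Zoo table in the diff above links four weight repositories. As a minimal sketch of how those checkpoints might be fetched, the snippet below maps short labels to the repo IDs taken from the table links and wraps `snapshot_download` from `huggingface_hub`. The labels (`cogvideox-stage1` and so on) and the helper names are hypothetical and not part of the README. How to load the downloaded files into an inference pipeline is not stated in this commit, so the sketch stops at downloading.

```python
from typing import Dict, Optional

# Repo IDs copied from the Model Zoo links; the dictionary keys are
# hypothetical short labels introduced only for this example.
MODEL_ZOO: Dict[str, str] = {
    "cogvideox-stage1": "uva-cv-lab/FrameINO_CogVideoX_Stage1_Motion_v1.0",
    "cogvideox-stage2": "uva-cv-lab/FrameINO_CogVideoX_Stage2_MotionINO_v1.0",
    "wan2.2-stage1": "uva-cv-lab/FrameINO_Wan2.2_5B_Stage1_Motion_v1.5",
    "wan2.2-stage2": "uva-cv-lab/FrameINO_Wan2.2_5B_Stage2_MotionINO_v1.5",
}


def repo_url(label: str) -> str:
    """Return the Hugging Face page URL for one Model Zoo entry."""
    return f"https://huggingface.co/{MODEL_ZOO[label]}"


def download_weights(label: str, local_dir: Optional[str] = None) -> str:
    """Download every file of the chosen repo and return the local path.

    Requires `pip install huggingface_hub` and network access. The import
    is kept inside the function so the rest of the module works without it.
    """
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=MODEL_ZOO[label], local_dir=local_dir)


if __name__ == "__main__":
    # List the available checkpoints without downloading anything.
    for label in MODEL_ZOO:
        print(f"{label}: {repo_url(label)}")
```

Calling `download_weights("cogvideox-stage1")` would place the Stage 1 motion-control weights in the local Hugging Face cache. Pass `local_dir` to choose an explicit folder instead.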