Diffusers · Safetensors

Commit 7772e38 (verified) · Parent: 394cc0a
HikariDawn committed: Update README.md
Files changed (1): README.md (+63 −1)
---
license: gpl-3.0
---


<div align="center">

# Frame In-N-Out: Unbounded Controllable Image-to-Video Generation

</div>


<div align="center">
<a href="https://uva-computer-vision-lab.github.io/Frame-In-N-Out/" target="_blank"><img src="https://img.shields.io/badge/Project%20Page-333399.svg?logo=homepage" height="22px"></a>
<a href="https://huggingface.co/collections/uva-cv-lab/frame-in-n-out" target="_blank"><img src="https://img.shields.io/badge/%F0%9F%A4%97%20Models-d96902.svg" height="22px"></a>
<a href="https://huggingface.co/datasets/uva-cv-lab/FrameINO_data" target="_blank"><img src="https://img.shields.io/badge/%F0%9F%A4%97%20Dataset-276cb4.svg" height="22px"></a>
<a href="https://github.com/UVA-Computer-Vision-Lab/FrameINO" target="_blank"><img src="https://img.shields.io/badge/Code-black.svg?logo=github" height="22px"></a>
<a href="https://arxiv.org/abs/2505.21491" target="_blank"><img src="https://img.shields.io/badge/Arxiv-b5212f.svg?logo=arxiv" height="22px"></a>
</div>

<br>

## Intro Video
<p align="center">
  <video
    controls
    autoplay
    playsinline
    muted
    loop
    src="https://github.com/user-attachments/assets/0fabd2a4-9d3b-4148-bc04-6fc03c53caca"
    width="60%"
  >
  </video>
  <br>
  <em> Frame In-N-Out is a controllable image-to-video Diffusion Transformer in which objects can enter or exit the scene along user-specified motion trajectories, guided by an identity reference. Our method introduces a new pattern-recognition procedure for dataset curation, an evaluation protocol, and a <b>motion-controllable</b>, <b>identity-preserving</b>, <b>unbounded-canvas</b> Video Diffusion Transformer to achieve Frame In and Frame Out in the cinematic domain. </em>
</p>


## Model Zoo 🤗
| Model | Description | Hugging Face |
| ------------------------------------------------------ | ----------------------- | ------------------------------------------------------------------------------------- |
| CogVideoX-I2V-5B (Stage 1 - Motion Control)            | Paper weight v1.0       | [Download](https://huggingface.co/uva-cv-lab/FrameINO_CogVideoX_Stage1_Motion_v1.0)    |
| CogVideoX-I2V-5B (Stage 2 - Motion + In-N-Out Control) | Paper weight v1.0       | [Download](https://huggingface.co/uva-cv-lab/FrameINO_CogVideoX_Stage2_MotionINO_v1.0) |
| Wan2.2-TI2V-5B (Stage 1 - Motion Control)              | New weight v1.5 at 704p | [Download](https://huggingface.co/uva-cv-lab/FrameINO_Wan2.2_5B_Stage1_Motion_v1.5)    |
| Wan2.2-TI2V-5B (Stage 2 - Motion + In-N-Out Control)   | New weight v1.5 at 704p | [Download](https://huggingface.co/uva-cv-lab/FrameINO_Wan2.2_5B_Stage2_MotionINO_v1.5) |

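As a convenience, the checkpoints above can be selected programmatically. The sketch below is our own illustration, not part of the official codebase: only the repo ids come from the table, while the helper function and its name are hypothetical.

```python
# Hypothetical helper mapping (backbone, stage) to the Hub repo ids
# listed in the Model Zoo table above.
FRAME_INO_CHECKPOINTS = {
    ("cogvideox-i2v-5b", 1): "uva-cv-lab/FrameINO_CogVideoX_Stage1_Motion_v1.0",
    ("cogvideox-i2v-5b", 2): "uva-cv-lab/FrameINO_CogVideoX_Stage2_MotionINO_v1.0",
    ("wan2.2-ti2v-5b", 1): "uva-cv-lab/FrameINO_Wan2.2_5B_Stage1_Motion_v1.5",
    ("wan2.2-ti2v-5b", 2): "uva-cv-lab/FrameINO_Wan2.2_5B_Stage2_MotionINO_v1.5",
}

def frame_ino_repo_id(backbone: str, stage: int) -> str:
    """Return the Hugging Face repo id for a given backbone and training stage."""
    key = (backbone.lower(), stage)
    if key not in FRAME_INO_CHECKPOINTS:
        raise KeyError(f"no Frame In-N-Out checkpoint for {key!r}")
    return FRAME_INO_CHECKPOINTS[key]

# A local copy of a checkpoint could then be fetched with huggingface_hub
# (not run here; the download is several GB):
# from huggingface_hub import snapshot_download
# snapshot_download(repo_id=frame_ino_repo_id("wan2.2-ti2v-5b", 2))
```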


## 📚 Citation
```bibtex
@article{wang2025frame,
  title={Frame In-N-Out: Unbounded Controllable Image-to-Video Generation},
  author={Wang, Boyang and Chen, Xuweiyi and Gadelha, Matheus and Cheng, Zezhou},
  journal={arXiv preprint arXiv:2505.21491},
  year={2025}
}
```