nielsr (HF Staff) committed
Commit c28d9ca · verified · 1 parent: 178903b

Add model card for Kiwi-Edit


Hi! I'm Niels, part of the community science team at Hugging Face. I noticed this repository was missing a model card, so I've opened this PR to add one.

The model card includes:
- Metadata for the `diffusers` library and the `image-to-video` pipeline tag.
- Links to the original paper, project page, and GitHub repository.
- A brief description of the model's capabilities (instruction and reference-guided video editing).
- CLI usage instructions for running the model with Diffusers, based on the official repository.

This information helps users discover and use your work more effectively on the Hugging Face Hub.

Files changed (1)
  1. README.md +48 -0
README.md ADDED
@@ -0,0 +1,48 @@
+ ---
+ library_name: diffusers
+ pipeline_tag: image-to-video
+ ---
+
+ # Kiwi-Edit: Versatile Video Editing via Instruction and Reference Guidance
+
+ Kiwi-Edit is a versatile video editing framework built on an MLLM encoder and a video Diffusion Transformer (DiT). It supports both instruction-based video editing and reference-guided editing (using a reference image and instruction).
+
+ - **Paper:** [Kiwi-Edit: Versatile Video Editing via Instruction and Reference Guidance](https://huggingface.co/papers/2603.02175)
+ - **Project Page:** [https://showlab.github.io/Kiwi-Edit/](https://showlab.github.io/Kiwi-Edit/)
+ - **Repository:** [https://github.com/showlab/Kiwi-Edit](https://github.com/showlab/Kiwi-Edit)
+
+ ## Model Description
+
+ Kiwi-Edit introduces a unified editing architecture that synergizes learnable queries and latent visual features for reference semantic guidance. It addresses the challenge of precise visual control in instruction-based editing by allowing users to provide a reference image to guide the transformation. The framework achieves significant performance improvements in instruction following and reference fidelity through a scalable data generation pipeline and a multi-stage training curriculum.
+
+ ## Usage
+
+ This model is compatible with the `diffusers` library. To run inference, follow the installation instructions in the [official repository](https://github.com/showlab/Kiwi-Edit).
+
+ ### Quick Test with Diffusers
+
+ You can run a quick test on a demo video using the following command provided in the repository:
+
+ ```bash
+ python diffusers_demo.py \
+   --video_path ./demo_data/video/source/0005e4ad9f49814db1d3f2296b911abf.mp4 \
+   --prompt "Remove the monkey." \
+   --save_path output.mp4 \
+   --model_path linyq/kiwi-edit-5b-instruct-only-diffusers
+ ```
+
+ ## Citation
+
+ If you find this work useful, please cite:
+
+ ```bibtex
+ @misc{kiwiedit,
+   title={Kiwi-Edit: Versatile Video Editing via Instruction and Reference Guidance},
+   author={Yiqi Lin and Guoqiang Liang and Ziyun Zeng and Zechen Bai and Yanzhe Chen and Mike Zheng Shou},
+   year={2026},
+   eprint={2603.02175},
+   archivePrefix={arXiv},
+   primaryClass={cs.CV},
+   url={https://arxiv.org/abs/2603.02175},
+ }
+ ```
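For readers who prefer Python over the CLI demo in the card above, loading the checkpoint through the generic `diffusers` entry point can be sketched as follows. This is a minimal sketch, not taken from the official repository: `DiffusionPipeline.from_pretrained` resolves the concrete pipeline class from the checkpoint's `model_index.json`, and whether this checkpoint exposes such a class (and with what call signature) is an assumption.

```python
MODEL_ID = "linyq/kiwi-edit-5b-instruct-only-diffusers"


def load_pipeline(model_id: str = MODEL_ID):
    """Load the Kiwi-Edit checkpoint via the generic diffusers entry point.

    diffusers is imported lazily so this module stays importable without it
    installed; actually downloading the weights requires network access.
    """
    from diffusers import DiffusionPipeline

    # from_pretrained inspects the checkpoint's model_index.json and
    # instantiates the concrete pipeline class registered there.
    return DiffusionPipeline.from_pretrained(model_id)
```

For the reference-guided editing mode described in the paper, the CLI demo (and the official repository) remains the authoritative entry point.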