---
license: mit
language:
- en
base_model:
- Qwen/Qwen-Image-Edit-2509
base_model_relation: adapter
---

<p align="center">
    <img src="./MotionEdit.png" width="500"/>
</p>
  
# MotionEdit: Benchmarking and Learning Motion-Centric Image Editing

[![MotionEdit](https://img.shields.io/badge/Arxiv-MotionEdit-b31b1b.svg?logo=arXiv)](https://motion-edit.github.io/)
[![GitHub Stars](https://img.shields.io/github/stars/elainew728/motion-edit?style=flat&logo=github&logoColor=whitesmoke)](https://github.com/elainew728/motion-edit/tree/main)
[![hf_dataset](https://img.shields.io/badge/🤗-HF_Dataset-red.svg)](https://huggingface.co/datasets/elaine1wan/MotionEdit-Bench)
[![Twitter](https://img.shields.io/badge/-Twitter@yixin_wan_-black?logo=twitter&logoColor=1D9BF0)](https://x.com/yixin_wan_?s=21&t=EqTxUZPAldbQnbhLN-CETA)
[![proj_page](https://img.shields.io/badge/Project_Page-ffcae2?style=flat-square)](https://motion-edit.github.io/) <br>


# ✨ Overview
**MotionEdit** is a novel dataset and benchmark for motion-centric image editing. We also propose **MotionNFT** (Motion-guided Negative-aware FineTuning), a post-training framework with motion-alignment rewards that guides models on the motion-centric image editing task.

### Model Description
- **Model type:** Image Editing
- **Language(s):** English
- **Finetuned from model:** Qwen/Qwen-Image-Edit-2509

### Model Sources
- **Repository:** https://github.com/elainew728/motion-edit/tree/main
- **Paper:** https://arxiv.org/abs/2512.10284
- **Demo Page:** https://motion-edit.github.io/

# 🔧 Usage
## 🧱 To Start: Environment Setup
Clone our GitHub repository and switch into its directory.

```shell
git clone https://github.com/elainew728/motion-edit.git
cd motion-edit
```

Create and activate the conda environment, which includes the dependencies for both inference and training.

> **Note:** Some models, such as UltraEdit, require specific versions of the diffusers library. Please refer to their official repositories to resolve dependencies before running inference.

```shell
conda env create -f environment.yml
conda activate motionedit
```
Finally, configure your own Hugging Face token for access to gated models by replacing `YOUR_HF_TOKEN_HERE` in [inference/run_image_editing.py](https://github.com/elainew728/motion-edit/tree/main/inference/run_image_editing.py).
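Alternatively, if you prefer not to edit the script, you can usually supply the token through the environment instead. This is a sketch under the assumption that the inference script authenticates via `huggingface_hub`, which reads the `HF_TOKEN` environment variable:

```python
import os

# Assumption: the inference script loads models through huggingface_hub,
# which picks up HF_TOKEN automatically. The value below is the same
# placeholder used in the script, not a real token.
os.environ["HF_TOKEN"] = "YOUR_HF_TOKEN_HERE"
```

Setting the variable in your shell (`export HF_TOKEN=...`) before launching the script has the same effect.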


## 🔍 Running Inference on *MotionEdit-Bench* with Image Editing Models
We have released our [MotionEdit-Bench](https://huggingface.co/datasets/elaine1wan/MotionEdit-Bench) on Hugging Face.
Our GitHub repository provides code that supports easy inference across open-source image editing models: ***Qwen-Image-Edit***, ***Flux.1 Kontext [Dev]***, ***InstructPix2Pix***, ***HQ-Edit***, ***Step1X-Edit***, ***UltraEdit***, ***MagicBrush***, and ***AnyEdit***.


### Step 1: Data Preparation
The inference script defaults to our [MotionEdit-Bench](https://huggingface.co/datasets/elaine1wan/MotionEdit-Bench), which it downloads from Hugging Face. You can specify a `cache_dir` for storing the cached data.

Additionally, you can construct your own dataset for inference. Organize all input images into a folder `INPUT_FOLDER` and create a `metadata.jsonl` file in the same directory. Each entry in `metadata.jsonl` **must** contain at least these two fields:
```
{
    "file_name": IMAGE_NAME.EXT,
    "prompt": PROMPT
}
```
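The schema above can be generated with a few lines of Python. A minimal sketch (the folder, file names, and prompts below are hypothetical examples, not part of the benchmark):

```python
import json
import os
import tempfile

# Stand-in for your INPUT_FOLDER; replace with your actual image directory.
input_folder = tempfile.mkdtemp()

# Each record pairs an image file name with its edit instruction.
records = [
    {"file_name": "walk_01.png", "prompt": "Make the person raise both arms."},
    {"file_name": "jump_02.png", "prompt": "Make the dog leap over the fence."},
]

metadata_path = os.path.join(input_folder, "metadata.jsonl")
with open(metadata_path, "w") as f:
    for record in records:
        f.write(json.dumps(record) + "\n")  # one JSON object per line
```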

Then, load your dataset with:
```python
from datasets import load_dataset
dataset = load_dataset("imagefolder", data_dir=INPUT_FOLDER)
```

### Step 2: Running Inference
Use the following command to run inference on **MotionEdit-Bench** with our ***MotionNFT*** Hugging Face checkpoint, trained on **MotionEdit** with Qwen-Image-Edit as the base model:
```shell
python inference/run_image_editing.py \
    -o "./outputs/" \
    -m "motionedit" \
    --seed 42
```
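To compare outputs across several seeds, a small driver loop can be used. This is a sketch assuming the script accepts the `-o`, `-m`, and `--seed` flags exactly as shown above; the launch call is commented out so the sketch can be inspected without a GPU environment:

```python
import subprocess

# Hypothetical seed sweep over the same model and flags shown above.
for seed in (0, 21, 42):
    cmd = [
        "python", "inference/run_image_editing.py",
        "-o", f"./outputs/seed_{seed}/",   # separate output dir per seed
        "-m", "motionedit",
        "--seed", str(seed),
    ]
    print(" ".join(cmd))
    # subprocess.run(cmd, check=True)  # uncomment to actually launch each run
```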

<!-- ## Authors
[Yixin Wan](https://elainew728.github.io/)<sup>1,2</sup>, [Lei Ke](https://www.kelei.site/)<sup>1</sup>, [Wenhao Yu](https://wyu97.github.io/)<sup>1</sup>, [Kai-Wei Chang](https://web.cs.ucla.edu/~kwchang/)<sup>2</sup>, [Dong Yu](https://sites.google.com/view/dongyu888/)<sup>1</sup>

<sup>1</sup>Tencent AI, Seattle   &nbsp;  <sup>2</sup>University of California, Los Angeles
 -->

# ✏️ Citing

```bibtex
@misc{wan2025motioneditbenchmarkinglearningmotioncentric,
      title={MotionEdit: Benchmarking and Learning Motion-Centric Image Editing}, 
      author={Yixin Wan and Lei Ke and Wenhao Yu and Kai-Wei Chang and Dong Yu},
      year={2025},
      eprint={2512.10284},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2512.10284}, 
}
```