
	---
	base_model:
	- stabilityai/stable-diffusion-2
	license: mit
	library_name: diffusers
	pipeline_tag: image-to-image
	---

	# StableMotion: Repurposing Diffusion-Based Image Priors for Motion Estimation
	This is the official model repository for the paper [StableMotion: Repurposing Diffusion-Based Image Priors for Motion Estimation](https://www.arxiv.org/abs/2505.06668).

	Official Code Repository: [GitHub - ivowang/StableMotion](https://github.com/ivowang/StableMotion)

	## Setup
	0. Clone the [code repo](https://github.com/ivowang/StableMotion).
	1. Create your environment from `requirements.txt`.
	2. Download [DIR-D](https://drive.google.com/file/d/1KR5DtekPJin3bmQPlTGP4wbM1zFR80ak/view?usp=sharing) and [RS-Real](https://huggingface.co/datasets/Yzl-code/RS-Diffusion/tree/main), and place them in `StableMotion_SIR` and `StableMotion_RSC`, respectively.
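
Before moving on, it can help to confirm that the working directory matches the layout the steps above describe. The following is a minimal sketch, not part of the official code: `missing_paths` and `EXPECTED` are hypothetical names, and the check covers only the three paths this README names explicitly. It knows nothing about the internal layout of the datasets.

```python
from pathlib import Path

# Paths named in the setup steps, relative to the root of the cloned code repo.
EXPECTED = ("requirements.txt", "StableMotion_SIR", "StableMotion_RSC")


def missing_paths(repo_root, expected=EXPECTED):
    """Return the entries of `expected` that do not exist under `repo_root`."""
    root = Path(repo_root)
    return [name for name in expected if not (root / name).exists()]


if __name__ == "__main__":
    missing = missing_paths(".")
    if missing:
        print("Missing from the current directory:", ", ".join(missing))
    else:
        print("Repository layout looks as expected.")
```

Run it from the root of the cloned repo; an empty result only means the folders exist, not that the datasets inside them are complete.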

	## StableMotion for Stitched Image Rectangling (SIR)
	### Inference
	0. Download the checkpoints of [StableMotion_SIR](https://huggingface.co/ivowang/StableMotion/tree/main/StableMotion_SIR)
	1. Run `cd StableMotion_SIR && sh sample.sh`. Edit `sample.sh` to change the inference configuration.
	2. Run `sh metrics.sh` to evaluate the results.
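
The checkpoint download in step 0 can also be scripted. This is a hedged sketch using `snapshot_download` from the `huggingface_hub` package; the repo id `ivowang/StableMotion` is taken from the checkpoint link above, while the helper names and the choice of `local_dir` are assumptions. Check `sample.sh` for where the scripts actually expect the checkpoints.

```python
REPO_ID = "ivowang/StableMotion"  # taken from the checkpoint links in this README
TASKS = ("SIR", "RSC")


def checkpoint_patterns(task):
    """Return the file patterns that select one task's checkpoint folder."""
    if task not in TASKS:
        raise ValueError(f"unknown task {task!r}; expected one of {TASKS}")
    return [f"StableMotion_{task}/*"]


def download_checkpoints(task, local_dir="."):
    """Download only the chosen task's checkpoints into `local_dir`.

    Assumption: placing the files under `local_dir/StableMotion_<task>/` is
    convenient, but it may not match the paths the official scripts use.
    """
    # Imported lazily so the helper above works without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id=REPO_ID,
        allow_patterns=checkpoint_patterns(task),
        local_dir=local_dir,
    )
```

Calling `download_checkpoints("SIR")` fetches the SIR files; passing `"RSC"` does the same for the rolling shutter checkpoints.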

	### Training
	0. Reproduce [RecDiffusion](https://github.com/lhaippp/RecDiffusion) to obtain pseudo labels, i.e., the flow labels that RecDiffusion generates on the training set of [DIR-D](https://drive.google.com/file/d/1KR5DtekPJin3bmQPlTGP4wbM1zFR80ak/view?usp=sharing). Move the output to `StableMotion_SIR/MDM_Flow`, renaming it accordingly.
	1. Run `cd StableMotion_SIR && sh train.sh`. Edit `train.sh` to change the training configuration. The default configuration requires approximately 80 GB of VRAM per card.

	## StableMotion for Rolling Shutter Correction (RSC)
	### Inference
	0. Download the checkpoints of [StableMotion_RSC](https://huggingface.co/ivowang/StableMotion/tree/main/StableMotion_RSC)
	1. Run `cd StableMotion_RSC && sh sample.sh`. Edit `sample.sh` to change the inference configuration.
	2. Run `sh metrics.sh` to evaluate the results.

	### Training
	Run `cd StableMotion_RSC && sh train.sh`. Edit `train.sh` to change the training configuration. The default configuration requires approximately 40 GB of VRAM per card.

	## GPT Rule-Based Evaluation
	Each task folder has a `gpt_eval` subfolder containing the script used in the paper to score results with a vision LLM (GPT) against a fixed rubric. `StableMotion_SIR/gpt_eval/score_rectangle.py` scores Stitched Image Rectangling (SIR) `(input, output)` pairs, and `StableMotion_RSC/gpt_eval/score_rolling_shutter.py` scores Rolling Shutter Correction (RSC) `[input | Yang | Ours]` triptychs.

	Both scripts call an API compatible with the OpenAI Responses API and emit per-pair scores plus an aggregate `summary.json` (mean, standard deviation, and 95% CI).

	To run a script, copy `provider.example.json` to `provider.json` in the relevant folder and add your endpoint and key. Then run, for example, `cd StableMotion_SIR/gpt_eval && python score_rectangle.py <input_dir> <result_dir>`. See each folder's `README.md` for the full rubric, flags, and outputs.
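
To make the aggregate statistics concrete, here is a minimal sketch of how a mean, standard deviation, and 95% confidence interval can be computed from a list of per-pair scores. It is an illustration under stated assumptions, not the code from `gpt_eval`: the function name `summarize`, the dictionary keys, the use of the sample standard deviation, and the normal-approximation interval (z = 1.96) are all assumptions, and the official scripts may compute `summary.json` differently.

```python
import math
import statistics


def summarize(scores, z=1.96):
    """Aggregate scores into mean, sample std, and a normal-approximation 95% CI.

    Assumption: the interval is mean +/- z * std / sqrt(n); the official
    scripts may use a different estimator.
    """
    n = len(scores)
    if n < 2:
        raise ValueError("need at least two scores to estimate a spread")
    mean = statistics.fmean(scores)
    std = statistics.stdev(scores)  # sample standard deviation (n - 1 denominator)
    half_width = z * std / math.sqrt(n)
    return {
        "n": n,
        "mean": mean,
        "std": std,
        "ci95": [mean - half_width, mean + half_width],
    }


if __name__ == "__main__":
    # Made-up scores, for illustration only.
    print(summarize([1, 2, 3, 4, 5]))
```

With only a handful of scores, a t-distribution critical value would give a wider and more appropriate interval than the fixed z = 1.96 used here.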

	## Citation

	```bibtex
	@article{wang2025stablemotion,
	title={StableMotion: One-Step Motion Estimation with Diffusion Prior},
	author={Wang, Ziyi and Li, Haipeng and Sui, Lin and Zhou, Tianhao and Jiang, Hai and Nie, Lang and Liu, Shuaicheng},
	journal={arXiv preprint arXiv:2505.06668},
	year={2025}
	}
	```