metadata
base_model:
  - stabilityai/stable-diffusion-2
license: mit
library_name: diffusers
pipeline_tag: image-to-image

StableMotion: Repurposing Diffusion-Based Image Priors for Motion Estimation

This is the official repository for the paper "StableMotion: Repurposing Diffusion-Based Image Priors for Motion Estimation".

Official Code Repository: GitHub - ivowang/StableMotion

Setup

  1. Clone the code repo.
  2. Create your environment from requirements.txt.
  3. Download the DIR-D and RS-Real datasets, and place them in StableMotion_SIR and StableMotion_RSC, respectively.
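
Before running anything, it can help to confirm that the datasets ended up where the scripts expect them. The sketch below is a minimal check, not part of the official code. The sub-folder names DIR-D and RS-Real inside each task folder are assumptions, so adjust EXPECTED to match how the datasets actually unpack.

```python
from pathlib import Path

# Assumed layout: the dataset sub-folder names below are hypothetical and
# may differ from what the sample/train scripts actually read.
EXPECTED = (
    "StableMotion_SIR/DIR-D",
    "StableMotion_RSC/RS-Real",
)


def missing_paths(repo_root, expected=EXPECTED):
    """Return the expected relative paths that are not directories under repo_root."""
    root = Path(repo_root)
    return [rel for rel in expected if not (root / rel).is_dir()]


if __name__ == "__main__":
    missing = missing_paths(".")
    if missing:
        print("Missing dataset folders:", ", ".join(missing))
    else:
        print("All expected dataset folders found.")
```

Run it from the root of the cloned code repo; an empty result means every listed folder exists.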

StableMotion for Stitched Image Rectangling (SIR)

Inference

  1. Download the StableMotion_SIR checkpoints.
  2. Run cd StableMotion_SIR && sh sample.sh. Edit sample.sh to change the inference configuration.
  3. Run sh metrics.sh to evaluate the results.

Training

  1. Reproduce RecDiffusion to obtain pseudo labels, i.e., the flow labels that RecDiffusion generates on the DIR-D training set. Move them to StableMotion_SIR/MDM_Flow, renaming the folder as needed.
  2. Run cd StableMotion_SIR && sh train.sh. Edit train.sh to change the training configuration. The default configuration requires approximately 80 GB of VRAM per GPU.

StableMotion for Rolling Shutter Correction (RSC)

Inference

  1. Download the StableMotion_RSC checkpoints.
  2. Run cd StableMotion_RSC && sh sample.sh. Edit sample.sh to change the inference configuration.
  3. Run sh metrics.sh to evaluate the results.
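
The SIR and RSC inference steps follow the same pattern: enter the task folder, run sample.sh, then run metrics.sh. A small wrapper along these lines could drive both tasks from one place. It is only a sketch: the helper names and the "sir"/"rsc" keys are made up for illustration, and it assumes the checkpoints and datasets are already in place.

```python
import subprocess
from pathlib import Path

# Task folders named in this README; the short keys are hypothetical.
TASK_DIRS = {"sir": "StableMotion_SIR", "rsc": "StableMotion_RSC"}


def inference_steps(task, repo_root="."):
    """Return (command, working_directory) pairs for sampling, then evaluation."""
    task_dir = Path(repo_root) / TASK_DIRS[task]
    return [
        (["sh", "sample.sh"], task_dir),   # generate results
        (["sh", "metrics.sh"], task_dir),  # evaluate the results
    ]


def run_inference(task, repo_root="."):
    """Run both steps in order, stopping at the first non-zero exit code."""
    for cmd, cwd in inference_steps(task, repo_root):
        subprocess.run(cmd, cwd=cwd, check=True)
```

For example, run_inference("rsc") called from the repo root is equivalent to running the two shell commands above inside StableMotion_RSC.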

Training

Run cd StableMotion_RSC && sh train.sh. Edit train.sh to change the training configuration. The default configuration requires approximately 40 GB of VRAM per GPU.

GPT Rule-Based Evaluation

Each task folder has a gpt_eval subfolder with the script used in the paper to score results with a vision LLM (GPT) on a fixed rubric. StableMotion_SIR/gpt_eval/score_rectangle.py scores Stitched Image Rectangling (SIR) (input, output) pairs, and StableMotion_RSC/gpt_eval/score_rolling_shutter.py scores Rolling Shutter Correction (RSC) [input | Yang | Ours] triptychs.

Both scripts call an OpenAI-Responses-compatible API and emit per-pair scores plus an aggregate summary.json (mean/std/95% CI).

To run, copy provider.example.json to provider.json in the relevant folder, add your endpoint/key, then e.g. cd StableMotion_SIR/gpt_eval && python score_rectangle.py <input_dir> <result_dir>. See each folder's README.md for the full rubric, flags, and outputs.
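
For readers who want to sanity-check the aggregate numbers, the sketch below shows one common way to compute a mean, standard deviation, and 95% confidence interval from a list of per-pair scores. It is an illustration under stated assumptions, not the scoring scripts' actual code: it uses the sample standard deviation and a normal-approximation interval (1.96 standard errors), and the output keys are hypothetical rather than the real summary.json schema.

```python
import json
import math
import statistics


def summarize(scores):
    """Aggregate per-pair scores into mean, sample std, and a 95% CI.

    Assumption: normal-approximation interval, mean +/- 1.96 * std / sqrt(n).
    The real scripts may use a different estimator or key names.
    """
    n = len(scores)
    if n == 0:
        raise ValueError("need at least one score")
    mean = statistics.fmean(scores)
    # The sample standard deviation is undefined for n == 1; report 0.0 there.
    std = statistics.stdev(scores) if n > 1 else 0.0
    half_width = 1.96 * std / math.sqrt(n)
    return {
        "n": n,
        "mean": mean,
        "std": std,
        "ci95": [mean - half_width, mean + half_width],
    }


if __name__ == "__main__":
    # Made-up scores, for illustration only.
    print(json.dumps(summarize([7.0, 8.0, 6.5, 9.0]), indent=2))
```

With few pairs, a t-distribution interval would be wider than this approximation, so treat the output as a rough cross-check.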

Citation

@article{wang2025stablemotion,
  title={StableMotion: One-Step Motion Estimation with Diffusion Prior},
  author={Wang, Ziyi and Li, Haipeng and Sui, Lin and Zhou, Tianhao and Jiang, Hai and Nie, Lang and Liu, Shuaicheng},
  journal={arXiv preprint arXiv:2505.06668},
  year={2025}
}