arxiv:2506.12853

EraserDiT: Fast Video Inpainting with Diffusion Transformer Model

Published on Jun 15, 2025
Abstract

A novel video inpainting method using Diffusion Transformer architecture that improves long-term temporal consistency and handles large masked areas more effectively than traditional flow-based approaches.

AI-generated summary

Video object removal and inpainting are critical tasks in computer vision and multimedia processing, aimed at restoring missing or corrupted regions in video sequences. Traditional methods predominantly rely on flow-based propagation and spatio-temporal Transformers, but these approaches struggle to leverage long-term temporal features and to ensure temporal consistency in the completion results, particularly when dealing with large masks. Consequently, performance on extensively masked areas remains suboptimal. To address these challenges, this paper introduces a novel video inpainting approach leveraging the Diffusion Transformer (DiT). DiT synergistically combines the advantages of diffusion models and transformer architectures to maintain long-term temporal consistency while ensuring high-quality inpainting results. We propose a Circular Position-Shift strategy to further enhance long-term temporal consistency during the inference stage. Additionally, the proposed method interactively removes specified objects and generates corresponding prompts. In terms of processing speed, it takes only 65 seconds (tested on a single NVIDIA H800 GPU) to complete a 97-frame video at a resolution of 2160 × 2100 without any acceleration method. Experimental results indicate that the proposed method demonstrates superior performance in content fidelity, texture restoration, and temporal consistency. Project page: https://jieliu95.github.io/EraserDiT_demo/
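The abstract does not detail the Circular Position-Shift strategy, but the core idea it names — shifting temporal positions with wrap-around across inference steps so that frame windows overlap differently over time — can be sketched as follows. The function name and signature are illustrative assumptions, not the authors' implementation.

```python
def circular_position_shift(num_frames: int, shift: int) -> list[int]:
    """Rotate the temporal position indices of a clip by `shift`
    with wrap-around, so each denoising step can see a differently
    aligned temporal window (illustrative sketch, not the paper's code)."""
    return [(i + shift) % num_frames for i in range(num_frames)]


# Example: at successive inference steps, vary the shift so that
# frame boundaries do not stay fixed across the whole video.
for step in range(3):
    positions = circular_position_shift(num_frames=8, shift=step * 2)
    print(positions)
```

Varying the shift per step is one plausible way such a strategy could smooth seams between temporal chunks; the actual schedule used by EraserDiT would be described in the paper itself.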


Models citing this paper 1
