---
pipeline_tag: image-to-video
library_name: diffusers
---
# Learning to Refocus with Video Diffusion Models

This repository contains the model weights for the paper [Learning to Refocus with Video Diffusion Models](https://huggingface.co/papers/2512.19823).

[**Project Page**](https://learn2refocus.github.io/) | [**GitHub Repository**](https://github.com/tedlasai/learn2refocus)

## Summary

Focus is a cornerstone of photography, yet autofocus systems often fail to capture the intended subject, and users frequently wish to adjust focus after capture. This work introduces a novel method for realistic post-capture refocusing using video diffusion models. From a single defocused image, the approach generates a perceptually accurate focal stack, represented as a video sequence, enabling interactive refocusing and unlocking a range of downstream applications.

## Usage

For detailed environment setup, training, and testing instructions, please refer to the official [GitHub repository](https://github.com/tedlasai/learn2refocus). The model uses fine-tuned Stable Video Diffusion (SVD) weights.
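Since the weights are fine-tuned SVD weights packaged for diffusers, loading them should follow the standard `StableVideoDiffusionPipeline` pattern sketched below. This is a minimal sketch, not the authors' official inference code: the repository id, input filename, and generation parameters are illustrative assumptions, and the official GitHub repository may use its own loading and sampling scripts instead.

```python
# Minimal sketch of loading fine-tuned SVD weights with diffusers.
# The repo id below is an assumption -- substitute the actual model id.
import torch
from diffusers import StableVideoDiffusionPipeline
from diffusers.utils import load_image, export_to_video

pipe = StableVideoDiffusionPipeline.from_pretrained(
    "tedlasai/learn2refocus",  # hypothetical repo id
    torch_dtype=torch.float16,
)
pipe.to("cuda")

# A single defocused input image; the model generates the focal
# stack as a sequence of video frames.
image = load_image("defocused_input.png")  # hypothetical input path
frames = pipe(image, decode_chunk_size=8).frames[0]
export_to_video(frames, "focal_stack.mp4", fps=7)
```

Each frame of the exported video corresponds to one focal plane of the generated stack, so scrubbing through the video amounts to interactively refocusing the image.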
## Citation

If you use our dataset, code, or model in your research, please cite the following paper:

```bibtex
@inproceedings{Tedla2025Refocus,
  title={{Learning to Refocus with Video Diffusion Models}},
  author={Tedla, SaiKiran and Zhang, Zhoutong and Zhang, Xuaner and Xin, Shumian},
  booktitle={Proceedings of the ACM SIGGRAPH Asia Conference},
  year={2025}
}
```