---
pipeline_tag: image-to-video
library_name: diffusers
---
# Learning to Refocus with Video Diffusion Models
This repository contains the model weights for the paper [Learning to Refocus with Video Diffusion Models](https://huggingface.co/papers/2512.19823).
[**Project Page**](https://learn2refocus.github.io/) | [**GitHub Repository**](https://github.com/tedlasai/learn2refocus)
## Summary
Focus is a cornerstone of photography, yet autofocus systems often fail to capture the intended subject, and users frequently wish to adjust focus after capture. This work introduces a novel method for realistic post-capture refocusing using video diffusion models. From a single defocused image, the approach generates a perceptually accurate focal stack, represented as a video sequence, enabling interactive refocusing and unlocking a range of downstream applications.
## Usage
For detailed environment setup, training, and testing instructions, please refer to the official [GitHub repository](https://github.com/tedlasai/learn2refocus). The model uses fine-tuned Stable Video Diffusion (SVD) weights.
## Citation
If you use our dataset, code, or model in your research, please cite the following paper:
```bibtex
@inproceedings{Tedla2025Refocus,
  title={{Learning to Refocus with Video Diffusion Models}},
  author={Tedla, SaiKiran and Zhang, Zhoutong and Zhang, Xuaner and Xin, Shumian},
  booktitle={Proceedings of the ACM SIGGRAPH Asia Conference},
  year={2025}
}
```