nielsr HF Staff committed on
Commit 3189c45 · verified · 1 Parent(s): 3c01afe

Add model card metadata and links for Learning to Refocus with Video Diffusion Models

This PR improves the model card for "Learning to Refocus with Video Diffusion Models" by adding crucial metadata and enhancing its documentation.

Key changes include:
- Adding `pipeline_tag: image-to-video` to correctly categorize the model on the Hub, reflecting its capability to generate video sequences from images.
- Adding `library_name: diffusers`, as evidenced by `_diffusers_version` in the `config.json` files, to enable automated usage snippets.
- Including direct links to the project page and the GitHub repository for easy access to further information and code.
- Providing a concise description of the model's purpose, based on the paper's abstract.
- Adding a citation section with the BibTeX entry from the project's GitHub README.

These updates will make the model more discoverable and user-friendly on the Hugging Face Hub.
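
As a quick sanity check on the metadata change, the front matter of the updated card can be read back and compared with the two keys this PR adds. The sketch below is illustrative only: `parse_front_matter` is a hypothetical helper, not part of any Hugging Face library, and it handles only flat `key: value` pairs, whereas the Hub parses the block as full YAML.

```python
def parse_front_matter(card_text: str) -> dict[str, str]:
    """Return flat key: value pairs from a model card's front matter.

    Hypothetical helper for illustration; it ignores comments and
    does not support nested YAML.
    """
    lines = card_text.splitlines()
    # The front matter must open with a '---' delimiter on the first line.
    if not lines or lines[0].strip() != "---":
        return {}
    metadata: dict[str, str] = {}
    for line in lines[1:]:
        if line.strip() == "---":
            # Closing delimiter reached: the block is complete.
            return metadata
        key, sep, value = line.partition(":")
        # Skip comment lines and lines without a 'key: value' shape.
        if sep and key.strip() and not key.lstrip().startswith("#"):
            metadata[key.strip()] = value.strip()
    # No closing delimiter was found, so treat the card as having no metadata.
    return {}


# Head of the README as it stands after this commit.
card = """---
pipeline_tag: image-to-video
library_name: diffusers
---

# Learning to Refocus with Video Diffusion Models
"""

metadata = parse_front_matter(card)
print(metadata)
```

Run against the previous README, whose front matter held only two comment lines and `{}`, the same helper returns an empty dictionary, which is why the model had no pipeline tag or library on the Hub before this change.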

Files changed (1)
  1. README.md +26 -4
README.md CHANGED
@@ -1,6 +1,28 @@
  ---
- # For reference on model card metadata, see the spec: https://github.com/huggingface/hub-docs/blob/main/modelcard.md?plain=1
- # Doc / guide: https://huggingface.co/docs/hub/model-cards
- {}
+ pipeline_tag: image-to-video
+ library_name: diffusers
  ---
- https://huggingface.co/papers/2512.19823
+
+ # Learning to Refocus with Video Diffusion Models
+
+ This repository contains the model weights for the paper [Learning to Refocus with Video Diffusion Models](https://huggingface.co/papers/2512.19823).
+
+ [**Project Page**](https://learn2refocus.github.io/) | [**GitHub Repository**](https://github.com/tedlasai/learn2refocus)
+
+ ## Summary
+ Focus is a cornerstone of photography, yet autofocus systems often fail to capture the intended subject, and users frequently wish to adjust focus after capture. This work introduces a novel method for realistic post-capture refocusing using video diffusion models. From a single defocused image, the approach generates a perceptually accurate focal stack, represented as a video sequence, enabling interactive refocusing and unlocking a range of downstream applications.
+
+ ## Usage
+ For detailed environment setup, training, and testing instructions, please refer to the official [GitHub repository](https://github.com/tedlasai/learn2refocus). The model utilizes fine-tuned Stable Video Diffusion (SVD) weights.
+
+ ## Citation
+ If you use our dataset, code, or model in your research, please cite the following paper:
+
+ ```bibtex
+ @inproceedings{Tedla2025Refocus,
+ title={{Learning to Refocus with Video Diffusion Models}},
+ author={{Tedla, SaiKiran and Zhang, Zhoutong and Zhang, Xuaner and Xin, Shumian}},
+ booktitle={{Proceedings of the ACM SIGGRAPH Asia Conference}},
+ year={{2025}}
+ }
+ ```