Update model card with pipeline tag, links and citation

#1
by nielsr - opened
Files changed (1)
  1. README.md +45 -26
README.md CHANGED
@@ -1,27 +1,46 @@
- ---
- license: apache-2.0
- ---
-
- # Model Card for DKT
-
- This repository contains the weights of `Diffusion Knows Transparency: Repurposing Video Diffusion for Transparent Object Depth and Normal Estimation`
-
- ## Usage
-
- See the Github repository: [DKT](https://github.com/Daniellli/DKT) regarding installation instructions.
-
- The model can then be used as follows:
-
- ```python
- from dkt.pipelines.pipelines import DKTPipeline
- import os
- from tools.common_utils import save_video
-
- pipe = DKTPipeline()
- demo_path = 'examples/1.mp4'
- prediction = pipe(demo_path)
- save_dir = 'logs'
- os.makedirs(save_dir, exist_ok=True)
- output_path = os.path.join(save_dir, 'demo.mp4')
- save_video(prediction['colored_depth_map'], output_path, fps=25)
- ```
+ ---
+ license: apache-2.0
+ pipeline_tag: image-to-image
+ ---
+
+ # Diffusion Knows Transparency (DKT)
+
+ This repository contains the weights for **DKT** (Diffusion Knows Transparency), a foundation model for depth and normal estimation on transparent objects in in-the-wild, arbitrary-length videos.
+
+ [**Project Page**](https://daniellli.github.io/projects/DKT/) | [**GitHub**](https://github.com/Daniellli/DKT) | [**Paper**](https://huggingface.co/papers/2512.23705)
+
+ ## Introduction
+ DKT repurposes the generative video priors of large-scale diffusion models for robust, temporally coherent perception. By learning a video-to-video translator for depth and normals via lightweight LoRA adapters, it achieves zero-shot state-of-the-art results on transparency benchmarks such as ClearPose and DREDS.
+
+ ## Usage
+
+ To use this model, clone the [GitHub repository](https://github.com/Daniellli/DKT) and follow its installation instructions.
+
+ ```python
+ from dkt.pipelines.pipelines import DKTPipeline
+ import os
+ from tools.common_utils import save_video
+
+ # Initialize the pipeline
+ pipe = DKTPipeline()
+
+ # Path to your input video
+ demo_path = 'examples/1.mp4'
+ prediction = pipe(demo_path)
+
+ # Save the output
+ save_dir = 'logs'
+ os.makedirs(save_dir, exist_ok=True)
+ output_path = os.path.join(save_dir, 'demo.mp4')
+ save_video(prediction['colored_depth_map'], output_path, fps=25)
+ ```
+
+ ## Citation
+ ```bibtex
+ @article{dkt2025,
+ title = {Diffusion Knows Transparency: Repurposing Video Diffusion for Transparent Object Depth and Normal Estimation},
+ author = {Shaocong Xu and Songlin Wei and Qizhe Wei and Zheng Geng and Hong Li and Licheng Shen and Qianpu Sun and Shu Han and Bin Ma and Bohan Li and Chongjie Ye and Yuhang Zheng and Nan Wang and Saining Zhang and Hao Zhao},
+ journal = {arXiv preprint arXiv:2512.23705},
+ year = {2025}
+ }
+ ```
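
The added usage snippet processes a single video. For readers who want to batch-process a folder, the stdlib-only sketch below handles only the file side (discovering `.mp4` inputs and building output paths, mirroring the `logs/` convention above); `collect_jobs` is a hypothetical helper, and the actual `pipe(...)` / `save_video(...)` calls from the README would run once per pair.

```python
from pathlib import Path

def collect_jobs(input_dir: str, save_dir: str = "logs") -> list[tuple[str, str]]:
    """Pair each .mp4 under input_dir with an output path under save_dir.

    Stdlib-only sketch around the README's usage example; real inference
    would call pipe(video) and save_video(...) for each returned pair.
    """
    out = Path(save_dir)
    out.mkdir(parents=True, exist_ok=True)  # same as os.makedirs(..., exist_ok=True)
    jobs = []
    for video in sorted(Path(input_dir).glob("*.mp4")):
        # e.g. examples/1.mp4 -> logs/1_depth.mp4
        jobs.append((str(video), str(out / f"{video.stem}_depth.mp4")))
    return jobs
```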