| title: DepthCrafter | |
| app_file: app.py | |
| sdk: gradio | |
| sdk_version: 6.0.2 | |
| ## ___***DepthCrafter: Generating Consistent Long Depth Sequences for Open-world Videos***___ | |
| <div align="center"> | |
| <img src='https://depthcrafter.github.io/img/logo.png' style="height:140px"></img> | |
|  | |
| <a href='https://arxiv.org/abs/2409.02095'><img src='https://img.shields.io/badge/arXiv-2409.02095-b31b1b.svg'></a> | |
| <a href='https://depthcrafter.github.io'><img src='https://img.shields.io/badge/Project-Page-Green'></a> | |
| <a href='https://huggingface.co/spaces/tencent/DepthCrafter'><img src='https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Demo-blue'></a> | |
| _**[Wenbo Hu<sup>1* †</sup>](https://wbhu.github.io), | |
| [Xiangjun Gao<sup>2*</sup>](https://scholar.google.com/citations?user=qgdesEcAAAAJ&hl=en), | |
| [Xiaoyu Li<sup>1* †</sup>](https://xiaoyu258.github.io), | |
| [Sijie Zhao<sup>1</sup>](https://scholar.google.com/citations?user=tZ3dS3MAAAAJ&hl=en), | |
| [Xiaodong Cun<sup>1</sup>](https://vinthony.github.io/academic), <br> | |
| [Yong Zhang<sup>1</sup>](https://yzhang2016.github.io), | |
| [Long Quan<sup>2</sup>](https://home.cse.ust.hk/~quan), | |
| [Ying Shan<sup>3, 1</sup>](https://scholar.google.com/citations?user=4oXBp9UAAAAJ&hl=en)**_ | |
| <br><br> | |
| <sup>1</sup>Tencent AI Lab | |
| <sup>2</sup>The Hong Kong University of Science and Technology | |
| <sup>3</sup>ARC Lab, Tencent PCG | |
| CVPR 2025, **Highlight** | |
| </div> | |
| ## Notice | |
| **DepthCrafter is still under active development!** | |
| We recommend that everyone use English to communicate on issues, as this helps developers from around the world discuss, share experiences, and answer questions together. | |
| For business licensing and other related inquiries, don't hesitate to contact `wbhu@tencent.com`. | |
| ## Introduction | |
| If you find DepthCrafter useful, **please help star this repo**, which is important to open-source projects. Thanks! | |
| DepthCrafter can generate temporally consistent long depth sequences with fine-grained details for open-world videos, | |
| without requiring additional information such as camera poses or optical flow. | |
| - `[25-12-01]` Refactored the codebase for better usability and extensibility. | |
| - `[25-04-05]` Its upgraded work, [GeometryCrafter](https://github.com/TencentARC/GeometryCrafter), is now released, for **video to point cloud**! | |
| - `[25-04-05]` DepthCrafter was selected as a **Highlight** at CVPR'25. | |
| - `[24-12-10]` EXR output format is now supported, with the `--save_exr` option. | |
| - `[24-11-26]` DepthCrafter v1.0.1 is now released, with improved quality and speed. | |
| - `[24-10-19]` DepthCrafter has been integrated into [ComfyUI](https://github.com/akatz-ai/ComfyUI-DepthCrafter-Nodes)! | |
| - `[24-10-08]` DepthCrafter has been integrated into [Nuke](https://github.com/Theo-SAMINADIN-td/NukeDepthCrafter), have a try! | |
| - `[24-09-28]` Added full dataset inference and evaluation scripts for easier comparison. | |
| - `[24-09-25]` Added a Hugging Face online demo: [DepthCrafter](https://huggingface.co/spaces/tencent/DepthCrafter). | |
| - `[24-09-19]` Added scripts for preparing benchmark datasets. | |
| - `[24-09-18]` Added point cloud sequence visualization. | |
| - `[24-09-14]` **DepthCrafter** is now released, have fun! | |
| ## Release Notes | |
| - **DepthCrafter v1.0.1**: | |
| - Quality and speed improvement | |
| <table> | |
| <thead> | |
| <tr> | |
| <th>Method</th> | |
| <th>ms/frame↓ @1024×576 </th> | |
| <th colspan="2">Sintel (~50 frames)</th> | |
| <th colspan="2">ScanNet (90 frames)</th> | |
| <th colspan="2">KITTI (110 frames)</th> | |
| <th colspan="2">Bonn (110 frames)</th> | |
| </tr> | |
| <tr> | |
| <th></th> | |
| <th></th> | |
| <th>AbsRel↓</th> | |
| <th>δ₁ ↑</th> | |
| <th>AbsRel↓</th> | |
| <th>δ₁ ↑</th> | |
| <th>AbsRel↓</th> | |
| <th>δ₁ ↑</th> | |
| <th>AbsRel↓</th> | |
| <th>δ₁ ↑</th> | |
| </tr> | |
| </thead> | |
| <tbody> | |
| <tr> | |
| <td>Marigold</td> | |
| <td>1070.29</td> | |
| <td>0.532</td> | |
| <td>0.515</td> | |
| <td>0.166</td> | |
| <td>0.769</td> | |
| <td>0.149</td> | |
| <td>0.796</td> | |
| <td>0.091</td> | |
| <td>0.931</td> | |
| </tr> | |
| <tr> | |
| <td>Depth-Anything-V2</td> | |
| <td><strong>180.46</strong></td> | |
| <td>0.367</td> | |
| <td>0.554</td> | |
| <td>0.135</td> | |
| <td>0.822</td> | |
| <td>0.140</td> | |
| <td>0.804</td> | |
| <td>0.106</td> | |
| <td>0.921</td> | |
| </tr> | |
| <tr> | |
| <td>DepthCrafter previous</td> | |
| <td>1913.92</td> | |
| <td><u>0.292</u></td> | |
| <td><strong>0.697</strong></td> | |
| <td><u>0.125</u></td> | |
| <td><u>0.848</u></td> | |
| <td><u>0.110</u></td> | |
| <td><u>0.881</u></td> | |
| <td><u>0.075</u></td> | |
| <td><u>0.971</u></td> | |
| </tr> | |
| <tr> | |
| <td>DepthCrafter v1.0.1</td> | |
| <td><u>465.84</u></td> | |
| <td><strong>0.270</strong></td> | |
| <td><strong>0.697</strong></td> | |
| <td><strong>0.123</strong></td> | |
| <td><strong>0.856</strong></td> | |
| <td><strong>0.104</strong></td> | |
| <td><strong>0.896</strong></td> | |
| <td><strong>0.071</strong></td> | |
| <td><strong>0.972</strong></td> | |
| </tr> | |
| </tbody> | |
| </table> | |
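The two accuracy columns in the table can be illustrated with a short NumPy sketch. AbsRel is the mean of `|pred - gt| / gt` over valid pixels, and δ₁ is the fraction of valid pixels where `max(pred / gt, gt / pred) < 1.25`. The function names below are hypothetical, and the least-squares scale-and-shift alignment is an assumption about how relative predictions are commonly compared with ground truth; it is not taken from this repo's evaluation scripts.

```python
import numpy as np


def align_scale_shift(pred, gt, mask):
    """Fit scale s and shift t so that s * pred + t matches gt on valid pixels.

    Assumption: a plain least-squares fit; the repo's scripts may align differently.
    """
    p, g = pred[mask], gt[mask]
    A = np.stack([p, np.ones_like(p)], axis=1)
    (scale, shift), *_ = np.linalg.lstsq(A, g, rcond=None)
    return scale * pred + shift


def abs_rel(pred, gt, mask):
    """Mean absolute relative error over valid pixels (lower is better)."""
    return float(np.mean(np.abs(pred[mask] - gt[mask]) / gt[mask]))


def delta1(pred, gt, mask, threshold=1.25):
    """Fraction of valid pixels whose ratio to ground truth is below the threshold."""
    ratio = np.maximum(pred[mask] / gt[mask], gt[mask] / pred[mask])
    return float(np.mean(ratio < threshold))


if __name__ == "__main__":
    gt = np.array([1.0, 2.0, 4.0, 8.0])
    pred = np.array([1.1, 2.0, 4.0, 12.0])
    valid = gt > 0  # ignore pixels without ground truth
    print(abs_rel(pred, gt, valid), delta1(pred, gt, valid))
```

In practice the aligned prediction would also need to be clipped to positive values before computing the ratios.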
| ## Visualization | |
| We provide demos of unprojected point cloud sequences, with reference RGB and estimated depth videos. | |
| For more details, please refer to our [project page](https://depthcrafter.github.io). | |
| https://github.com/user-attachments/assets/62141cc8-04d0-458f-9558-fe50bc04cc21 | |
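Unprojecting a depth map into a point cloud, as in the demos above, can be sketched with a pinhole camera model. This is a minimal illustration, not the repo's visualization code: the function name `unproject_depth` and the intrinsics `fx`, `fy`, `cx`, `cy` are assumptions, and the README does not state which camera parameters or depth scale the demos use.

```python
import numpy as np


def unproject_depth(depth, fx, fy, cx, cy):
    """Turn an (H, W) depth map into an (H*W, 3) array of camera-space points.

    Assumes a pinhole camera with focal lengths (fx, fy) and principal
    point (cx, cy), all in pixels. These values are hypothetical inputs.
    """
    h, w = depth.shape
    # u indexes columns (x direction), v indexes rows (y direction)
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    x = (u - cx) / fx * depth
    y = (v - cy) / fy * depth
    return np.stack([x, y, depth], axis=-1).reshape(-1, 3)


if __name__ == "__main__":
    depth = np.full((2, 3), 2.0)
    points = unproject_depth(depth, fx=1.0, fy=1.0, cx=1.0, cy=0.5)
    print(points.shape)
```

Applying this per frame gives a point cloud sequence; the points are only as metrically meaningful as the depth scale and intrinsics supplied.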
| ## Quick Start | |
| ### Gradio Demo | |
| - Online demo: [DepthCrafter](https://huggingface.co/spaces/tencent/DepthCrafter) | |
| - Local demo: | |
| ```bash | |
| gradio app.py | |
| ``` | |
| ### Community Support | |
| - [NukeDepthCrafter](https://github.com/Theo-SAMINADIN-td/NukeDepthCrafter): | |
| a plugin that lets you generate temporally consistent depth sequences inside Nuke, | |
| which is widely used in the VFX industry. | |
| - [ComfyUI-Nodes](https://github.com/akatz-ai/ComfyUI-DepthCrafter-Nodes): nodes for creating consistent depth maps for your videos using DepthCrafter in ComfyUI. | |
| ### Installation | |
| 1. Clone this repo: | |
| ```bash | |
| git clone https://github.com/Tencent/DepthCrafter.git | |
| ``` | |
| 2. Install dependencies: | |
| ```bash | |
| cd DepthCrafter | |
| uv venv | |
| source .venv/bin/activate | |
| uv sync | |
| uv pip list | |
| ``` | |
| ### Model Zoo | |
| [DepthCrafter](https://huggingface.co/tencent/DepthCrafter) is available on the Hugging Face Model Hub. | |
| ### Inference | |
| #### 1. High-resolution inference requires a GPU with ~26GB memory for 1024x576 resolution: | |
| - ~2.1 fps on A100, recommended for high-quality results: | |
| ```bash | |
| python run.py --video-path examples/example_01.mp4 | |
| ``` | |
| #### 2. Low-resolution inference requires a GPU with ~9GB memory for 512x256 resolution: | |
| - ~8.6 fps on A100: | |
| ```bash | |
| python run.py --video-path examples/example_01.mp4 --max-res 512 | |
| ``` | |
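The two resolutions quoted above (1024x576 and 512x256) are consistent with scaling a 16:9 clip so that its longer side does not exceed `--max-res`, then snapping each side to a multiple of 64. The sketch below illustrates that arithmetic only. The helper `fit_resolution` and its `multiple` parameter are hypothetical, and the snapping rule is an assumption rather than something confirmed from `run.py`.

```python
def fit_resolution(width, height, max_res, multiple=64):
    """Scale (width, height) so the longer side is at most max_res,
    then round each side to the nearest multiple of `multiple`.

    Hypothetical helper: the actual resizing logic may differ.
    Note that Python's round() resolves exact ties to the even integer.
    """
    long_side = max(width, height)
    target = min(max_res, long_side)  # never upscale
    new_w = max(multiple, round(width * target / long_side / multiple) * multiple)
    new_h = max(multiple, round(height * target / long_side / multiple) * multiple)
    return new_w, new_h


if __name__ == "__main__":
    for max_res in (1024, 512):
        print(max_res, fit_resolution(1920, 1080, max_res))
```

For a 1920x1080 input, the shorter side at `max_res=512` scales to 288, which sits exactly halfway between 256 and 320; under the tie-to-even rounding assumed here it lands on 256.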
| ## Dataset Evaluation | |
| Please check the `benchmark` folder. | |
| - To create the datasets used in the paper, run `dataset_extract/dataset_extract_${dataset_name}.py`. | |
| - This produces `csv` files that record the relative paths of the extracted RGB videos and depth `npz` files. We also provide these `csv` files. | |
| - Run inference on all datasets: | |
| ```bash | |
| bash benchmark/infer/infer.sh | |
| ``` | |
| (Remember to replace `input_rgb_root` and `saved_root` with your own paths.) | |
| - Run evaluation on all datasets: | |
| ```bash | |
| bash benchmark/eval/eval.sh | |
| ``` | |
| (Remember to replace `pred_disp_root` and `gt_disp_root` with your own paths.) | |
| ## Contributing | |
| - Issues and pull requests are welcome. | |
| - Contributions that improve inference speed and memory usage, e.g., through model quantization, distillation, or other acceleration techniques, are also welcome. | |
| ### Contributors | |
| <a href="https://github.com/Tencent/DepthCrafter/graphs/contributors"> | |
| <img src="https://contrib.rocks/image?repo=Tencent/DepthCrafter" /> | |
| </a> | |
| ## Testing | |
| We provide comprehensive unit tests to ensure code quality and reliability. | |
| ### Running Tests | |
| 1. **Run all tests**: | |
| ```bash | |
| pytest unit_tests/ | |
| ``` | |
| 2. **Run tests with verbose output**: | |
| ```bash | |
| pytest unit_tests/ -v | |
| ``` | |
| 3. **Run specific test file**: | |
| ```bash | |
| pytest unit_tests/test_depth_crafter_ppl.py | |
| ``` | |
| ### Test Structure | |
| - `unit_tests/test_depth_crafter_ppl.py`: Tests for the main depth estimation pipeline | |
| - `unit_tests/test_inference.py`: Tests for the inference interface | |
| - `unit_tests/test_utils.py`: Tests for utility functions | |
| - `unit_tests/test_unet.py`: Tests for the UNet model | |
| ### Requirements | |
| - GPU with CUDA support is required for `test_pipeline_gpu_integration` | |
| - Tests use small tensor sizes to minimize memory usage | |
| - All heavy computations are mocked for fast execution | |
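Mocking heavy computation, as the last requirement describes, can look like the following sketch. It uses only the standard library's `unittest.mock`; the wrapper `run_depth_estimation` and the test name are hypothetical stand-ins and do not come from the files in `unit_tests/`.

```python
from unittest import mock


def run_depth_estimation(pipeline, frames):
    """Hypothetical wrapper: run the pipeline and check one result per frame."""
    result = pipeline(frames)
    if len(result) != len(frames):
        raise ValueError("pipeline must return one depth map per frame")
    return result


def test_run_depth_estimation_with_mocked_pipeline():
    frames = ["frame0", "frame1", "frame2"]
    # The mock stands in for the expensive model call, so no GPU is needed.
    fake_pipeline = mock.Mock(return_value=["d0", "d1", "d2"])
    assert run_depth_estimation(fake_pipeline, frames) == ["d0", "d1", "d2"]
    fake_pipeline.assert_called_once_with(frames)


if __name__ == "__main__":
    test_run_depth_estimation_with_mocked_pipeline()
    print("ok")
```

A test written this way exercises the surrounding logic (argument passing, shape checks) without loading model weights, which keeps the suite fast.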
| ## Star History | |
| [Star History Chart](https://star-history.com/#Tencent/DepthCrafter&Date) | |
| ## Citation | |
| If you find this work helpful, please consider citing: | |
| ```BibTeX | |
| @inproceedings{hu2025-DepthCrafter, | |
| author = {Hu, Wenbo and Gao, Xiangjun and Li, Xiaoyu and Zhao, Sijie and Cun, Xiaodong and Zhang, Yong and Quan, Long and Shan, Ying}, | |
| title = {DepthCrafter: Generating Consistent Long Depth Sequences for Open-world Videos}, | |
| booktitle = {CVPR}, | |
| year = {2025} | |
| } | |
| ``` | |