# LiDAR-Perfect Depth (LPD)

Implementation of *LiDAR-Perfect Depth: Score-Decomposed Diffusion with Kalman-in-the-Loop Denoising for Sparse-Prompted Depth*, built on top of Pixel-Perfect Depth.
## Repos
| Repo | Contents | Size |
|---|---|---|
| chenming-wu/LiDAR-Perfect-Depth (this) | code + 6 LPD-DA2 checkpoints + inference vis + extraction helper | 12.5 GB |
| chenming-wu/LiDAR-Perfect-Depth-Datasets | extracted eval sets + training-set archives + PPD/DA-V2/RAFT weights | ~991 GB |
## What's in this repo
- `code/` – full LPD codebase. New modules under `ppd/lpd/`; updated configs and adapter loaders.
- `checkpoints/` – `e000-s001000.ckpt` … `e004-s005000.ckpt`: per-epoch fine-tuned weights (DA2 backbone, 5K steps); `last.ckpt` is the same as e004. Each is 2.0 GB, weights-only.
- `inference_vis/` – 8-sample qualitative panels (RGB | GT | PPD | LPD | LPD-variance)
- `extract_archives.sh` – extracts the dataset archives back into the layout the code expects
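To poke at one of the released checkpoints, a minimal sketch like the following works, assuming the `.ckpt` files are plain PyTorch state dicts (possibly wrapped under a `state_dict` key); adjust to whatever structure you actually find:

```python
# Inspect one of the per-epoch checkpoints. Whether the file is a bare
# state dict or wraps one under a "state_dict" key is an assumption.
import torch

ckpt = torch.load("checkpoints/e004-s005000.ckpt",
                  map_location="cpu", weights_only=True)
state = ckpt.get("state_dict", ckpt)
for name, tensor in list(state.items())[:10]:   # first few entries
    print(f"{name:60s} {tuple(tensor.shape)}")
```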
## Backbone options
PPD (and therefore LPD) supports two semantic-prompt backbones; switching is a single config change.
| Backbone | Trainable params | PPD ckpt | Config |
|---|---|---|---|
| DA2 (Depth-Anything-V2 ViT-L) | 16.3 M | gangweix/Pixel-Perfect-Depth/ppd.pth | code/ppd/configs/lpd_run5d_10k.yaml |
| MoGe2 | 16.3 M | chenming-wu/LiDAR-Perfect-Depth-Datasets/pretrained/ppd_moge2/ | code/ppd/configs/lpd_run5d_moge2.yaml |
The official PPD MoGe2 release reports a 20–30% improvement over DA2 on zero-shot benchmarks; the LPD prompt branch is identical for either backbone.
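Since the prompt branch is shared across backbones, here is a rough picture of what a "sparse-prompt encoder + gate" of this kind can look like. This is an illustrative assumption, not the repo's actual module: the layer shapes, the two-channel (sparse depth + validity mask) input, and the sigmoid gate are all placeholders.

```python
# Illustrative sparse-prompt encoder + gate; architecture details are
# assumptions, not LPD's actual implementation.
import torch
import torch.nn as nn

class SparsePromptGate(nn.Module):
    def __init__(self, feat_dim: int = 256):
        super().__init__()
        # Encode a sparse depth map (zeros where no LiDAR return) + validity mask.
        self.encoder = nn.Sequential(
            nn.Conv2d(2, feat_dim, 3, padding=1), nn.GELU(),
            nn.Conv2d(feat_dim, feat_dim, 3, padding=1),
        )
        # Per-pixel gate deciding how much prompt signal to inject.
        self.gate = nn.Sequential(nn.Conv2d(feat_dim, 1, 1), nn.Sigmoid())

    def forward(self, backbone_feat, sparse_depth, valid_mask):
        prompt = self.encoder(torch.cat([sparse_depth, valid_mask], dim=1))
        return backbone_feat + self.gate(prompt) * prompt

feat = torch.randn(1, 256, 48, 64)    # stand-in backbone feature map
sparse = torch.zeros(1, 1, 48, 64)    # sparse LiDAR depth, metres
mask = (sparse > 0).float()
print(SparsePromptGate()(feat, sparse, mask).shape)  # (1, 256, 48, 64)
```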
## Training run summary (DA2 backbone, 5K steps)
- Backbone: PPD-DA2 (820 M, frozen)
- Trainable: 16 M (sparse-prompt encoder + gate)
- Resolution: 1024 × 768
- Batch: 18 (~133 GB peak on a single H200)
- Steps: 5,000 (5 epochs × 1,000 batches)
- Mix: Hypersim 0.5 / UrbanSyn 0.15 / UnrealStereo4K 0.15 / VKITTI2 0.1 / TartanAir 0.1 (see the sampling sketch after this list)
- Init: gangweix/Pixel-Perfect-Depth `ppd.pth`
- Loss, epoch 0 → 4: 0.0186 → 0.0177 (−4.8%)
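A minimal sketch of how a fixed mix like the one above can be sampled per batch; the weights come from this README, but the sampling mechanics (per-sample source draw, loader names) are assumptions:

```python
# Weighted per-sample dataset selection for a mixed training batch.
import random

MIX = {
    "Hypersim": 0.5,
    "UrbanSyn": 0.15,
    "UnrealStereo4K": 0.15,
    "VKITTI2": 0.1,
    "TartanAir": 0.1,
}

def sample_batch_sources(batch_size: int = 18) -> list[str]:
    names, weights = zip(*MIX.items())
    return random.choices(names, weights=weights, k=batch_size)

print(sample_batch_sources())
```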
The 5K steps shown here are a partial-train demo; paper-scale training would push much further.
For the MoGe2 backbone we verified that the full forward + backward + checkpointing pipeline runs
(`lpd_run5d_moge2.yaml`); training to convergence is left for a multi-GPU run.
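For concreteness, that pipeline check amounts to a smoke test of roughly the following shape (a sketch with a stand-in model, not the LPD trainables or the repo's training loop):

```python
# Smoke test: one forward pass, one backward pass, one checkpoint save.
# The Conv2d below is a placeholder for the actual trainable modules.
import torch
import torch.nn as nn

model = nn.Conv2d(3, 1, 3, padding=1)
opt = torch.optim.AdamW(model.parameters(), lr=1e-4)

x = torch.randn(2, 3, 64, 64)        # dummy RGB batch
loss = model(x).abs().mean()          # dummy objective
loss.backward()                       # backward works
opt.step(); opt.zero_grad()
torch.save(model.state_dict(), "/tmp/smoke.ckpt")  # checkpointing works
print("forward + backward + checkpoint OK, loss =", float(loss))
```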
## Verification
```bash
cd code/
pip install -r requirements.txt
python -m ppd.lpd.tests.verify_paper   # 30 paper claims, all pass
```
`PAPER_CHECKLIST.md` maps each section of the paper to specific code files/lines.
## Reproducing training
```bash
# Pretrained inputs the code expects
ln -sf <ppd.pth> code/checkpoints/ppd.pth                                        # DA2
ln -sf <depth_anything_v2_vitl.pth> code/checkpoints/depth_anything_v2_vitl.pth
# OR for MoGe2:
ln -sf <ppd_moge2.pth> code/checkpoints/ppd_moge2.pth
ln -sf <moge2.pt> code/checkpoints/moge2.pt

# Hypersim 512² pretrain (DA2)
bash code/train_lpd.sh

# 5-dataset 1024×768 fine-tune
python code/main.py --cfg_file code/ppd/configs/lpd_run5d_10k.yaml     # DA2
# OR
python code/main.py --cfg_file code/ppd/configs/lpd_run5d_moge2.yaml   # MoGe2

# Inference comparison (PPD vs LPD)
python code/experiments/eval_lpd_vs_ppd.py
```
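The comparison script's exact metric set isn't listed here, but AbsRel and δ₁ are the usual headline numbers for this kind of eval. A self-contained sketch, assuming metric-scale predictions and a `gt > 0` validity mask:

```python
# Standard monocular-depth metrics: absolute relative error and the
# fraction of pixels whose pred/gt ratio is within 1.25 (delta_1).
import numpy as np

def abs_rel(pred: np.ndarray, gt: np.ndarray) -> float:
    m = gt > 0                                   # valid-depth mask
    return float(np.mean(np.abs(pred[m] - gt[m]) / gt[m]))

def delta1(pred: np.ndarray, gt: np.ndarray) -> float:
    m = gt > 0
    ratio = np.maximum(pred[m] / gt[m], gt[m] / pred[m])
    return float(np.mean(ratio < 1.25))

# Toy usage with synthetic depth maps.
gt = np.random.rand(480, 640) * 10 + 0.1
pred = gt * 1.05
print(abs_rel(pred, gt), delta1(pred, gt))
```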
## Datasets: fetching + extracting
Everything is hosted as archives in the dataset repo to keep file counts low and upload bandwidth high.
```bash
hf download chenming-wu/LiDAR-Perfect-Depth-Datasets --repo-type dataset \
    --local-dir /mnt/sig/datasets
bash code/extract_archives.sh /mnt/sig/datasets /mnt/sig/datasets/archives
```
The layout after extraction matches what the LPD configs reference (`/mnt/sig/datasets/train/<scene>/...`, `/mnt/sig/datasets/eval_image/...`).
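A quick way to confirm extraction landed where the configs expect (a sketch; scene names under `train/` vary and are not enumerated here):

```python
# Sanity-check the top-level dataset layout after extraction.
from pathlib import Path

root = Path("/mnt/sig/datasets")
for sub in ("train", "eval_image"):
    p = root / sub
    print(p, "OK" if p.is_dir() else "MISSING")
```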
## Citations
- Pixel-Perfect Depth – Xu et al., NeurIPS 2025
- LiDAR-Perfect Depth – `paper.tex` in this repo