LiDAR-Perfect Depth (LPD)

Implementation of LiDAR-Perfect Depth (Score-Decomposed Diffusion with Kalman-in-the-Loop Denoising for Sparse-Prompted Depth), built on top of Pixel-Perfect Depth.

Repos

Repo                                       Contents                                                                Size
chenming-wu/LiDAR-Perfect-Depth (this)     code + 6 LPD-DA2 checkpoints + inference vis + extraction helper       12.5 GB
chenming-wu/LiDAR-Perfect-Depth-Datasets   extracted eval sets + training-set archives + PPD/DA-V2/RAFT weights   ~991 GB

What's in this repo

  • code/ β€” full LPD codebase. New modules under ppd/lpd/; updated configs and adapter loaders.
  • checkpoints/
    • e000-s001000.ckpt … e004-s005000.ckpt β€” per-epoch fine-tuned weights (DA2 backbone, 5K steps)
    • last.ckpt β€” same as e004
    • 2.0 GB each, weights-only
  • inference_vis/ β€” 8-sample qualitative panels (RGB | GT | PPD | LPD | LPD-variance)
  • extract_archives.sh β€” extracts the dataset archives back into the layout the code expects

Backbone options

PPD (and therefore LPD) supports two semantic-prompt backbones; switching between them is a single config change.

Backbone                         Trainable params   PPD ckpt                                                         Config
DA2 (Depth-Anything-V2 ViT-L)    16.3 M             gangweix/Pixel-Perfect-Depth/ppd.pth                             code/ppd/configs/lpd_run5d_10k.yaml
MoGe2                            16.3 M             chenming-wu/LiDAR-Perfect-Depth-Datasets/pretrained/ppd_moge2/   code/ppd/configs/lpd_run5d_moge2.yaml

The official PPD MoGe2 release reports a 20-30% improvement over DA2 on zero-shot benchmarks; the LPD prompt branch is identical for either backbone.
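
To see the switch concretely, one could inspect the two shipped configs; a hedged sketch (the YAML key that names the backbone is hypothetical here, check the files for the real one):

import yaml

# Hypothetical inspection of the two shipped configs; the actual key that
# selects the backbone may be named differently inside the YAML.
for name in ("lpd_run5d_10k", "lpd_run5d_moge2"):
    with open(f"code/ppd/configs/{name}.yaml") as f:
        cfg = yaml.safe_load(f)
    print(name, "->", cfg.get("backbone", "<key name unknown>"))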

Training run summary (DA2 backbone, 5K steps)

Backbone:   PPD-DA2 - 820 M, frozen
Trainable:  16 M (sparse-prompt encoder + gate)
Resolution: 1024 × 768
Batch:      18  (~133 GB peak on a single H200)
Steps:      5,000 (5 epochs × 1,000 batches)
Mix:        Hypersim 0.5 / UrbanSyn 0.15 / UnrealStereo4K 0.15 / VKITTI2 0.1 / TartanAir 0.1
Init:       gangweix/Pixel-Perfect-Depth ppd.pth

epoch 0 → 4   loss: 0.0186 → 0.0177  (-4.8%)

The 5K-step run shown here is a partial-training demo; paper-scale training would push it much further.
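
The dataset mix above is a per-sample weighted draw over five sources; a minimal sketch of that sampling scheme (stand-in names, not the repo's actual data loaders):

import random

# Mixing ratios from the run summary above.
MIX = {
    "hypersim": 0.50,
    "urbansyn": 0.15,
    "unrealstereo4k": 0.15,
    "vkitti2": 0.10,
    "tartanair": 0.10,
}

rng = random.Random(0)
names, weights = zip(*MIX.items())
draws = rng.choices(names, weights=weights, k=1000)
print(draws.count("hypersim") / len(draws))  # ~0.5: half the samples come from Hypersim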

For the MoGe2 backbone, we verified that the full forward + backward + checkpointing pipeline works (lpd_run5d_moge2.yaml); training to convergence is left for a multi-GPU run.
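
As a reference for the frozen-backbone setup described above, a minimal sketch of training only a small adapter on top of frozen weights (module shapes and names are stand-ins; the real modules live under ppd/lpd/):

import torch
import torch.nn as nn

# Stand-in for the 820 M PPD backbone; frozen for the whole run.
backbone = nn.Linear(1024, 1024)
for p in backbone.parameters():
    p.requires_grad = False

# Stand-in for the sparse-prompt encoder + gate (~16 M trainable in LPD).
adapter = nn.Sequential(nn.Linear(1024, 1024), nn.GELU(), nn.Linear(1024, 1024))

# Only adapter parameters reach the optimizer.
optim = torch.optim.AdamW(adapter.parameters(), lr=1e-4)

feats = backbone(torch.randn(2, 1024))  # no grads flow into the backbone weights
loss = adapter(feats).pow(2).mean()
loss.backward()
optim.step()

trainable = sum(p.numel() for p in adapter.parameters() if p.requires_grad)
print(f"trainable: {trainable / 1e6:.2f} M")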

Verification

cd code/
pip install -r requirements.txt
python -m ppd.lpd.tests.verify_paper      # 30 paper claims, all pass

PAPER_CHECKLIST.md maps each section of the paper to specific code files/lines.

Reproducing training

# Pretrained inputs the code expects
ln -sf <ppd.pth>                          code/checkpoints/ppd.pth                 # DA2
ln -sf <depth_anything_v2_vitl.pth>       code/checkpoints/depth_anything_v2_vitl.pth
# OR for MoGe2:
ln -sf <ppd_moge2.pth>                    code/checkpoints/ppd_moge2.pth
ln -sf <moge2.pt>                         code/checkpoints/moge2.pt

# Hypersim 512² pretrain (DA2)
bash code/train_lpd.sh

# 5-dataset 1024Γ—768 fine-tune
python code/main.py --cfg_file code/ppd/configs/lpd_run5d_10k.yaml      # DA2
# OR
python code/main.py --cfg_file code/ppd/configs/lpd_run5d_moge2.yaml    # MoGe2

# Inference comparison (PPD vs LPD)
python code/experiments/eval_lpd_vs_ppd.py
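
The comparison script is in the repo; for orientation, a minimal sketch of two standard depth metrics (AbsRel and δ<1.25) that such a PPD-vs-LPD comparison would typically report (not the script's actual code):

import numpy as np

def abs_rel(pred: np.ndarray, gt: np.ndarray) -> float:
    """Mean absolute relative error over valid (gt > 0) pixels."""
    mask = gt > 0
    return float(np.mean(np.abs(pred[mask] - gt[mask]) / gt[mask]))

def delta1(pred: np.ndarray, gt: np.ndarray) -> float:
    """Fraction of valid pixels with max(pred/gt, gt/pred) < 1.25."""
    mask = gt > 0
    ratio = np.maximum(pred[mask] / gt[mask], gt[mask] / pred[mask])
    return float(np.mean(ratio < 1.25))

gt = np.random.default_rng(0).uniform(1, 10, size=(4, 4))
pred = gt * 1.05  # a uniformly 5%-off prediction
print(abs_rel(pred, gt), delta1(pred, gt))  # 0.05, 1.0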

Datasets - fetching + extracting

Everything is hosted as archives in the dataset repo to keep file counts low and upload bandwidth high.

hf download chenming-wu/LiDAR-Perfect-Depth-Datasets --repo-type dataset \
    --local-dir /mnt/sig/datasets

bash code/extract_archives.sh /mnt/sig/datasets /mnt/sig/datasets/archives

Layout after extraction matches what the LPD configs reference (/mnt/sig/datasets/train/<scene>/..., /mnt/sig/datasets/eval_image/...).
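
A quick post-extraction sanity check, assuming the default /mnt/sig/datasets root (adjust to your --local-dir):

from pathlib import Path

ROOT = Path("/mnt/sig/datasets")
# Top-level directories the LPD configs reference after extraction.
for sub in ("train", "eval_image"):
    path = ROOT / sub
    print(f"{path}: {'ok' if path.is_dir() else 'MISSING'}")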

Citations

  • Pixel-Perfect Depth β€” Xu et al., NeurIPS 2025
  • LiDAR-Perfect Depth β€” paper.tex in this repo