---
license: mit
tags:
- depth-estimation
- diffusion
- monocular-depth
- lidar-prompted
- pixel-perfect-depth
- pytorch
---

# LiDAR-Perfect Depth (LPD)

Implementation of *LiDAR-Perfect Depth: Score-Decomposed Diffusion with Kalman-in-the-Loop Denoising for Sparse-Prompted Depth* on top of the public Pixel-Perfect Depth (PPD) codebase.

## What's in this repo

- `code/` — full repo (LPD additions live under `ppd/lpd/`, plus updated configs and adapter data loaders).
- `checkpoints/`
  - `e000-s001000.ckpt` … `e004-s005000.ckpt` — per-epoch checkpoints from the 5-dataset 1024×768 fine-tune
  - `last.ckpt` — rolling latest (= e004 here)
  - Each checkpoint is 2.0 GB, weights-only (see the loading sketch below).
- `inference_vis/` — 8-sample qualitative comparisons (RGB | GT | PPD | LPD | LPD-variance) generated with `experiments/eval_lpd_vs_ppd.py`.
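
The checkpoints should open as ordinary PyTorch checkpoint files. A minimal loading/inspection sketch (assumption: a Lightning-style file whose weights sit under a `state_dict` key; adjust if yours is a bare state dict):

```python
import torch

# Load on CPU; weights-only checkpoints carry no optimizer state.
ckpt = torch.load("checkpoints/last.ckpt", map_location="cpu", weights_only=False)

# Lightning-style checkpoints nest the weights under "state_dict";
# fall back to the raw object if this one is a bare state dict.
state = ckpt.get("state_dict", ckpt)

print(f"{len(state)} tensors")
for name, tensor in list(state.items())[:5]:
    print(name, tuple(tensor.shape), tensor.dtype)
```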

## Training run summary

```
Backbone:      PPD (DA-V2 semantics) — 820 M params, frozen
Trainable:     16 M (sparse_prompt_encoder + prompt_gate)
Resolution:    1024 × 768
Batch size:    18 (~119 GB GPU peak on H200)
Steps:         5,000 (5 epochs × 1000 batches)
Mix ratios:    Hypersim 0.5 / UrbanSyn 0.15 / UnrealStereo4K 0.15 / VKITTI2 0.1 / TartanAir 0.1
Init:          official PPD checkpoint (gangweix/Pixel-Perfect-Depth)

Epoch loss:    e0=0.0186 → e4=0.0177  (-4.8% over 5 epochs)
```

The official paper trains for far more steps; these checkpoints are a partial-training demo that shows the pipeline works end-to-end.
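
The mix ratios act as per-sample probabilities of which source dataset a training example is drawn from. A minimal sketch of one way to implement such a mixture (`MixedDataset` and the dataset variables below are illustrative stand-ins, not the repo's actual loader classes):

```python
import random
from torch.utils.data import Dataset

class MixedDataset(Dataset):
    """Draws each sample from one of several datasets with fixed probabilities."""

    def __init__(self, datasets, weights, length):
        assert len(datasets) == len(weights)
        self.datasets = datasets
        self.weights = weights
        self.length = length  # nominal epoch length in samples

    def __len__(self):
        return self.length

    def __getitem__(self, idx):
        # Pick a source dataset according to the mix ratios,
        # then a random item from that dataset.
        ds = random.choices(self.datasets, weights=self.weights, k=1)[0]
        return ds[random.randrange(len(ds))]

# Run recipe above: 1000 batches/epoch at batch size 18 -> 18,000 samples/epoch.
# mixed = MixedDataset([hypersim, urbansyn, unrealstereo4k, vkitti2, tartanair],
#                      weights=[0.5, 0.15, 0.15, 0.1, 0.1], length=18_000)
```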

## Verification suite

```bash
cd code/
python -m ppd.lpd.tests.verify_paper
# 30 paper claims tested β€” all should pass.
```

The suite maps every section of `paper.tex` to a code line; see `PAPER_CHECKLIST.md` for the per-claim table.

## Reproduction

```bash
pip install -r code/requirements.txt
# Symlink official weights
ln -sf <PPD ckpt>             code/checkpoints/ppd.pth
ln -sf <DA-V2 ViT-L ckpt>     code/checkpoints/depth_anything_v2_vitl.pth

# Stage 1 — Hypersim 512² pretrain
bash code/train_lpd.sh

# Stage 2 — 5-dataset 1024×768 fine-tune (this run's recipe)
python code/main.py --cfg_file code/ppd/configs/lpd_run5d_10k.yaml

# Inference comparison
python code/experiments/eval_lpd_vs_ppd.py
```
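
LPD conditions on a sparse depth prompt. If you want to build one by hand for your own data, one simple simulation is to subsample a dense GT depth map into a sparse point map, roughly standing in for a projected LiDAR scan (this sampling scheme is illustrative, not necessarily the repo's):

```python
import torch

def simulate_sparse_prompt(depth_gt: torch.Tensor, n_points: int = 512) -> torch.Tensor:
    """Subsample a dense (H, W) depth map into a sparse prompt.

    Returns an (H, W) map that is zero everywhere except n_points
    randomly chosen valid pixels, which keep their GT depth.
    """
    valid = (depth_gt > 0).nonzero(as_tuple=False)        # (N, 2) coords of valid pixels
    pick = valid[torch.randperm(len(valid))[:n_points]]   # random subset of n_points
    prompt = torch.zeros_like(depth_gt)
    prompt[pick[:, 0], pick[:, 1]] = depth_gt[pick[:, 0], pick[:, 1]]
    return prompt
```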

## Datasets

Training data comes from the LFS dataset companion repo: [`chenming-wu/LiDAR-Perfect-Depth-Datasets`](https://huggingface.co/datasets/chenming-wu/LiDAR-Perfect-Depth-Datasets).
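
To pull the companion datasets locally, the standard Hub snapshot download works (`local_dir` is your choice):

```python
from huggingface_hub import snapshot_download

# Fetches the full LFS dataset repo; expect a large download.
snapshot_download(
    repo_id="chenming-wu/LiDAR-Perfect-Depth-Datasets",
    repo_type="dataset",
    local_dir="data",
)
```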

## Citations

- Pixel-Perfect Depth — Xu et al., NeurIPS 2025
- LiDAR-Perfect Depth — paper.tex (anonymous submission)