@@ -11,66 +11,84 @@ tags:
 # LiDAR-Perfect Depth (LPD)
-Implementation of *LiDAR-Perfect Depth: Score-Decomposed Diffusion with Kalman-in-the-Loop Denoising for Sparse-Prompted Depth* on top of the public Pixel-Perfect Depth (PPD) codebase.
 ## What's in this repo
-- `code/` — full repo (LPD additions live under `ppd/lpd/`, plus updated configs and adapter data loaders).
 - `checkpoints/`
-  - `e000-s001000.ckpt` … `e004-s005000.ckpt` — per-epoch checkpoints from the 5-dataset 1024×768 fine-tune
-  - `last.ckpt` — rolling latest (= e004 here)
-  - All 2.0 GB each, weights-only.
-- `inference_vis/` — 8-sample qualitative comparisons (RGB | GT | PPD | LPD | LPD-variance) generated with `experiments/eval_lpd_vs_ppd.py`.
 ## Training run summary
 ```
-Backbone:      PPD (DA-V2 semantics) — 820 M params, frozen
-Trainable:     16 M (sparse_prompt_encoder + prompt_gate)
-Resolution:    1024 × 768
-Batch size:    18 (~119 GB GPU peak on H200)
-Steps:         5,000 (5 epochs × 1000 batches)
-Mix ratios:    Hypersim 0.5 / UrbanSyn 0.15 / UnrealStereo4K 0.15 / VKITTI2 0.1 / TartanAir 0.1
-Init:          official PPD checkpoint (gangweix/Pixel-Perfect-Depth)
-Epoch loss:    e0=0.0186 → e4=0.0177  (-4.8% over 5 epochs)
 ```
-The official paper trains for many more steps; this checkpoint is a partial-train demo to show the pipeline works end-to-end.
-## Verification suite
 ```bash
 cd code/
-python -m ppd.lpd.tests.verify_paper
-# 30 paper claims tested — all should pass.
 ```
-Maps every section of `paper.tex` to a code line. See `PAPER_CHECKLIST.md` for the per-claim table.
-## Reproduction
 ```bash
-pip install -r code/requirements.txt
-# Symlink official weights
-ln -sf <PPD ckpt>             code/checkpoints/ppd.pth
-ln -sf <DA-V2 ViT-L ckpt>     code/checkpoints/depth_anything_v2_vitl.pth
-# Stage 1 — Hypersim 512² pretrain
 bash code/train_lpd.sh
-# Stage 2 — 5-dataset 1024×768 fine-tune (this run's recipe)
 python code/main.py --cfg_file code/ppd/configs/lpd_run5d_10k.yaml
-# Inference comparison
 python code/experiments/eval_lpd_vs_ppd.py
 ```
-## Datasets
-Used the LFS dataset companion: [`chenming-wu/LiDAR-Perfect-Depth-Datasets`](https://huggingface.co/datasets/chenming-wu/LiDAR-Perfect-Depth-Datasets).
 ## Citations
-- Pixel-Perfect Depth — Xu et al., NeurIPS 2025
-- LiDAR-Perfect Depth — paper.tex (anonymous submission)

 # LiDAR-Perfect Depth (LPD)
+Implementation of *LiDAR-Perfect Depth: Score-Decomposed Diffusion with Kalman-in-the-Loop Denoising for Sparse-Prompted Depth* on top of [Pixel-Perfect Depth](https://github.com/gangweix/pixel-perfect-depth).
+## Repos
+| Repo | Contents | Size |
+|---|---|---|
+| `chenming-wu/LiDAR-Perfect-Depth` (this) | code + 6 LPD checkpoints + inference vis + extraction helper | 12.5 GB |
+| [`chenming-wu/LiDAR-Perfect-Depth-Datasets`](https://huggingface.co/datasets/chenming-wu/LiDAR-Perfect-Depth-Datasets) | extracted eval sets + training-set archives + PPD/DA-V2/RAFT weights | ~993 GB |
 ## What's in this repo
+- `code/` — full LPD codebase. New modules under `ppd/lpd/`; updated configs and adapter loaders.
 - `checkpoints/`
+  - `e000-s001000.ckpt` … `e004-s005000.ckpt` — per-epoch fine-tuned weights
+  - `last.ckpt` — same as e004
+  - 2.0 GB each, weights-only; load with `pipeline.dit.load_state_dict(strip_prefix("pipeline.", ckpt["state_dict"]), strict=False)`
+- `inference_vis/` — 8-sample qualitative panels (RGB | GT | PPD | LPD | LPD-variance)
+- `extract_archives.sh` — extracts the dataset archives back into the layout the code expects
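The loading one-liner in the checkpoint notes relies on a `strip_prefix` helper whose definition is not shown here. A minimal sketch of what such a helper might look like, assuming it simply drops a leading `"pipeline."` from each key (the toy keys below are hypothetical and not taken from the real checkpoints):

```python
from collections import OrderedDict


def strip_prefix(prefix, state_dict):
    """Return a copy of state_dict with `prefix` removed from keys that start with it.

    Keys that do not start with the prefix are kept unchanged.
    """
    out = OrderedDict()
    for key, value in state_dict.items():
        new_key = key[len(prefix):] if key.startswith(prefix) else key
        out[new_key] = value
    return out


# Toy stand-in for ckpt["state_dict"]; real values would be tensors.
toy_state = OrderedDict(
    [
        ("pipeline.blocks.0.weight", 1),
        ("pipeline.blocks.0.bias", 2),
        ("unrelated.key", 3),
    ]
)
stripped = strip_prefix("pipeline.", toy_state)
print(list(stripped.keys()))
```

With PyTorch, `load_state_dict(..., strict=False)` returns an object exposing `missing_keys` and `unexpected_keys`. Printing both after loading is a cheap way to confirm the prefix handling matched the model, since `strict=False` otherwise skips mismatched keys without raising.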
 ## Training run summary
 ```
+Backbone:   PPD (DA-V2 ViT-L semantics) — 820 M, frozen
+Trainable:  16 M (sparse-prompt encoder + gate)
+Resolution: 1024 × 768
+Batch:      18  (~133 GB peak on a single H200)
+Steps:      5,000 (5 epochs × 1000 batches)
+Mix:        Hypersim 0.5 / UrbanSyn 0.15 / UnrealStereo4K 0.15 / VKITTI2 0.1 / TartanAir 0.1
+Init:       gangweix/Pixel-Perfect-Depth ppd.pth
+epoch 0 → 4   loss: 0.0186 → 0.0177  (-4.8%)
 ```
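The summary does not say how the mix ratios are applied by the data loader. One plausible reading, sketched below under that assumption, is that each training sample's source dataset is drawn independently with the listed probabilities. The function name and the seeding are illustrative only.

```python
import random

# Mix ratios as listed in the training run summary.
MIX = {
    "Hypersim": 0.5,
    "UrbanSyn": 0.15,
    "UnrealStereo4K": 0.15,
    "VKITTI2": 0.1,
    "TartanAir": 0.1,
}


def sample_sources(n, seed=0):
    """Draw n dataset names, each chosen independently according to MIX."""
    rng = random.Random(seed)  # local RNG so the global random state is untouched
    names = list(MIX)
    weights = [MIX[name] for name in names]
    return rng.choices(names, weights=weights, k=n)


# One batch of 18 samples, matching the batch size in the summary.
batch_sources = sample_sources(18)
print(batch_sources)
```

Under this reading, individual batches only match the ratios in expectation; a loader that fixes per-batch counts would behave differently.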
+This 5,000-step run is a partial-training demo; paper-scale training would run for many more steps.
+## Verification
 ```bash
 cd code/
+pip install -r requirements.txt
+python -m ppd.lpd.tests.verify_paper      # 30 paper claims, all pass
 ```
+`PAPER_CHECKLIST.md` maps each section of the paper to specific code files/lines.
+## Reproducing training
 ```bash
+# Pretrained inputs the code expects
+ln -sf <ppd.pth>                          code/checkpoints/ppd.pth
+ln -sf <depth_anything_v2_vitl.pth>       code/checkpoints/depth_anything_v2_vitl.pth
+# Hypersim 512² pretrain
 bash code/train_lpd.sh
+# 5-dataset 1024×768 fine-tune (this run's recipe)
 python code/main.py --cfg_file code/ppd/configs/lpd_run5d_10k.yaml
+# Inference comparison (PPD vs LPD)
 python code/experiments/eval_lpd_vs_ppd.py
 ```
+## Datasets — fetching + extracting
+Training sets are hosted as **archives** in the dataset repo to keep the file count low and uploads fast; eval sets are stored already extracted.
+```bash
+# Fetch the data repo (eval sets are stored extracted; only train/* is archived)
+hf download chenming-wu/LiDAR-Perfect-Depth-Datasets --repo-type dataset \
+    --local-dir /mnt/sig/datasets
+# Run the extractor (un-tars/un-zips everything under datasets/train/)
+bash code/extract_archives.sh /mnt/sig/datasets /mnt/sig/datasets/archives
+```
+Layout after extraction matches what the LPD configs reference (`/mnt/sig/datasets/train/<scene>/...`, `/mnt/sig/datasets/eval_image/...`).
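A quick sanity check after extraction can catch a wrong target directory before training starts. The sketch below only checks that the two top-level directories named above exist and are non-empty. The helper name is hypothetical, and the deeper per-scene structure is not verified because it is not described here.

```python
import tempfile
from pathlib import Path

# Top-level sub-directories referenced by the configs; deeper structure is not checked.
EXPECTED_SUBDIRS = ("train", "eval_image")


def missing_subdirs(root):
    """Return the expected sub-directories that are absent or empty under root."""
    root = Path(root)
    missing = []
    for name in EXPECTED_SUBDIRS:
        path = root / name
        if not path.is_dir() or not any(path.iterdir()):
            missing.append(name)
    return missing


# Self-contained demo on a throwaway directory (not the real dataset root).
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "train" / "some_scene").mkdir(parents=True)
    demo_missing = missing_subdirs(tmp)

print(demo_missing)
```

On a real machine the call would be `missing_subdirs("/mnt/sig/datasets")`, and an empty list means both directories are present and populated.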
 ## Citations
+- **Pixel-Perfect Depth** — Xu et al., NeurIPS 2025
+- **LiDAR-Perfect Depth** — `paper.tex` in this repo