update README with archive-based dataset layout
README.md
CHANGED
|
@@ -11,66 +11,84 @@ tags:
|
|
| 11 |
|
| 12 |
# LiDAR-Perfect Depth (LPD)
|
| 13 |
|
| 14 |
-
Implementation of *LiDAR-Perfect Depth: Score-Decomposed Diffusion with Kalman-in-the-Loop Denoising for Sparse-Prompted Depth* on top of
|
| 15 |
|
| 16 |
## What's in this repo
|
| 17 |
|
| 18 |
-
- `code/`: full
|
| 19 |
- `checkpoints/`
|
| 20 |
-
  - `e000-s001000.ckpt` … `e004-s005000.ckpt`: per-epoch
|
| 21 |
-
  - `last.ckpt`:
|
| 22 |
-
-
|
| 23 |
-
- `inference_vis/`: 8-sample qualitative
|
| 24 |
|
| 25 |
## Training run summary
|
| 26 |
|
| 27 |
```
|
| 28 |
-
Backbone:
|
| 29 |
-
Trainable:
|
| 30 |
-
Resolution:
|
| 31 |
-
Batch
|
| 32 |
-
Steps:
|
| 33 |
-
Mix
|
| 34 |
-
Init:
|
| 35 |
-
|
| 36 |
-
|
| 37 |
```
|
| 38 |
|
| 39 |
-
The
|
| 40 |
|
| 41 |
-
## Verification
|
| 42 |
|
| 43 |
```bash
|
| 44 |
cd code/
|
| 45 |
-
|
| 46 |
-
# 30 paper claims
|
| 47 |
```
|
| 48 |
|
| 49 |
-
|
| 50 |
|
| 51 |
-
##
|
| 52 |
|
| 53 |
```bash
|
| 54 |
-
|
| 55 |
-
|
| 56 |
-
ln -sf <
|
| 57 |
-
ln -sf <DA-V2 ViT-L ckpt> code/checkpoints/depth_anything_v2_vitl.pth
|
| 58 |
|
| 59 |
-
#
|
| 60 |
bash code/train_lpd.sh
|
| 61 |
|
| 62 |
-
#
|
| 63 |
python code/main.py --cfg_file code/ppd/configs/lpd_run5d_10k.yaml
|
| 64 |
|
| 65 |
-
# Inference comparison
|
| 66 |
python code/experiments/eval_lpd_vs_ppd.py
|
| 67 |
```
|
| 68 |
|
| 69 |
-
## Datasets
|
| 70 |
|
| 71 |
-
|
| 72 |
|
| 73 |
## Citations
|
| 74 |
|
| 75 |
-
- Pixel-Perfect Depth: Xu et al., NeurIPS 2025
|
| 76 |
-
- LiDAR-Perfect Depth: paper.tex
|
|
|
|
| 11 |
|
| 12 |
# LiDAR-Perfect Depth (LPD)
|
| 13 |
|
| 14 |
+
Implementation of *LiDAR-Perfect Depth: Score-Decomposed Diffusion with Kalman-in-the-Loop Denoising for Sparse-Prompted Depth* on top of [Pixel-Perfect Depth](https://github.com/gangweix/pixel-perfect-depth).
|
| 15 |
+
|
| 16 |
+
## Repos
|
| 17 |
+
|
| 18 |
+
| Repo | Contents | Size |
|
| 19 |
+
|---|---|---|
|
| 20 |
+
| `chenming-wu/LiDAR-Perfect-Depth` (this) | code + 6 LPD checkpoints + inference vis + extraction helper | 12.5 GB |
|
| 21 |
+
| [`chenming-wu/LiDAR-Perfect-Depth-Datasets`](https://huggingface.co/datasets/chenming-wu/LiDAR-Perfect-Depth-Datasets) | extracted eval sets + training-set archives + PPD/DA-V2/RAFT weights | ~993 GB |
|
| 22 |
|
| 23 |
## What's in this repo
|
| 24 |
|
| 25 |
+
- `code/`: full LPD codebase. New modules under `ppd/lpd/`; updated configs and adapter loaders.
|
| 26 |
- `checkpoints/`
|
| 27 |
+
  - `e000-s001000.ckpt` … `e004-s005000.ckpt`: per-epoch fine-tuned weights
|
| 28 |
+
  - `last.ckpt`: same as e004
|
| 29 |
+
  - 2.0 GB each, weights-only; load with `pipeline.dit.load_state_dict(strip_prefix("pipeline.", ckpt["state_dict"]), strict=False)`
|
| 30 |
+
- `inference_vis/`: 8-sample qualitative panels (RGB | GT | PPD | LPD | LPD-variance)
|
| 31 |
+
- `extract_archives.sh`: extracts the dataset archives back into the layout the code expects
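
The checkpoint-loading one-liner in the list above relies on a `strip_prefix` helper that this README does not show. A minimal sketch of what such a helper might look like, assuming it simply drops a leading key prefix (the argument order follows the call in the README; the repository's own implementation may differ):

```python
from collections import OrderedDict


def strip_prefix(prefix, state_dict):
    """Hypothetical helper: drop `prefix` from every key that starts with it.

    Keys without the prefix are kept unchanged. This is an assumption about
    the behaviour, not the repository's actual implementation.
    """
    stripped = OrderedDict()
    for key, value in state_dict.items():
        new_key = key[len(prefix):] if key.startswith(prefix) else key
        stripped[new_key] = value
    return stripped


# Toy example with plain values standing in for tensors.
example = {"pipeline.dit.blocks.0.weight": 1, "optimizer_step": 2}
print(strip_prefix("pipeline.", example))
```

Because `strict=False` makes `load_state_dict` skip mismatched keys silently, it can be worth inspecting the `missing_keys` and `unexpected_keys` fields of the value it returns.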
|
| 32 |
|
| 33 |
## Training run summary
|
| 34 |
|
| 35 |
```
|
| 36 |
+
Backbone: PPD (DA-V2 ViT-L semantics) - 820 M, frozen
|
| 37 |
+
Trainable: 16 M (sparse-prompt encoder + gate)
|
| 38 |
+
Resolution: 1024 × 768
|
| 39 |
+
Batch: 18 (~133 GB peak on a single H200)
|
| 40 |
+
Steps: 5,000 (5 epochs × 1000 batches)
|
| 41 |
+
Mix: Hypersim 0.5 / UrbanSyn 0.15 / UnrealStereo4K 0.15 / VKITTI2 0.1 / TartanAir 0.1
|
| 42 |
+
Init: gangweix/Pixel-Perfect-Depth ppd.pth
|
| 43 |
+
|
| 44 |
+
epoch 0 → 4 loss: 0.0186 → 0.0177 (-4.8%)
|
| 45 |
```
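
The figures in the summary can be cross-checked with a few lines of arithmetic. The sketch below only re-derives the step count, the sum of the dataset mix weights, and the relative loss change from the numbers quoted above; the variable names are illustrative and do not come from the codebase.

```python
import math

# Values copied from the training run summary above.
mix = {
    "Hypersim": 0.5,
    "UrbanSyn": 0.15,
    "UnrealStereo4K": 0.15,
    "VKITTI2": 0.1,
    "TartanAir": 0.1,
}
epochs, batches_per_epoch = 5, 1000
loss_first, loss_last = 0.0186, 0.0177

total_steps = epochs * batches_per_epoch
mix_total = sum(mix.values())
loss_change_pct = (loss_last - loss_first) / loss_first * 100.0

print(f"total steps: {total_steps}")
print(f"mix sums to 1: {math.isclose(mix_total, 1.0)}")
print(f"loss change: {loss_change_pct:.1f}%")
```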
|
| 46 |
|
| 47 |
+
The 5,000 steps shown here are a partial-training demo; paper-scale training would go much further.
|
| 48 |
|
| 49 |
+
## Verification
|
| 50 |
|
| 51 |
```bash
|
| 52 |
cd code/
|
| 53 |
+
pip install -r requirements.txt
|
| 54 |
+
python -m ppd.lpd.tests.verify_paper # 30 paper claims, all pass
|
| 55 |
```
|
| 56 |
|
| 57 |
+
`PAPER_CHECKLIST.md` maps each section of the paper to specific code files/lines.
|
| 58 |
|
| 59 |
+
## Reproducing training
|
| 60 |
|
| 61 |
```bash
|
| 62 |
+
# Pretrained inputs the code expects
|
| 63 |
+
ln -sf <ppd.pth> code/checkpoints/ppd.pth
|
| 64 |
+
ln -sf <depth_anything_v2_vitl.pth> code/checkpoints/depth_anything_v2_vitl.pth
|
| 65 |
|
| 66 |
+
# Hypersim 512² pretrain
|
| 67 |
bash code/train_lpd.sh
|
| 68 |
|
| 69 |
+
# 5-dataset 1024×768 fine-tune (this run's recipe)
|
| 70 |
python code/main.py --cfg_file code/ppd/configs/lpd_run5d_10k.yaml
|
| 71 |
|
| 72 |
+
# Inference comparison (PPD vs LPD)
|
| 73 |
python code/experiments/eval_lpd_vs_ppd.py
|
| 74 |
```
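
Since the `ln -sf` commands above take placeholder sources, a mistyped path leaves a dangling symlink that only fails once training starts. A small pre-flight check along these lines could catch that earlier. This is a sketch: `missing_inputs` is a hypothetical helper, not part of the codebase, and the two paths are the link targets from the commands above.

```python
import os

# Paths taken from the `ln -sf` commands above, relative to the repo root.
EXPECTED_INPUTS = (
    "code/checkpoints/ppd.pth",
    "code/checkpoints/depth_anything_v2_vitl.pth",
)


def missing_inputs(repo_root="."):
    """Hypothetical pre-flight check: list expected inputs that do not resolve.

    os.path.exists follows symlinks, so a dangling link counts as missing.
    """
    return [
        rel for rel in EXPECTED_INPUTS
        if not os.path.exists(os.path.join(repo_root, rel))
    ]


if __name__ == "__main__":
    for rel in missing_inputs():
        print(f"missing or dangling: {rel}")
```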
|
| 75 |
|
| 76 |
+
## Datasets: fetching and extracting
|
| 77 |
+
|
| 78 |
+
The training sets are hosted as **archives** in the dataset repo to keep file counts low and upload throughput high; the eval sets are stored extracted.
|
| 79 |
+
|
| 80 |
+
```bash
|
| 81 |
+
# Fetch the data repo (eval sets stay un-archived; only train/* is in archive form)
|
| 82 |
+
hf download chenming-wu/LiDAR-Perfect-Depth-Datasets --repo-type dataset \
|
| 83 |
+
--local-dir /mnt/sig/datasets
|
| 84 |
+
|
| 85 |
+
# Run the extractor (un-tars/un-zips everything under datasets/train/)
|
| 86 |
+
bash code/extract_archives.sh /mnt/sig/datasets /mnt/sig/datasets/archives
|
| 87 |
+
```
|
| 88 |
|
| 89 |
+
Layout after extraction matches what the LPD configs reference (`/mnt/sig/datasets/train/<scene>/...`, `/mnt/sig/datasets/eval_image/...`).
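
One way to confirm the extraction produced that layout is a quick directory check. The sketch below assumes only the two top-level directories named above (`train/` and `eval_image/`); `check_layout` and `list_scenes` are hypothetical helpers, not functions from the codebase.

```python
from pathlib import Path

# Sub-directories named in the paragraph above.
EXPECTED_SUBDIRS = ("train", "eval_image")


def check_layout(root):
    """Hypothetical helper: map each expected sub-directory to whether it exists."""
    base = Path(root)
    return {name: (base / name).is_dir() for name in EXPECTED_SUBDIRS}


def list_scenes(root):
    """Return the sorted names of the directories directly under train/."""
    train_dir = Path(root) / "train"
    if not train_dir.is_dir():
        return []
    return sorted(p.name for p in train_dir.iterdir() if p.is_dir())
```

Calling `check_layout("/mnt/sig/datasets")` after running the extractor should report both entries as present; an empty result from `list_scenes` would suggest the archives were not unpacked.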
|
| 90 |
|
| 91 |
## Citations
|
| 92 |
|
| 93 |
+
- **Pixel-Perfect Depth**: Xu et al., NeurIPS 2025
|
| 94 |
+
- **LiDAR-Perfect Depth**: `paper.tex` in this repo
|