Commit d0edd32 · verified · committed by chenming-wu
1 Parent(s): c0e6e24

update README with archive-based dataset layout

Files changed (1)
  1. README.md +50 -32
README.md CHANGED
@@ -11,66 +11,84 @@ tags:

  # LiDAR-Perfect Depth (LPD)

- Implementation of *LiDAR-Perfect Depth: Score-Decomposed Diffusion with Kalman-in-the-Loop Denoising for Sparse-Prompted Depth* on top of the public Pixel-Perfect Depth (PPD) codebase.

  ## What's in this repo

- - `code/` — full repo (LPD additions live under `ppd/lpd/`, plus updated configs and adapter data loaders).
  - `checkpoints/`
-   - `e000-s001000.ckpt` … `e004-s005000.ckpt` — per-epoch checkpoints from the 5-dataset 1024×768 fine-tune
-   - `last.ckpt` — rolling latest (= e004 here)
-   - All 2.0 GB each, weights-only.
- - `inference_vis/` — 8-sample qualitative comparisons (RGB | GT | PPD | LPD | LPD-variance) generated with `experiments/eval_lpd_vs_ppd.py`.

  ## Training run summary

  ```
- Backbone: PPD (DA-V2 semantics) — 820 M params, frozen
- Trainable: 16 M (sparse_prompt_encoder + prompt_gate)
- Resolution: 1024 × 768
- Batch size: 18 (~119 GB GPU peak on H200)
- Steps: 5,000 (5 epochs × 1000 batches)
- Mix ratios: Hypersim 0.5 / UrbanSyn 0.15 / UnrealStereo4K 0.15 / VKITTI2 0.1 / TartanAir 0.1
- Init: official PPD checkpoint (gangweix/Pixel-Perfect-Depth)
-
- Epoch loss: e0=0.0186 → e4=0.0177 (-4.8% over 5 epochs)
  ```

- The official paper trains for many more steps; this checkpoint is a partial-train demo to show the pipeline works end-to-end.

- ## Verification suite

  ```bash
  cd code/
- python -m ppd.lpd.tests.verify_paper
- # 30 paper claims tested — all should pass.
  ```

- Maps every section of `paper.tex` to a code line. See `PAPER_CHECKLIST.md` for the per-claim table.

- ## Reproduction

  ```bash
- pip install -r code/requirements.txt
- # Symlink official weights
- ln -sf <PPD ckpt> code/checkpoints/ppd.pth
- ln -sf <DA-V2 ViT-L ckpt> code/checkpoints/depth_anything_v2_vitl.pth

- # Stage 1 — Hypersim 512² pretrain
  bash code/train_lpd.sh

- # Stage 2 — 5-dataset 1024×768 fine-tune (this run's recipe)
  python code/main.py --cfg_file code/ppd/configs/lpd_run5d_10k.yaml

- # Inference comparison
  python code/experiments/eval_lpd_vs_ppd.py
  ```

- ## Datasets

- Used the LFS dataset companion: [`chenming-wu/LiDAR-Perfect-Depth-Datasets`](https://huggingface.co/datasets/chenming-wu/LiDAR-Perfect-Depth-Datasets).

  ## Citations

- - Pixel-Perfect Depth — Xu et al., NeurIPS 2025
- - LiDAR-Perfect Depth — paper.tex (anonymous submission)


  # LiDAR-Perfect Depth (LPD)

+ Implementation of *LiDAR-Perfect Depth: Score-Decomposed Diffusion with Kalman-in-the-Loop Denoising for Sparse-Prompted Depth* on top of [Pixel-Perfect Depth](https://github.com/gangweix/pixel-perfect-depth).
+
+ ## Repos
+
+ | Repo | Contents | Size |
+ |---|---|---|
+ | `chenming-wu/LiDAR-Perfect-Depth` (this) | code + 6 LPD checkpoints + inference vis + extraction helper | 12.5 GB |
+ | [`chenming-wu/LiDAR-Perfect-Depth-Datasets`](https://huggingface.co/datasets/chenming-wu/LiDAR-Perfect-Depth-Datasets) | extracted eval sets + training-set archives + PPD/DA-V2/RAFT weights | ~993 GB |

  ## What's in this repo

+ - `code/` — full LPD codebase. New modules under `ppd/lpd/`; updated configs and adapter loaders.
  - `checkpoints/`
+   - `e000-s001000.ckpt` … `e004-s005000.ckpt` — per-epoch fine-tuned weights
+   - `last.ckpt` — same as e004
+   - 2.0 GB each, weights-only; load with `pipeline.dit.load_state_dict(strip_prefix("pipeline.", ckpt["state_dict"]), strict=False)`
+ - `inference_vis/` — 8-sample qualitative panels (RGB | GT | PPD | LPD | LPD-variance)
+ - `extract_archives.sh` — extracts the dataset archives back into the layout the code expects
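
The checkpoint bullet above calls a `strip_prefix` helper that the README does not define. The following is a minimal sketch of what such a helper might look like, assuming it simply removes a leading key prefix from a state dict; the actual helper in `code/` may behave differently (for example, it may drop keys that lack the prefix). The toy keys are invented for illustration, and real values would be tensors.

```python
from collections import OrderedDict
from typing import Any, Mapping


def strip_prefix(prefix: str, state_dict: Mapping[str, Any]) -> "OrderedDict[str, Any]":
    """Return a copy of state_dict with `prefix` removed from keys that start with it.

    Hypothetical stand-in for the helper named in the README. Keys without
    the prefix are kept unchanged here; that choice is an assumption.
    """
    out: "OrderedDict[str, Any]" = OrderedDict()
    for key, value in state_dict.items():
        new_key = key[len(prefix):] if key.startswith(prefix) else key
        out[new_key] = value
    return out


# Toy mapping standing in for ckpt["state_dict"] (key names are illustrative only).
toy_state = {
    "pipeline.blocks.0.weight": 1,
    "pipeline.blocks.0.bias": 2,
    "global_step": 3,
}
stripped = strip_prefix("pipeline.", toy_state)
print(list(stripped))
```

Because the README loads with `strict=False`, PyTorch will not raise on mismatched key names. `load_state_dict` returns an object with `missing_keys` and `unexpected_keys`, so inspecting those after loading is a cheap way to confirm the weights were actually applied.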

  ## Training run summary

  ```
+ Backbone: PPD (DA-V2 ViT-L semantics) — 820 M, frozen
+ Trainable: 16 M (sparse-prompt encoder + gate)
+ Resolution: 1024 × 768
+ Batch: 18 (~133 GB peak on a single H200)
+ Steps: 5,000 (5 epochs × 1000 batches)
+ Mix: Hypersim 0.5 / UrbanSyn 0.15 / UnrealStereo4K 0.15 / VKITTI2 0.1 / TartanAir 0.1
+ Init: gangweix/Pixel-Perfect-Depth ppd.pth
+
+ epoch 0 → 4 loss: 0.0186 → 0.0177 (-4.8%)
  ```
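
The numbers in the summary above can be sanity-checked with a few lines of arithmetic. The sketch below also shows one way the mix ratios could drive sampling, assuming each training sample's source dataset is drawn independently with those weights; the README does not say how the repo's loaders apply the mix, so `sample_sources` is a hypothetical illustration and not the project's sampler.

```python
import random
from collections import Counter

# Mix ratios copied from the training run summary.
MIX = {
    "Hypersim": 0.5,
    "UrbanSyn": 0.15,
    "UnrealStereo4K": 0.15,
    "VKITTI2": 0.1,
    "TartanAir": 0.1,
}
assert abs(sum(MIX.values()) - 1.0) < 1e-9  # ratios form a probability distribution


def sample_sources(n: int, seed: int = 0) -> list:
    """Draw n dataset names with probability proportional to MIX (assumed scheme)."""
    rng = random.Random(seed)
    names = list(MIX)
    weights = [MIX[name] for name in names]
    return rng.choices(names, weights=weights, k=n)


# 5 epochs x 1000 batches x batch size 18
total_samples = 5 * 1000 * 18
counts = Counter(sample_sources(total_samples))

# Relative loss change between epoch 0 and epoch 4, in percent.
rel_change = (0.0177 - 0.0186) / 0.0186 * 100

print(total_samples)
print(f"{rel_change:.1f}%")
print(counts.most_common())
```

Under this assumed scheme, roughly half of the 90,000 samples seen during the run would come from Hypersim, and the loss change works out to the -4.8% quoted in the summary.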

+ The 5,000 steps shown here are a partial-train demo; paper-scale training would run for many more steps.

+ ## Verification

  ```bash
  cd code/
+ pip install -r requirements.txt
+ python -m ppd.lpd.tests.verify_paper # 30 paper claims, all pass
  ```

+ `PAPER_CHECKLIST.md` maps each section of the paper to specific code files/lines.

+ ## Reproducing training

  ```bash
+ # Pretrained inputs the code expects
+ ln -sf <ppd.pth> code/checkpoints/ppd.pth
+ ln -sf <depth_anything_v2_vitl.pth> code/checkpoints/depth_anything_v2_vitl.pth

+ # Hypersim 512² pretrain
  bash code/train_lpd.sh

+ # 5-dataset 1024×768 fine-tune (this run's recipe)
  python code/main.py --cfg_file code/ppd/configs/lpd_run5d_10k.yaml

+ # Inference comparison (PPD vs LPD)
  python code/experiments/eval_lpd_vs_ppd.py
  ```

+ ## Datasets — fetching + extracting
+
+ Training sets are hosted as **archives** in the dataset repo to keep file counts low and upload throughput high; eval sets are stored extracted.
+
+ ```bash
+ # Fetch the data repo (eval sets stay un-archived; only train/* is in archive form)
+ hf download chenming-wu/LiDAR-Perfect-Depth-Datasets --repo-type dataset \
+   --local-dir /mnt/sig/datasets
+
+ # Run the extractor (un-tars/un-zips everything under datasets/train/)
+ bash code/extract_archives.sh /mnt/sig/datasets /mnt/sig/datasets/archives
+ ```

+ Layout after extraction matches what the LPD configs reference (`/mnt/sig/datasets/train/<scene>/...`, `/mnt/sig/datasets/eval_image/...`).
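
For readers who prefer not to run the shell script, the extraction step can be approximated in Python. This is a rough sketch under stated assumptions, not a port of `extract_archives.sh`: the function name `extract_all`, the suffix list, and the choice to mirror each archive's relative directory under the output root are all illustrative, and the real script's arguments and layout rules may differ.

```python
import shutil
from pathlib import Path

# Archive types handled by this sketch (assumed; the real script may cover others).
ARCHIVE_SUFFIXES = (".tar", ".tar.gz", ".tgz", ".zip")


def extract_all(archive_root, out_root):
    """Unpack every archive found under archive_root into a mirrored tree under out_root.

    Returns the list of archive paths that were unpacked.
    """
    archive_root, out_root = Path(archive_root), Path(out_root)
    extracted = []
    # sorted() materializes the listing first, so files created during
    # extraction are not picked up by the walk.
    for path in sorted(archive_root.rglob("*")):
        if path.is_file() and path.name.endswith(ARCHIVE_SUFFIXES):
            # Keep the archive's relative directory, e.g. train/ stays train/.
            dest = out_root / path.parent.relative_to(archive_root)
            dest.mkdir(parents=True, exist_ok=True)
            shutil.unpack_archive(str(path), str(dest))
            extracted.append(path)
    return extracted


# Example call with placeholder paths (not run here):
# extract_all("/path/to/archives", "/path/to/datasets")
```

`shutil.unpack_archive` picks the format from the file extension, which keeps the sketch short. Archives from untrusted sources should be inspected before unpacking.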

  ## Citations

+ - **Pixel-Perfect Depth** — Xu et al., NeurIPS 2025
+ - **LiDAR-Perfect Depth** — `paper.tex` in this repo