chenming-wu committed on
Commit
7fbb80e
·
verified ·
1 Parent(s): 04c2055

README: document MoGe2 backbone option

Files changed (1)
  1. README.md +29 -12
README.md CHANGED
@@ -17,23 +17,34 @@ Implementation of *LiDAR-Perfect Depth: Score-Decomposed Diffusion with Kalman-i
17
 
18
  | Repo | Contents | Size |
19
  |---|---|---|
20
- | `chenming-wu/LiDAR-Perfect-Depth` (this) | code + 6 LPD checkpoints + inference vis + extraction helper | 12.5 GB |
21
- | [`chenming-wu/LiDAR-Perfect-Depth-Datasets`](https://huggingface.co/datasets/chenming-wu/LiDAR-Perfect-Depth-Datasets) | extracted eval sets + training-set archives + PPD/DA-V2/RAFT weights | ~993 GB |
22
 
23
  ## What's in this repo
24
 
25
  - `code/` - full LPD codebase. New modules under `ppd/lpd/`; updated configs and adapter loaders.
26
  - `checkpoints/`
27
- - `e000-s001000.ckpt` … `e004-s005000.ckpt` - per-epoch fine-tuned weights
28
  - `last.ckpt` - same as e004
29
- - 2.0 GB each, weights-only; load with `pipeline.dit.load_state_dict(strip_prefix("pipeline.", ckpt["state_dict"]), strict=False)`
30
  - `inference_vis/` - 8-sample qualitative panels (RGB | GT | PPD | LPD | LPD-variance)
31
  - `extract_archives.sh` - extracts the dataset archives back into the layout the code expects
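
The removed load hint calls a `strip_prefix` helper that this diff does not show. The following is a minimal sketch of what such a helper might look like, assuming it only drops a leading key prefix; the function body and the toy checkpoint keys are illustrative assumptions, not the repo's actual implementation.

```python
from collections import OrderedDict


def strip_prefix(prefix, state_dict):
    """Return a copy of state_dict with `prefix` removed from keys that start with it.

    Hypothetical stand-in for the helper named in the README hint.
    """
    stripped = OrderedDict()
    for key, value in state_dict.items():
        # Keys without the prefix are passed through unchanged.
        new_key = key[len(prefix):] if key.startswith(prefix) else key
        stripped[new_key] = value
    return stripped


# Toy stand-in for a weights-only checkpoint; real keys and tensors will differ.
toy_ckpt = {
    "state_dict": {
        "pipeline.dit.blocks.0.weight": 1,
        "pipeline.dit.blocks.0.bias": 2,
        "other.step": 3,
    }
}
toy_weights = strip_prefix("pipeline.", toy_ckpt["state_dict"])
print(list(toy_weights))
```

Because the hint passes `strict=False`, PyTorch's `load_state_dict` silently skips keys that do not match; inspecting the `missing_keys` and `unexpected_keys` fields of its return value is a cheap way to confirm that the weights were actually loaded.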
32
 
33
- ## Training run summary
 
 
 
 
 
 
 
 
 
 
 
34
 
35
  ```
36
- Backbone: PPD (DA-V2 ViT-L semantics) - 820 M, frozen
37
  Trainable: 16 M (sparse-prompt encoder + gate)
38
  Resolution: 1024 × 768
39
  Batch: 18 (~133 GB peak on a single H200)
@@ -46,6 +57,9 @@ epoch 0 → 4 loss: 0.0186 → 0.0177 (-4.8%)
46
 
47
  The 5 K steps shown here are a partial-training demo; paper-scale training would push it much further.
48
 
 
 
 
49
  ## Verification
50
 
51
  ```bash
@@ -60,14 +74,19 @@ python -m ppd.lpd.tests.verify_paper # 30 paper claims, all pass
60
 
61
  ```bash
62
  # Pretrained inputs the code expects
63
- ln -sf <ppd.pth> code/checkpoints/ppd.pth
64
  ln -sf <depth_anything_v2_vitl.pth> code/checkpoints/depth_anything_v2_vitl.pth
 
 
 
65
 
66
- # Hypersim 512² pretrain
67
  bash code/train_lpd.sh
68
 
69
- # 5-dataset 1024×768 fine-tune (this run's recipe)
70
- python code/main.py --cfg_file code/ppd/configs/lpd_run5d_10k.yaml
 
 
71
 
72
  # Inference comparison (PPD vs LPD)
73
  python code/experiments/eval_lpd_vs_ppd.py
@@ -78,11 +97,9 @@ python code/experiments/eval_lpd_vs_ppd.py
78
  Everything is hosted as **archives** in the dataset repo to keep file counts low and upload bandwidth high.
79
 
80
  ```bash
81
- # Fetch the data repo (eval sets stay un-archived; only train/* is archive form)
82
  hf download chenming-wu/LiDAR-Perfect-Depth-Datasets --repo-type dataset \
83
  --local-dir /mnt/sig/datasets
84
 
85
- # Run the extractor (un-tars/un-zips everything under datasets/train/)
86
  bash code/extract_archives.sh /mnt/sig/datasets /mnt/sig/datasets/archives
87
  ```
88
 
 
17
 
18
  | Repo | Contents | Size |
19
  |---|---|---|
20
+ | `chenming-wu/LiDAR-Perfect-Depth` (this) | code + 6 LPD-DA2 checkpoints + inference vis + extraction helper | 12.5 GB |
21
+ | [`chenming-wu/LiDAR-Perfect-Depth-Datasets`](https://huggingface.co/datasets/chenming-wu/LiDAR-Perfect-Depth-Datasets) | extracted eval sets + training-set archives + PPD/DA-V2/RAFT weights | ~991 GB |
22
 
23
  ## What's in this repo
24
 
25
  - `code/` - full LPD codebase. New modules under `ppd/lpd/`; updated configs and adapter loaders.
26
  - `checkpoints/`
27
+ - `e000-s001000.ckpt` … `e004-s005000.ckpt` - per-epoch fine-tuned weights (DA2 backbone, 5K steps)
28
  - `last.ckpt` - same as e004
29
+ - 2.0 GB each, weights-only
30
  - `inference_vis/` - 8-sample qualitative panels (RGB | GT | PPD | LPD | LPD-variance)
31
  - `extract_archives.sh` - extracts the dataset archives back into the layout the code expects
32
 
33
+ ## Backbone options
34
+
35
+ PPD (and therefore LPD) supports two semantic-prompt backbones; switching is a single config change.
36
+
37
+ | Backbone | Trainable params | PPD ckpt | Config |
38
+ |---|---|---|---|
39
+ | **DA2** (Depth-Anything-V2 ViT-L) | 16.3 M | [`gangweix/Pixel-Perfect-Depth/ppd.pth`](https://huggingface.co/gangweix/Pixel-Perfect-Depth/resolve/main/ppd.pth) | `code/ppd/configs/lpd_run5d_10k.yaml` |
40
+ | **MoGe2** | 16.3 M | [Google Drive](https://drive.google.com/file/d/1tabmcsbRVDKDfmO4KU1vOjurzN-wp0HV/view?usp=sharing) | `code/ppd/configs/lpd_run5d_moge2.yaml` |
41
+
42
+ The official PPD MoGe2 release reports 20-30 % improvement over DA2 on zero-shot benchmarks; the LPD prompt branch is identical for either backbone.
43
+
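
The backbone switch described above can be sketched as a small lookup that pairs each backbone with its config file and the checkpoint paths taken from the README's setup commands. The dictionary keys (`"da2"`, `"moge2"`) and both function names are assumptions introduced for illustration; only the file paths and the `--cfg_file` flag come from the README.

```python
from pathlib import Path

# Config and checkpoint paths as listed in the README; the keys are illustrative labels.
BACKBONES = {
    "da2": {
        "config": "code/ppd/configs/lpd_run5d_10k.yaml",
        "checkpoints": [
            "code/checkpoints/ppd.pth",
            "code/checkpoints/depth_anything_v2_vitl.pth",
        ],
    },
    "moge2": {
        "config": "code/ppd/configs/lpd_run5d_moge2.yaml",
        "checkpoints": [
            "code/checkpoints/ppd_moge2.pth",
            "code/checkpoints/moge2.pt",
        ],
    },
}


def finetune_command(backbone):
    """Build the fine-tune command for a backbone (hypothetical helper)."""
    entry = BACKBONES[backbone.lower()]
    return ["python", "code/main.py", "--cfg_file", entry["config"]]


def missing_checkpoints(backbone, repo_root="."):
    """List expected checkpoint paths that are absent under repo_root.

    Path.exists() follows symlinks, so a dangling link counts as missing.
    """
    entry = BACKBONES[backbone.lower()]
    return [p for p in entry["checkpoints"] if not (Path(repo_root) / p).exists()]


print(" ".join(finetune_command("moge2")))
print(missing_checkpoints("moge2"))
```

Running `missing_checkpoints` before launching a run gives an early warning when one of the `ln -sf` links has not been created yet.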
44
+ ## Training run summary (DA2 backbone, 5K steps)
45
 
46
  ```
47
+ Backbone: PPD-DA2 - 820 M, frozen
48
  Trainable: 16 M (sparse-prompt encoder + gate)
49
  Resolution: 1024 × 768
50
  Batch: 18 (~133 GB peak on a single H200)
 
57
 
58
  The 5 K steps shown here are a partial-training demo; paper-scale training would push it much further.
59
 
60
+ For the MoGe2 backbone we verified that the full forward + backward + checkpointing pipeline works
61
+ (`lpd_run5d_moge2.yaml`); training to convergence is left for a multi-GPU run.
62
+
63
  ## Verification
64
 
65
  ```bash
 
74
 
75
  ```bash
76
  # Pretrained inputs the code expects
77
+ ln -sf <ppd.pth> code/checkpoints/ppd.pth # DA2
78
  ln -sf <depth_anything_v2_vitl.pth> code/checkpoints/depth_anything_v2_vitl.pth
79
+ # OR for MoGe2:
80
+ ln -sf <ppd_moge2.pth> code/checkpoints/ppd_moge2.pth
81
+ ln -sf <moge2.pt> code/checkpoints/moge2.pt
82
 
83
+ # Hypersim 512² pretrain (DA2)
84
  bash code/train_lpd.sh
85
 
86
+ # 5-dataset 1024×768 fine-tune
87
+ python code/main.py --cfg_file code/ppd/configs/lpd_run5d_10k.yaml # DA2
88
+ # OR
89
+ python code/main.py --cfg_file code/ppd/configs/lpd_run5d_moge2.yaml # MoGe2
90
 
91
  # Inference comparison (PPD vs LPD)
92
  python code/experiments/eval_lpd_vs_ppd.py
 
97
  Everything is hosted as **archives** in the dataset repo to keep file counts low and upload bandwidth high.
98
 
99
  ```bash
 
100
  hf download chenming-wu/LiDAR-Perfect-Depth-Datasets --repo-type dataset \
101
  --local-dir /mnt/sig/datasets
102
 
 
103
  bash code/extract_archives.sh /mnt/sig/datasets /mnt/sig/datasets/archives
104
  ```
105