	---
	license: mit
	tags:
	- depth-estimation
	- diffusion
	- monocular-depth
	- lidar-prompted
	- pixel-perfect-depth
	- pytorch
	---

	# LiDAR-Perfect Depth (LPD)

	Implementation of *LiDAR-Perfect Depth: Score-Decomposed Diffusion with Kalman-in-the-Loop Denoising for Sparse-Prompted Depth*, built on top of the public Pixel-Perfect Depth (PPD) codebase.

	## What's in this repo

	- `code/`: the full repository. LPD additions live under `ppd/lpd/`, alongside updated configs and adapter data loaders.
	- `checkpoints/`: weights-only checkpoints, 2.0 GB each.
	  - `e000-s001000.ckpt` … `e004-s005000.ckpt`: per-epoch checkpoints from the 5-dataset 1024×768 fine-tune.
	  - `last.ckpt`: rolling latest checkpoint (the same weights as e004 in this run).
	- `inference_vis/`: 8-sample qualitative comparisons (RGB \| GT \| PPD \| LPD \| LPD-variance) generated with `code/experiments/eval_lpd_vs_ppd.py`.
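
When scripting over `checkpoints/`, it can help to recover the epoch and step from a file name. The sketch below assumes the `e<epoch>-s<step>.ckpt` pattern inferred from the two names listed above; `parse_ckpt_name` and `latest_ckpt` are hypothetical helpers, not part of the repository.

```python
import re

# Pattern inferred from the two listed names (an assumption): e<epoch>-s<step>.ckpt
CKPT_RE = re.compile(r"^e(\d+)-s(\d+)\.ckpt$")


def parse_ckpt_name(name):
    """Return (epoch, step) for a per-epoch checkpoint name, or None otherwise."""
    match = CKPT_RE.match(name)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def latest_ckpt(names):
    """Return the per-epoch checkpoint with the highest step.

    Names that do not follow the pattern (such as last.ckpt) are ignored.
    """
    numbered = [(parse_ckpt_name(n), n) for n in names]
    numbered = [(key, n) for key, n in numbered if key is not None]
    if not numbered:
        return None
    return max(numbered, key=lambda item: item[0][1])[1]


print(parse_ckpt_name("e004-s005000.ckpt"))  # (4, 5000)
print(latest_ckpt(["last.ckpt", "e000-s001000.ckpt", "e004-s005000.ckpt"]))
```

Sorting by the parsed step rather than by file modification time keeps the result stable after a download or copy resets timestamps.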

	## Training run summary

	```
	Backbone: PPD (DA-V2 semantics) — 820 M params, frozen
	Trainable: 16 M (sparse_prompt_encoder + prompt_gate)
	Resolution: 1024 × 768
	Batch size: 18 (~119 GB GPU peak on H200)
	Steps: 5,000 (5 epochs × 1,000 batches)
	Mix ratios: Hypersim 0.5 / UrbanSyn 0.15 / UnrealStereo4K 0.15 / VKITTI2 0.1 / TartanAir 0.1
	Init: official PPD checkpoint (gangweix/Pixel-Perfect-Depth)

	Epoch loss: e0=0.0186 → e4=0.0177 (-4.8% over 5 epochs)
	```
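
The figures in the summary are easy to cross-check by hand. The short script below recomputes the step count, the sum of the mix ratios, and the relative loss change from the numbers quoted above; the variable names are illustrative only.

```python
import math

# Values copied from the training run summary.
epochs, batches_per_epoch = 5, 1000
mix = {
    "Hypersim": 0.5,
    "UrbanSyn": 0.15,
    "UnrealStereo4K": 0.15,
    "VKITTI2": 0.1,
    "TartanAir": 0.1,
}
loss_e0, loss_e4 = 0.0186, 0.0177

total_steps = epochs * batches_per_epoch
rel_change_pct = 100.0 * (loss_e4 - loss_e0) / loss_e0

print(total_steps)                           # 5000
print(math.isclose(sum(mix.values()), 1.0))  # True
print(round(rel_change_pct, 1))              # -4.8
```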

	The paper's full training schedule runs for many more steps; this checkpoint comes from a partial training run, intended only to demonstrate that the pipeline works end to end.
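
One way to read the mix ratios in the summary is as per-sample probabilities of drawing from each dataset. The sketch below simulates that reading with the standard library. This is an assumption about how the ratios are applied, since the actual data loaders in `code/` may mix datasets differently (for example, per batch), and `sample_dataset_names` is a hypothetical helper.

```python
import random
from collections import Counter

# Mix ratios copied from the training run summary.
MIX = {
    "Hypersim": 0.5,
    "UrbanSyn": 0.15,
    "UnrealStereo4K": 0.15,
    "VKITTI2": 0.1,
    "TartanAir": 0.1,
}


def sample_dataset_names(n, seed=0):
    """Draw n dataset names with probability proportional to MIX (assumed semantics)."""
    rng = random.Random(seed)
    names = list(MIX)
    weights = [MIX[name] for name in names]
    return rng.choices(names, weights=weights, k=n)


# Batch size 18 over 5,000 steps gives 90,000 samples in this run.
draws = sample_dataset_names(18 * 5000)
counts = Counter(draws)
for name in MIX:
    print(name, round(counts[name] / len(draws), 3))
```

With this many draws, the empirical fractions should sit close to the configured ratios, which makes the loop a quick sanity check for a custom sampler.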

	## Verification suite

	```bash
	cd code/
	python -m ppd.lpd.tests.verify_paper
	# Tests 30 claims from the paper; all should pass.
	```

	The suite maps every section of `paper.tex` to a line of code. See `PAPER_CHECKLIST.md` for the per-claim table.

	## Reproduction

	```bash
	pip install -r code/requirements.txt
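	# Symlink the official weights. Replace the <...> placeholders with absolute paths,
	# since a relative symlink target is resolved against code/checkpoints/.
	ln -sf "<PPD ckpt>" code/checkpoints/ppd.pth
	ln -sf "<DA-V2 ViT-L ckpt>" code/checkpoints/depth_anything_v2_vitl.pth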

	# Stage 1 — Hypersim 512² pretrain
	bash code/train_lpd.sh

	# Stage 2 — 5-dataset 1024×768 fine-tune (this run's recipe)
	python code/main.py --cfg_file code/ppd/configs/lpd_run5d_10k.yaml

	# Inference comparison
	python code/experiments/eval_lpd_vs_ppd.py
	```
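
Before launching a long run, it may be worth confirming that both weight symlinks resolve. The check below is a minimal sketch that uses only the two paths from the commands above; `missing_weights` is a hypothetical helper, and it assumes the script is run from the repository root.

```python
from pathlib import Path

# The two link locations used in the reproduction commands.
REQUIRED_WEIGHTS = (
    "code/checkpoints/ppd.pth",
    "code/checkpoints/depth_anything_v2_vitl.pth",
)


def missing_weights(repo_root="."):
    """Return the required weight paths that do not resolve to an existing file."""
    root = Path(repo_root)
    # Path.exists() follows symlinks, so a dangling link is reported as missing.
    return [rel for rel in REQUIRED_WEIGHTS if not (root / rel).exists()]


missing = missing_weights(".")
print("missing weights:", missing if missing else "none")
```

A dangling link usually means the target was given as a relative path, so re-creating the link with an absolute target is the first thing to try.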

	## Datasets

	The training data is hosted in the companion LFS dataset repository: [`chenming-wu/LiDAR-Perfect-Depth-Datasets`](https://huggingface.co/datasets/chenming-wu/LiDAR-Perfect-Depth-Datasets).
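
The dataset repository can be fetched programmatically with `snapshot_download` from `huggingface_hub`. The sketch below assumes that package is installed; the `data` target directory and the `download_datasets` wrapper are illustrative choices, not something the repository prescribes.

```python
DATASET_REPO = "chenming-wu/LiDAR-Perfect-Depth-Datasets"


def download_datasets(local_dir="data"):
    """Download the companion dataset repository and return its local path.

    local_dir is a hypothetical default; point it at whatever location
    your configs expect.
    """
    # Imported lazily so that this module can be loaded without the dependency.
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id=DATASET_REPO,
        repo_type="dataset",  # the repository lives under huggingface.co/datasets
        local_dir=local_dir,
    )
```

Calling `download_datasets()` starts the full download, so check available disk space first.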

	## Citations

	- Pixel-Perfect Depth — Xu et al., NeurIPS 2025
	- LiDAR-Perfect Depth — paper.tex (anonymous submission)