chenming-wu/LiDAR-Perfect-Depth

LiDAR-Perfect-Depth/code/train_lpd.sh

#!/bin/bash
# Train LiDAR-Perfect Depth on top of a pretrained PPD checkpoint.
#
# Stage 1 (image pretrain @512×512): trains only the sparse-prompt encoder and
#   gate on Hypersim with simulated sparse LiDAR; the DiT backbone stays frozen.
# Stage 2 (image finetune @1024×768): optionally unfreezes the DiT and trains on
#   a mix of Hypersim, UrbanSyn, UnrealStereo4K, VKITTI2, and TartanAir.
# Stage 3 (video finetune): adds the temporal Kalman filter loop on short clips.
#
# Prerequisites:
#   * checkpoints/ppd.pth                     <- PPD pretrained weights
#   * checkpoints/depth_anything_v2_vitl.pth  <- DA-V2 ViT-L weights (semantics)
#   * datasets extracted under /mnt/sig/datasets/  <- see datasets/README.md

set -e

# Stage 1: image pretrain (Hypersim only, 512×512)
python main.py --cfg_file ppd/configs/lpd_pretrain.yaml pl_trainer.devices=1

# Stage 2 (uncomment after stage 1 produces a checkpoint):
# python main.py --cfg_file ppd/configs/lpd_finetune.yaml pl_trainer.devices=8

# Stage 3 (uncomment for video):
# python main.py --cfg_file ppd/configs/lpd_video.yaml pl_trainer.devices=8
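
The prerequisites listed in the header comment could be verified before stage 1 starts, so that a missing checkpoint or dataset directory is reported immediately rather than partway through model setup. The following is a minimal sketch, assuming the script is run from the repository root. `check_prereqs` is a hypothetical helper and is not part of the original script.

```bash
#!/bin/bash
# check_prereqs is a hypothetical helper, not part of the original script.
# It reports every missing path on stderr and returns non-zero if any
# of the given paths does not exist.
check_prereqs() {
    local missing=0
    local path
    for path in "$@"; do
        if [ ! -e "$path" ]; then
            echo "Missing prerequisite: $path" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

One way to use it would be to call `check_prereqs checkpoints/ppd.pth checkpoints/depth_anything_v2_vitl.pth /mnt/sig/datasets` right after `set -e`. With `set -e` active, a non-zero return stops the script before any `python main.py` command runs.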