	---
	license: other
	language:
	- en
	tags:
	- depth-completion
	- transparent-objects
	- robotics
	- cleargrasp
	- lidf
	- rftrans
	- transdiff
	library_name: pytorch
	---

	# GridDepth: Pretrained Checkpoints for Transparent-Object Depth Completion

	This repo hosts the pretrained checkpoints that go with the
	[atom525/ProgressiveDepth](https://github.com/atom525/ProgressiveDepth) codebase
	(idea.md series-joint pipeline: TransDiff Refined1 → LIDF) plus our local
	RFTrans reproduction baselines.

	> Recipe: see [atom525/ProgressiveDepth README.md](https://github.com/atom525/ProgressiveDepth)
	> and [docs/PIPELINE.md](https://github.com/atom525/ProgressiveDepth/blob/main/docs/PIPELINE.md).

	---

	## File layout

	```
	GridDepth/
	├── progressivedepth/  # idea.md main line (Module A = ip_basic + Module B = LIDF)
	│   ├── ckpts/
	│   │   ├── lidf_stage1_epoch059.pth  # 248 MB: LIDF Stage 1 (frozen baseline, CG-only, Adam, 60 ep)
	│   │   ├── C_stage2_epoch029.pth  # 2.2 MB: Stage 2 RefineNet, retrained on Refined1 input (idea.md C run)
	│   │   └── C_stage3_epoch029.pth  # 2.2 MB: Stage 3 RefineNet hard-neg, retrained on Refined1 input
	│   └── configs/
	│       ├── train_progressive_stage2.yaml
	│       ├── train_progressive_stage3.yaml
	│       └── pipeline_config.yaml  # inference / evaluate config
	│
	└── rftrans/  # RFTrans reproduction artifacts
	    ├── ckpts/
	    │   ├── rfnet_refractive_flow_epoch500.pth  # 467 MB: RFNet (DRN backbone), Adam, 500 ep on unity/train
	    │   ├── f2net_flow2normal_epoch500.pth  # 356 MB: F2Net (simple_unet), Adam, 500 ep on unity/train
	    │   ├── mask_adam_epoch195.pth  # 312 MB: mask network (DRN), Adam, 200 ep on unity/train, mIoU 0.847
	    │   └── outlines_side_adam_epoch195.pth  # 312 MB: boundary network (DRN side-output), Adam, 200 ep on unity/train
	    └── configs/
	        ├── refractive_flow_config.yaml  # RFNet train config (Adam, 500 ep)
	        ├── flow2normal_config.yaml  # F2Net train config (Adam, 500 ep)
	        ├── mask_adam_config.yaml  # mask train config (Adam, 200 ep)
	        ├── outlines_side_adam_config.yaml  # boundary train config (Adam, 200 ep)
	        └── exp017_paperfaithful.yaml  # rgb2normal e2e config (paper-faithful: SGD 100 ep, lr=1e-4 mom=0.9 wd=5e-4)
	```
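
After downloading or cloning the repo, it can help to confirm that every file in the layout is present before pointing a config at it. The following is a minimal sketch using only the standard library; the helper name `missing_files` and the idea of passing a local root directory are assumptions for illustration, not part of the ProgressiveDepth codebase.

```python
from pathlib import Path

# Relative paths copied from the file layout in this README.
EXPECTED_FILES = [
    "progressivedepth/ckpts/lidf_stage1_epoch059.pth",
    "progressivedepth/ckpts/C_stage2_epoch029.pth",
    "progressivedepth/ckpts/C_stage3_epoch029.pth",
    "progressivedepth/configs/train_progressive_stage2.yaml",
    "progressivedepth/configs/train_progressive_stage3.yaml",
    "progressivedepth/configs/pipeline_config.yaml",
    "rftrans/ckpts/rfnet_refractive_flow_epoch500.pth",
    "rftrans/ckpts/f2net_flow2normal_epoch500.pth",
    "rftrans/ckpts/mask_adam_epoch195.pth",
    "rftrans/ckpts/outlines_side_adam_epoch195.pth",
    "rftrans/configs/refractive_flow_config.yaml",
    "rftrans/configs/flow2normal_config.yaml",
    "rftrans/configs/mask_adam_config.yaml",
    "rftrans/configs/outlines_side_adam_config.yaml",
    "rftrans/configs/exp017_paperfaithful.yaml",
]


def missing_files(root):
    """Return the expected relative paths that are absent under `root`.

    `root` is a hypothetical local copy of this repo (the `GridDepth/` folder).
    """
    root = Path(root)
    return [rel for rel in EXPECTED_FILES if not (root / rel).is_file()]
```

Calling `missing_files("path/to/GridDepth")` returns an empty list when the local copy is complete, and otherwise lists what still needs to be fetched.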

	---

	## ProgressiveDepth (idea.md series-joint pipeline)

	Pipeline:
	```
	RGB + Noisy Depth
	        │
	        ▼  Module A: TransDiff Data Preprocessing (ip_basic multi-scale morphological filling)
	Refined Depth1
	        │
	        ▼  Module B: LIDF (Stage 1 frozen + Stage 2 / 3 retrained on Refined1)
	Final Depth
	```
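
To give a feel for what "multi-scale morphological filling" means in Module A, here is a deliberately simplified sketch in pure Python. It is not the ip_basic implementation and makes no attempt to reproduce its actual kernels or post-filtering. It only illustrates the general idea: empty pixels (value 0) are filled from valid neighbours, using progressively larger windows. The function names and the kernel sizes `(3, 5, 7)` are assumptions chosen for illustration.

```python
def dilate_fill(depth, kernel_size):
    """Fill empty pixels (value 0) with the max depth found in a square window.

    `depth` is a list of rows; measured (non-zero) pixels are never changed.
    """
    h, w = len(depth), len(depth[0])
    r = kernel_size // 2
    out = [row[:] for row in depth]  # copy so the input stays untouched
    for y in range(h):
        for x in range(w):
            if depth[y][x] > 0:
                continue  # keep measured depth as-is
            window = [
                depth[j][i]
                for j in range(max(0, y - r), min(h, y + r + 1))
                for i in range(max(0, x - r), min(w, x + r + 1))
            ]
            out[y][x] = max(window)  # stays 0 if the window has no valid pixel
    return out


def multiscale_fill(depth, kernel_sizes=(3, 5, 7)):
    """Apply progressively larger windows so small holes are closed first."""
    for k in kernel_sizes:
        depth = dilate_fill(depth, k)
    return depth
```

Because valid pixels are never overwritten, a filter of this kind can only add values where the sensor returned nothing. Any noise it introduces therefore lands in the regions that were originally empty.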

	### Final results (paper protocol: 256×144 + per-image avg + corrupt mask)

	C_full = `lidf_stage1_epoch059.pth` + `C_stage2_epoch029.pth` + `C_stage3_epoch029.pth`, evaluated with mode A (`feed_to_lidf=refined1`):

	| Dataset | C_full RMSE ↓ | C_full δ1.05 ↑ | B baseline RMSE ↓ | B baseline δ1.05 ↑ | LIDF paper Table 1 (RMSE / δ1.05) |
	|---|---:|---:|---:|---:|---:|
	| real-test (Real-novel) ⭐ | 0.0403 | 45.28 | 0.0443 | 40.18 | 0.0250 / 76.21 |
	| real-val (Real-known) | 0.0351 | 77.22 | 0.0358 | 77.18 | 0.0280 / 82.37 |
	| synthetic-test (Syn-novel) | 0.0328 | 62.82 | 0.0305 | 66.12 | 0.0280 / 68.62 |
	| synthetic-val (Syn-known) | 0.0129 | 93.72 | 0.0111 | 96.07 | 0.0120 / 94.79 |

	Conclusion: the idea.md series-joint approach helps on real-world data. On Real-novel, RMSE drops from 0.0443 to 0.0403 (about 9%) and δ1.05 rises from 40.18 to 45.28 (about 5 points) relative to baseline B, while Real-known is essentially unchanged. It regresses on both synthetic splits, where ip_basic adds noise to already clean inputs. The remaining gap to the LIDF paper's Table 1 is due to the Omniverse Object Dataset being unavailable (link broken since 2025-03, see [NVlabs/implicit_depth#3](https://github.com/NVlabs/implicit_depth/issues/3)).
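
For readers unfamiliar with the metrics, the sketch below shows one common way to compute RMSE and δ1.05 over masked pixels and then average per image. It is an illustration under stated assumptions, not the evaluation code of this repo: images are passed as flat lists of pixel values, the mask marks the pixels to score, δ1.05 is taken as the percentage of pixels with `max(pred/gt, gt/pred) < 1.05`, and the function names are hypothetical. The resize to 256×144 is not shown.

```python
import math


def image_metrics(pred, gt, mask, threshold=1.05):
    """Return (RMSE, delta accuracy in %) over the masked pixels of one image.

    Pixels without valid ground truth (gt <= 0) are skipped.
    Returns None if no pixel is left to score.
    """
    pairs = [(p, g) for p, g, m in zip(pred, gt, mask) if m and g > 0]
    if not pairs:
        return None
    rmse = math.sqrt(sum((p - g) ** 2 for p, g in pairs) / len(pairs))
    # A pixel counts as a hit when prediction and ground truth agree within 5%.
    hits = sum(1 for p, g in pairs if p > 0 and max(p / g, g / p) < threshold)
    return rmse, 100.0 * hits / len(pairs)


def dataset_metrics(images):
    """Average per-image metrics so that every image carries equal weight."""
    results = [r for r in (image_metrics(*img) for img in images) if r is not None]
    if not results:
        raise ValueError("no image with scorable pixels")
    n = len(results)
    return sum(r[0] for r in results) / n, sum(r[1] for r in results) / n
```

Averaging per image rather than pooling all pixels means an image with a small transparent region counts as much as one with a large region, which is worth keeping in mind when comparing against numbers computed differently.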

	---

	## RFTrans reproduction

	Pipeline (per RFTrans paper §III-C):
	```
	RGB ──> RFNet ──> refractive flow + mask + boundary
	                    │
	                    └──> F2Net ──> surface normal
	                                      │
	                                      └──> depth2depth global opt ──> Refined Depth
	```

	### Caveats

	1. Architecture deviation: paper §III-C says "RFNet predicts mask, boundary, and refractive flow" (multi-task), but the official repo does not implement this. We trained separate networks: RFNet predicts only flow, F2Net predicts normals from flow, and mask and boundary are independent DeepLab+DRN networks. This matches the actual repo structure but not the paper text.
	2. Optimizer deviation: paper §IV-A specifies SGD (lr=1e-4, momentum=0.9, weight_decay=5e-4) for 100 epochs. We used Adam for sub-network training because, in our runs, SGD with lr=1e-4 from random initialization did not converge: mask val mIoU was ~0.46 (random level) after 100 epochs of SGD, versus 0.85 after 200 epochs of Adam. The provided `exp017_paperfaithful.yaml` is paper-faithful (SGD, 100 epochs). It is used for the end-to-end fine-tuning stage, where it warm-starts from the Adam-trained RFNet/F2Net.
	3. Training data: all networks were trained on `data/unity/train/` (5000 RGB images with flow, mask, boundary, and normal GT, generated with [Unity-RefractiveFlowRender](https://github.com/LJY-XCX/Unity-RefractiveFlowRender)). This is the dataset specified by RFTrans paper §IV-A.
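
The mIoU figures quoted in caveat 2 can be read against the following sketch of a two-class mean IoU. This is an assumption about the metric's definition (mean of background IoU and foreground IoU over flat 0/1 label lists), written for illustration. The training scripts may compute it differently, and `binary_miou` is a hypothetical name.

```python
def binary_miou(pred, target):
    """Mean IoU over two classes (0 = background, 1 = transparent object).

    `pred` and `target` are flat sequences of 0/1 labels of equal length.
    """
    ious = []
    for cls in (0, 1):
        inter = sum(1 for p, t in zip(pred, target) if p == cls and t == cls)
        union = sum(1 for p, t in zip(pred, target) if p == cls or t == cls)
        if union:  # skip a class that appears in neither prediction nor target
            ious.append(inter / union)
    return sum(ious) / len(ious)
```

Under this definition, a mask that predicts only background scores 0 on the foreground class whenever the target contains foreground, so its mIoU cannot exceed 0.5. That gives some context for treating a value around 0.46 as a failure to converge.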

	### How to use these RFTrans ckpts

	In your `RFTrans/eval_depth_completion/config_*.yaml`:
	```yaml
	rgb2flow:
	  pathWeightsFile: <path_to>/rfnet_refractive_flow_epoch500.pth
	flow2normal:
	  pathWeightsFile: <path_to>/f2net_flow2normal_epoch500.pth
	masks:
	  pathWeightsFile: <path_to>/mask_adam_epoch195.pth  # OR cleargrasp_orig/.../checkpoint_mask.pth
	outlines:
	  pathWeightsFile: <path_to>/outlines_side_adam_epoch195.pth  # OR cleargrasp_orig/.../checkpoint_outlines.pth
	```
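
If you prefer to generate these entries rather than edit them by hand, a small helper along the following lines can render the four sections for a given checkpoint directory. This is only a sketch: `render_weight_sections` is a hypothetical name, and it assumes each `pathWeightsFile` key nests under its section key with two-space indentation. It emits only the weight paths, so the result still has to be merged into a full RFTrans config.

```python
from pathlib import Path

# Config section -> checkpoint file name, as listed in this README.
RFTRANS_WEIGHTS = {
    "rgb2flow": "rfnet_refractive_flow_epoch500.pth",
    "flow2normal": "f2net_flow2normal_epoch500.pth",
    "masks": "mask_adam_epoch195.pth",
    "outlines": "outlines_side_adam_epoch195.pth",
}


def render_weight_sections(ckpt_dir):
    """Return YAML text with one `pathWeightsFile` entry per sub-network."""
    ckpt_dir = Path(ckpt_dir)
    lines = []
    for section, filename in RFTRANS_WEIGHTS.items():
        lines.append(f"{section}:")
        lines.append(f"  pathWeightsFile: {(ckpt_dir / filename).as_posix()}")
    return "\n".join(lines) + "\n"
```

For example, `print(render_weight_sections("GridDepth/rftrans/ckpts"))` prints a fragment that can be pasted over the corresponding keys in `config_*.yaml`.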

	---

	## Environment / dependencies

	- Python 3.8, PyTorch 2.0.0+cu118
	- LIDF: see [implicit_depth/requirements.txt](https://github.com/atom525/ProgressiveDepth/blob/main/implicit_depth/requirements.txt)
	- RFTrans: needs the `depth2depth` C++ binary and `libhdf5.so` from the conda env
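
A quick preflight check of these requirements might look like the sketch below. It relies only on the standard library and rests on assumptions that may not hold for your setup: that the binary is named `depth2depth` and is discoverable on `PATH` (RFTrans may instead read its location from a config), and that `ctypes.util.find_library` can see the conda-provided HDF5 library, which typically depends on the environment's library search path.

```python
import ctypes.util
import shutil
import sys


def environment_report(binary_name="depth2depth"):
    """Collect quick facts about the local setup; nothing is installed or changed."""
    return {
        "python": "{}.{}".format(*sys.version_info[:2]),
        "python_is_3_8": sys.version_info[:2] == (3, 8),
        # True only if an executable with this name is found on PATH.
        "depth2depth_on_path": shutil.which(binary_name) is not None,
        # True if the system linker lookup can locate an HDF5 shared library.
        "libhdf5_found": ctypes.util.find_library("hdf5") is not None,
    }
```

A `False` entry is a hint to look closer, not proof that something is missing, since both lookups can miss files that live only inside an unactivated conda environment.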

	## License

	- LIDF Stage 1 ckpt and code: NVIDIA Source Code License (Non-Commercial), inherited from [NVlabs/implicit_depth](https://github.com/NVlabs/implicit_depth)
	- RFTrans ckpts and code: license inherited from [LJY-XCX/RFTrans](https://github.com/LJY-XCX/RFTrans)
	- Our extensions (transdiff_preprocess wrapper, train_progressive trainer, retrains): same license as the corresponding upstream project

	## Citation

	If you use these ckpts, please cite the original works:

	```bibtex
	@inproceedings{zhu2021rgbd,
	title={RGB-D Local Implicit Function for Depth Completion of Transparent Objects},
	author={Zhu, Luyang and Mousavian, Arsalan and Xiang, Yu and Mazhar, Hammad and van Eenbergen, Jozef and Debnath, Shoubhik and Fox, Dieter},
	booktitle={CVPR},
	year={2021}
	}

	@article{tang2024rftrans,
	title={RFTrans: Leveraging Refractive Flow of Transparent Objects for Surface Normal Estimation and Manipulation},
	author={Tang, Tutian and Liu, Jiyu and Zhang, Jieyi and Fu, Haoyuan and Xu, Wenqiang and Lu, Cewu},
	journal={IEEE Robotics and Automation Letters},
	year={2024}
	}
	```