| --- |
| license: other |
| language: |
| - en |
| tags: |
| - depth-completion |
| - transparent-objects |
| - robotics |
| - cleargrasp |
| - lidf |
| - rftrans |
| - transdiff |
| library_name: pytorch |
| --- |
| |
| # GridDepth: Pretrained Checkpoints for Transparent-Object Depth Completion |
|
|
| This repo hosts the **pretrained checkpoints** that go with the |
| [atom525/ProgressiveDepth](https://github.com/atom525/ProgressiveDepth) codebase |
| (idea.md series-joint pipeline: TransDiff Refined1 → LIDF) **plus** our local
| **RFTrans reproduction** baselines. |
|
|
| > **Recipe**: see [atom525/ProgressiveDepth README.md](https://github.com/atom525/ProgressiveDepth) |
| > and [docs/PIPELINE.md](https://github.com/atom525/ProgressiveDepth/blob/main/docs/PIPELINE.md). |
|
|
| --- |
|
|
| ## File layout |
|
|
| ``` |
| GridDepth/
| ├── progressivedepth/                            # idea.md main pipeline (Module A=ip_basic + Module B=LIDF)
| │   ├── ckpts/
| │   │   ├── lidf_stage1_epoch059.pth             # 248 MB - LIDF Stage 1 (frozen baseline, CG-only Adam 60 ep)
| │   │   ├── C_stage2_epoch029.pth                # 2.2 MB - Stage 2 RefineNet, retrained on Refined1 input (idea.md C run)
| │   │   └── C_stage3_epoch029.pth                # 2.2 MB - Stage 3 RefineNet hard-neg, retrained on Refined1 input
| │   └── configs/
| │       ├── train_progressive_stage2.yaml
| │       ├── train_progressive_stage3.yaml
| │       └── pipeline_config.yaml                 # inference / evaluate config
| │
| └── rftrans/                                     # RFTrans reproduction artifacts
|     ├── ckpts/
|     │   ├── rfnet_refractive_flow_epoch500.pth   # 467 MB - RFNet (DRN backbone), Adam 500 ep on unity/train
|     │   ├── f2net_flow2normal_epoch500.pth       # 356 MB - F2Net (simple_unet), Adam 500 ep on unity/train
|     │   ├── mask_adam_epoch195.pth               # 312 MB - mask network (DRN), Adam 200 ep on unity/train, mIoU 0.847
|     │   └── outlines_side_adam_epoch195.pth      # 312 MB - boundary network (DRN side-output), Adam 200 ep on unity/train
|     └── configs/
|         ├── refractive_flow_config.yaml          # RFNet train config (Adam, 500 ep)
|         ├── flow2normal_config.yaml              # F2Net train config (Adam, 500 ep)
|         ├── mask_adam_config.yaml                # mask train config (Adam, 200 ep)
|         ├── outlines_side_adam_config.yaml       # boundary train config (Adam, 200 ep)
|         └── exp017_paperfaithful.yaml            # rgb2normal e2e config (paper-faithful: SGD 100 ep, lr=1e-4 mom=0.9 wd=5e-4)
| ``` |
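After downloading, it can be useful to confirm that nothing is missing before pointing any config at the checkpoints. The following is a minimal sketch, not part of the codebase: the relative paths are copied from the layout above, while the function name `missing_files` and the idea of a single local root directory are assumptions made for illustration.

```python
from pathlib import Path
from typing import List

# Relative paths copied from the file layout above.
EXPECTED_FILES = [
    "progressivedepth/ckpts/lidf_stage1_epoch059.pth",
    "progressivedepth/ckpts/C_stage2_epoch029.pth",
    "progressivedepth/ckpts/C_stage3_epoch029.pth",
    "progressivedepth/configs/train_progressive_stage2.yaml",
    "progressivedepth/configs/train_progressive_stage3.yaml",
    "progressivedepth/configs/pipeline_config.yaml",
    "rftrans/ckpts/rfnet_refractive_flow_epoch500.pth",
    "rftrans/ckpts/f2net_flow2normal_epoch500.pth",
    "rftrans/ckpts/mask_adam_epoch195.pth",
    "rftrans/ckpts/outlines_side_adam_epoch195.pth",
    "rftrans/configs/refractive_flow_config.yaml",
    "rftrans/configs/flow2normal_config.yaml",
    "rftrans/configs/mask_adam_config.yaml",
    "rftrans/configs/outlines_side_adam_config.yaml",
    "rftrans/configs/exp017_paperfaithful.yaml",
]


def missing_files(root: str) -> List[str]:
    """Return the expected relative paths that do not exist under `root`."""
    base = Path(root)
    return [rel for rel in EXPECTED_FILES if not (base / rel).is_file()]
```

Calling `missing_files("GridDepth")` on a complete download should return an empty list; any returned path points at a file that still needs to be fetched.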
|
|
| --- |
|
|
| ## ProgressiveDepth (idea.md series-joint pipeline) |
|
|
| Pipeline: |
| ``` |
| RGB + Noisy Depth |
|     │
|     ▼  Module A: TransDiff Data Preprocessing (ip_basic multi-scale morphological filling)
| Refined Depth1
|     │
|     ▼  Module B: LIDF (Stage 1 frozen + Stage 2 / 3 retrained on Refined1)
| Final Depth |
| ``` |
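To make Module A more concrete, here is a minimal sketch of multi-scale, dilation-style hole filling on a tiny depth grid. It is an illustration only, not the ip_basic or TransDiff implementation: the function names, the window radii, the convention that `0.0` marks a missing pixel, and the choice to copy the smallest (nearest) valid depth in the window are all assumptions.

```python
from typing import List, Sequence

Depth = List[List[float]]  # assumption: 0.0 marks a missing measurement


def fill_once(depth: Depth, radius: int) -> Depth:
    """Fill each missing pixel with the smallest valid depth in its square window."""
    h, w = len(depth), len(depth[0])
    out = [row[:] for row in depth]  # copy so the input is not modified
    for y in range(h):
        for x in range(w):
            if depth[y][x] > 0.0:
                continue  # valid pixels are kept as they are
            window = [
                depth[j][i]
                for j in range(max(0, y - radius), min(h, y + radius + 1))
                for i in range(max(0, x - radius), min(w, x + radius + 1))
                if depth[j][i] > 0.0
            ]
            if window:
                out[y][x] = min(window)
    return out


def multiscale_fill(depth: Depth, radii: Sequence[int] = (1, 2, 3)) -> Depth:
    """Apply the fill with growing windows so larger holes close in later passes."""
    filled = depth
    for radius in radii:
        filled = fill_once(filled, radius)
    return filled
```

Small windows run first so that pixels near valid measurements take local values, and only the holes that remain are filled from farther away.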
|
|
| ### Final results (paper protocol: 256×144 + per-image avg + corrupt mask)
|
|
| C_full = `lidf_stage1_epoch059.pth` + `C_stage2_epoch029.pth` + `C_stage3_epoch029.pth`; evaluation uses mode A (`feed_to_lidf=refined1`):
| |
| | Dataset | C_full RMSE↓ | C_full δ1.05↑ | B baseline RMSE | B baseline δ1.05 | LIDF paper Table 1 (RMSE / δ1.05) | |
| |---|---:|---:|---:|---:|---:| |
| | **real-test (Real-novel)** | **0.0403** | **45.28** | 0.0443 | 40.18 | 0.0250 / 76.21 | |
| | real-val (Real-known) | 0.0351 | 77.22 | 0.0358 | 77.18 | 0.0280 / 82.37 | |
| | synthetic-test (Syn-novel) | 0.0328 | 62.82 | 0.0305 | 66.12 | 0.0280 / 68.62 | |
| | synthetic-val (Syn-known) | 0.0129 | 93.72 | 0.0111 | 96.07 | 0.0120 / 94.79 | |
| |
| **Conclusion**: the idea.md series-joint approach is **effective on real-world data** (Real-novel vs. baseline B: RMSE ↓9%, from 0.0443 to 0.0403; δ1.05 ↑5 pts, from 40.18 to 45.28) but **regresses on synthetic data**, where ip_basic adds noise to already-clean inputs. We attribute the remaining gap to paper Table 1 to the Omniverse Object Dataset being unavailable (link broken since 2025-03, [NVlabs/implicit_depth#3](https://github.com/NVlabs/implicit_depth/issues/3)).
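For reference, the two reported metrics can be sketched as below. This assumes the usual definitions: RMSE over the masked pixels, and δ1.05 as the percentage of masked pixels with `max(pred/gt, gt/pred) < 1.05`, each computed per image and then averaged. The function names and the flat-list inputs are hypothetical, and the actual evaluation code may differ in details such as resizing and invalid-pixel handling.

```python
import math
from typing import Iterable, Sequence, Tuple

Sample = Tuple[Sequence[float], Sequence[float], Sequence[bool]]  # (pred, gt, mask)


def masked_metrics(pred: Sequence[float], gt: Sequence[float],
                   mask: Sequence[bool], threshold: float = 1.05) -> Tuple[float, float]:
    """Return (RMSE, delta accuracy in percent) over pixels with mask set and gt > 0."""
    sq_err, hits, count = 0.0, 0, 0
    for p, g, m in zip(pred, gt, mask):
        if not m or g <= 0.0:
            continue  # outside the evaluation mask or no ground truth
        count += 1
        sq_err += (p - g) ** 2
        if p > 0.0 and max(p / g, g / p) < threshold:
            hits += 1
    if count == 0:
        raise ValueError("mask selects no valid pixels")
    return math.sqrt(sq_err / count), 100.0 * hits / count


def per_image_average(samples: Iterable[Sample]) -> Tuple[float, float]:
    """Average the per-image metrics instead of pooling all pixels together."""
    results = [masked_metrics(pred, gt, mask) for pred, gt, mask in samples]
    if not results:
        raise ValueError("no samples given")
    n = len(results)
    return sum(r[0] for r in results) / n, sum(r[1] for r in results) / n
```

Averaging per image gives every image the same weight regardless of how many pixels its mask covers, which is why the numbers can differ from a pixel-pooled evaluation.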
|
|
| --- |
|
|
| ## RFTrans reproduction |
|
|
| Pipeline (per RFTrans paper §III-C):
| ``` |
| RGB ──> RFNet ──> refractive flow + mask + boundary
|                         │
|                         ├──> F2Net ──> surface normal
|                         │
|                         └──> depth2depth global opt ──> Refined Depth
| ``` |
|
|
| ### Caveats |
|
|
| 1. **Architecture deviation**: paper §III-C says "RFNet predicts mask, boundary, and refractive flow" (multi-task), but the official repo does not implement this. We trained **separate networks**: RFNet predicts only flow, F2Net predicts normals from flow, and mask and boundary are independent DeepLab+DRN networks. This matches the actual repo structure but not the paper text.
| 2. **Optimizer deviation**: paper §IV-A specifies SGD with lr=1e-4, momentum=0.9, weight_decay=5e-4 for 100 epochs. We used **Adam** for sub-network training because, in our runs, SGD with lr=1e-4 from random init **did not converge** (mask val mIoU ~0.46, i.e. chance level, after 100 epochs of SGD vs. 0.85 after 200 epochs of Adam). The provided `exp017_paperfaithful.yaml` is paper-faithful (SGD, 100 epochs) and is used for the **end-to-end fine-tuning stage**, where it warm-starts from the Adam-trained RFNet/F2Net.
| 3. **Training data**: all networks were trained on `data/unity/train/` (5000 RGB + flow + mask + boundary + normal GT, generated with [Unity-RefractiveFlowRender](https://github.com/LJY-XCX/Unity-RefractiveFlowRender)), which is the dataset specified by RFTrans paper §IV-A.
|
|
| ### How to use these RFTrans ckpts |
|
|
| In your `RFTrans/eval_depth_completion/config_*.yaml`: |
| ```yaml |
| rgb2flow:
|   pathWeightsFile: <path_to>/rfnet_refractive_flow_epoch500.pth
| flow2normal:
|   pathWeightsFile: <path_to>/f2net_flow2normal_epoch500.pth
| masks:
|   pathWeightsFile: <path_to>/mask_adam_epoch195.pth  # OR cleargrasp_orig/.../checkpoint_mask.pth
| outlines:
|   pathWeightsFile: <path_to>/outlines_side_adam_epoch195.pth  # OR cleargrasp_orig/.../checkpoint_outlines.pth
| ``` |
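If you prefer to generate that fragment rather than edit the paths by hand, a small helper along these lines can fill them in. This is only a sketch: the section names and checkpoint file names come from the snippet above, whereas `render_weights_yaml` and the single-directory assumption are hypothetical, and the output still has to be merged into your own `config_*.yaml`.

```python
from pathlib import Path

# Config section -> checkpoint file name, as listed in the YAML snippet above.
CKPT_BY_SECTION = {
    "rgb2flow": "rfnet_refractive_flow_epoch500.pth",
    "flow2normal": "f2net_flow2normal_epoch500.pth",
    "masks": "mask_adam_epoch195.pth",
    "outlines": "outlines_side_adam_epoch195.pth",
}


def render_weights_yaml(ckpt_dir: str) -> str:
    """Build the pathWeightsFile entries for checkpoints stored in one directory."""
    base = Path(ckpt_dir)
    lines = []
    for section, filename in CKPT_BY_SECTION.items():
        lines.append(f"{section}:")
        lines.append(f"  pathWeightsFile: {(base / filename).as_posix()}")
    return "\n".join(lines) + "\n"
```

For example, `print(render_weights_yaml("/data/GridDepth/rftrans/ckpts"))` prints the four sections with absolute checkpoint paths.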
|
|
| --- |
|
|
| ## Environment / dependencies |
|
|
| - Python 3.8, PyTorch 2.0.0+cu118
| - LIDF: see [implicit_depth/requirements.txt](https://github.com/atom525/ProgressiveDepth/blob/main/implicit_depth/requirements.txt)
| - RFTrans: needs the `depth2depth` C++ binary and `libhdf5.so` from the conda env
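A quick way to check the two RFTrans runtime requirements before launching an evaluation is sketched below. It uses only the Python standard library; the function name is hypothetical, and it assumes that `depth2depth` is either on `PATH` or passed as an explicit path, and that `libhdf5` is visible to the system loader (for a conda env this may require the env to be activated or `LD_LIBRARY_PATH` to be set).

```python
import ctypes.util
import shutil
from typing import Dict


def check_rftrans_runtime(binary: str = "depth2depth") -> Dict[str, bool]:
    """Report whether the depth2depth executable and an HDF5 library can be located."""
    return {
        # shutil.which accepts a bare command name or an explicit path
        "depth2depth_found": shutil.which(binary) is not None,
        # find_library returns None when the loader cannot see libhdf5
        "libhdf5_found": ctypes.util.find_library("hdf5") is not None,
    }
```

A `False` entry does not necessarily mean the dependency is absent, only that it was not found through the standard lookup paths.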
|
|
| ## License |
|
|
| - LIDF Stage 1 ckpt and code: NVIDIA Source Code License (Non-Commercial), inherited from [NVlabs/implicit_depth](https://github.com/NVlabs/implicit_depth) |
| - RFTrans ckpts and code: inherited from [LJY-XCX/RFTrans](https://github.com/LJY-XCX/RFTrans) license |
| - Our extensions (transdiff_preprocess wrapper, train_progressive trainer, retrains): same as upstream |
|
|
| ## Citation |
|
|
| If you use these checkpoints, please cite the original works:
|
|
| ```bibtex |
| @inproceedings{zhu2021rgbd, |
| title={RGB-D Local Implicit Function for Depth Completion of Transparent Objects}, |
| author={Zhu, Luyang and Mousavian, Arsalan and Xiang, Yu and Mazhar, Hammad and van Eenbergen, Jozef and Debnath, Shoubhik and Fox, Dieter}, |
| booktitle={CVPR}, |
| year={2021} |
| } |
| |
| @article{tang2024rftrans, |
| title={RFTrans: Leveraging Refractive Flow of Transparent Objects for Surface Normal Estimation and Manipulation}, |
| author={Tang, Tutian and Liu, Jiyu and Zhang, Jieyi and Fu, Haoyuan and Xu, Wenqiang and Lu, Cewu}, |
| journal={IEEE Robotics and Automation Letters}, |
| year={2024} |
| } |
| ``` |
|
|