LingBot-Depth
LingBot-Depth transforms incomplete and noisy depth sensor data into high-quality, metric-accurate 3D measurements. This is the general-purpose pretrained model for depth refinement tasks.
LingBot-Depth employs a masked depth modeling (MDM) approach that treats missing depth measurements from RGB-D sensors not as noise, but as a natural masking signal that highlights geometric ambiguities. The model learns joint representations from RGB appearance context and valid depth observations, enabling robust depth reasoning under incomplete observations.
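The masking idea above can be sketched in a few lines: missing sensor readings (conventionally encoded as zeros in RGB-D depth maps) directly define the validity mask, so no synthetic masking schedule is needed. This is a minimal illustrative sketch, not the released training code; the helper name is hypothetical.

```python
import numpy as np

def make_mdm_inputs(depth, rgb):
    """Build masked-depth-modeling inputs from a raw sensor depth map.

    Missing sensor readings (zeros) act as the natural mask: the model is
    asked to reconstruct depth at those pixels from RGB appearance context
    and the surrounding valid depth. Hypothetical helper for illustration.
    """
    valid = depth > 0                           # sensor returned a reading here
    masked_depth = np.where(valid, depth, 0.0)  # zero out missing measurements
    return masked_depth, valid, rgb

# Toy 4x4 depth map (meters) with sensor dropout (zeros) and a dummy RGB image.
depth = np.array([[1.2, 0.0, 1.1, 1.0],
                  [0.0, 1.3, 0.0, 1.1],
                  [1.4, 0.0, 1.2, 0.0],
                  [1.5, 1.4, 0.0, 1.2]])
rgb = np.zeros((4, 4, 3))

masked, mask, _ = make_mdm_inputs(depth, rgb)
print(int(mask.sum()))  # count of valid depth pixels
```

During training, the reconstruction loss would then be taken over the missing (or additionally masked) pixels, with the valid observations and RGB serving as conditioning context.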
| Model | Hugging Face Model | ModelScope Model | Description |
|---|---|---|---|
| LingBot-Depth | robbyant/lingbot-depth-pretrain-vitl-14 | robbyant/lingbot-depth-pretrain-vitl-14 | General-purpose depth refinement |
| LingBot-Depth-DC | robbyant/lingbot-depth-postrain-dc-vitl14 | robbyant/lingbot-depth-postrain-dc-vitl14 | Optimized for sparse depth completion |
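A minimal sketch of choosing a checkpoint from the table above by task; the task keys and the `pick_repo` helper are hypothetical conveniences, while the repo IDs are copied verbatim from the table.

```python
# Checkpoint repo IDs from the table above.
REPOS = {
    "refine": "robbyant/lingbot-depth-pretrain-vitl-14",      # general-purpose refinement
    "complete": "robbyant/lingbot-depth-postrain-dc-vitl14",  # sparse depth completion
}

def pick_repo(task: str) -> str:
    """Map a task name to its checkpoint repo ID (hypothetical helper)."""
    return REPOS[task]

# Example download (requires network and `pip install huggingface_hub`):
#   from huggingface_hub import snapshot_download
#   local_dir = snapshot_download(repo_id=pick_repo("refine"))
print(pick_repo("refine"))
```

The same repo IDs are mirrored on ModelScope, so the equivalent ModelScope download utilities can be pointed at the identical names.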
```bibtex
@article{lingbot-depth2026,
  title={Masked Depth Modeling for Spatial Perception},
  author={Tan, Bin and Sun, Changjiang and Qin, Xiage and Adai, Hanat and Fu, Zelin and Zhou, Tianxiang and Zhang, Han and Xu, Yinghao and Zhu, Xing and Shen, Yujun and Xue, Nan},
  journal={arXiv preprint arXiv:2601.17895},
  year={2026}
}
```