Unlocking Dense Metric Depth Estimation in VLMs
Paper: arXiv:2605.15876
Update 2026-05-18 (v1.0): Initial release
DepthVLM is a unified foundation model for both low-level dense geometry prediction and high-level multimodal understanding. It also runs inference substantially faster than existing VLM-based approaches such as DepthLM and Youtu-VL.
Please refer to the official repository for detailed instructions.
Repository: https://github.com/hanxunyu/DepthVLM
If you find this work useful, please cite:
@article{yu2026unlocking,
title={Unlocking Dense Metric Depth Estimation in VLMs},
author={Hanxun Yu and Xuan Qu and Yuxin Wang and Jianke Zhu and Lei Ke},
journal={arXiv preprint arXiv:2605.15876},
year={2026}
}
Base model: Qwen/Qwen3-VL-4B-Instruct