	---
	license: apache-2.0

	task_categories:
	- depth-estimation

	tags:
	- depth-estimation
	- 3d-vision
	- multimodal
	- metric-depth

	paper:
	- arxiv: 2605.15876
	---

	# DepthVLM-Bench

	DepthVLM-Bench is a unified indoor-outdoor metric depth estimation benchmark designed for vision-language models (VLMs). The benchmark provides diverse indoor and outdoor scenes with metric depth annotations in a unified VLM-compatible format, enabling large multimodal models to jointly learn dense geometry prediction and multimodal understanding.

	## Features

	- Unified indoor and outdoor metric depth estimation
	- VLM-compatible data format
	- Dense depth supervision for multimodal foundation models
	- Designed for scalable multimodal training

	## Paper

	[Unlocking Dense Metric Depth Estimation in VLMs](https://arxiv.org/abs/2605.15876)

	## Usage

	Please refer to the official repository for:

	- Data preprocessing
	- Evaluation scripts
	- Visualization examples

	Repository: https://github.com/hanxunyu/DepthVLM
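
The evaluation scripts themselves live in the official repository and are not reproduced here. As a rough illustration of what metric depth evaluation typically involves, the sketch below computes two metrics that are commonly reported in the depth-estimation literature: absolute relative error (AbsRel) and threshold accuracy (the fraction of pixels with `max(pred/gt, gt/pred) < 1.25`). This card does not state which metrics DepthVLM-Bench uses, so treat the choice of metrics as an assumption. The function name `depth_metrics`, the flat-list input layout, and the convention that a ground-truth value of 0 marks an invalid pixel are all hypothetical.

```python
from typing import Dict, Sequence


def depth_metrics(
    pred: Sequence[float],
    gt: Sequence[float],
    threshold: float = 1.25,
) -> Dict[str, float]:
    """Compute AbsRel and threshold accuracy over pixels with valid ground truth.

    Hypothetical helper: depths are flat sequences in metres, and gt <= 0
    is assumed to mark a pixel without a depth annotation.
    """
    if len(pred) != len(gt):
        raise ValueError("pred and gt must have the same length")

    # Keep only pixels that have a valid ground-truth depth.
    pairs = [(p, g) for p, g in zip(pred, gt) if g > 0]
    if not pairs:
        raise ValueError("no pixels with valid ground truth")

    # Mean absolute error relative to the true depth.
    abs_rel = sum(abs(p - g) / g for p, g in pairs) / len(pairs)

    # Fraction of pixels whose prediction is within the ratio threshold.
    # Non-positive predictions are counted as failures.
    hits = sum(1 for p, g in pairs if p > 0 and max(p / g, g / p) < threshold)
    delta = hits / len(pairs)

    return {"abs_rel": abs_rel, "delta": delta}


if __name__ == "__main__":
    # Toy example: the last pixel has no ground truth and is ignored.
    predicted = [1.0, 2.2, 6.0, 3.0]
    ground_truth = [1.0, 2.0, 4.0, 0.0]
    print(depth_metrics(predicted, ground_truth))
```

Both metrics are computed in metric units without any scale or shift alignment, which is what distinguishes metric depth evaluation from relative depth evaluation. For the actual protocol, use the scripts in the repository.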

	## Citation

	```bibtex
	@article{yu2026unlocking,
	  title={Unlocking Dense Metric Depth Estimation in VLMs},
	  author={Hanxun Yu and Xuan Qu and Yuxin Wang and Jianke Zhu and Lei Ke},
	  journal={arXiv preprint arXiv:2605.15876},
	  year={2026}
	}
	```
