joedorfman/DepthVLM-Bench-bucket / README.md

metadata

license: apache-2.0
task_categories:
  - depth-estimation
tags:
  - depth-estimation
  - 3d-vision
  - multimodal
  - metric-depth
paper:
  - arxiv: 2605.15876

DepthVLM-Bench

DepthVLM-Bench is a unified indoor-outdoor metric depth estimation benchmark designed for vision-language models (VLMs). The benchmark provides diverse indoor and outdoor scenes with metric depth annotations in a unified VLM-compatible format, enabling large multimodal models to jointly learn dense geometry prediction and multimodal understanding.

Features

Unified indoor and outdoor metric depth estimation
VLM-compatible data format
Dense depth supervision for multimodal foundation models
Designed for scalable multimodal training

Paper

Unlocking Dense Metric Depth Estimation in VLMs (arXiv:2605.15876)

Usage

Please refer to the official repository for:

Data preprocessing
Evaluation scripts
Visualization examples

Repository: https://github.com/hanxunyu/DepthVLM
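The official evaluation scripts live in the repository above and are the reference for how results are scored. For orientation only, here is a minimal sketch of two metrics commonly reported for metric depth estimation: absolute relative error (AbsRel) and the δ < 1.25 accuracy. The function name `depth_metrics`, the `min_depth` validity cutoff, and the choice of these two metrics are assumptions made for illustration; they are not taken from this benchmark's protocol.

```python
import math


def depth_metrics(pred, gt, min_depth=1e-3):
    """Compute AbsRel and delta < 1.25 accuracy over valid pixels.

    pred, gt: flat sequences of depths in meters, same length.
    Pixels whose ground truth is non-finite or <= min_depth are skipped
    (hypothetical validity rule, not the benchmark's official mask).
    """
    abs_rel_sum = 0.0
    inliers = 0
    valid = 0
    for p, g in zip(pred, gt):
        if g <= min_depth or not math.isfinite(g):
            continue  # no usable ground truth at this pixel
        valid += 1
        abs_rel_sum += abs(p - g) / g
        # Symmetric ratio max(p/g, g/p); non-positive predictions never count as inliers.
        ratio = max(p / g, g / p) if p > 0 else math.inf
        if ratio < 1.25:
            inliers += 1
    if valid == 0:
        raise ValueError("no valid ground-truth depths")
    return {"abs_rel": abs_rel_sum / valid, "delta1": inliers / valid}


if __name__ == "__main__":
    # Toy values; the last pixel has no ground truth and is ignored.
    predicted = [1.0, 2.2, 6.0, 3.0]
    ground_truth = [1.0, 2.0, 4.0, 0.0]
    print(depth_metrics(predicted, ground_truth))
```

Both metrics are computed on absolute depths without any scale or shift alignment, which is what distinguishes metric from relative depth evaluation. Use the repository's scripts for any numbers you intend to report.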

Citation

@article{yu2026unlocking,
  title={Unlocking Dense Metric Depth Estimation in VLMs},
  author={Hanxun Yu and Xuan Qu and Yuxin Wang and Jianke Zhu and Lei Ke},
  journal={arXiv preprint arXiv:2605.15876},
  year={2026}
}
