You need to agree to share your contact information to access this model

This repository is publicly accessible, but you have to accept the conditions to access its files and content.

SpatialBot is a vision-language model (VLM) with spatial understanding and reasoning abilities. It precisely understands depth maps and uses them to perform high-level tasks.

In this Hugging Face repo, we provide the LoRA checkpoints of SpatialBot-3B, which is based on Phi-2 and SigLIP. It performs well on general VLM tasks and on spatial understanding benchmarks such as SpatialBench.

You will also need to download the pretrained checkpoint.
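As a rough sketch of fetching the LoRA files from this repo, the snippet below uses `snapshot_download` from the `huggingface_hub` package. The helper names, the default `local_dir`, and reading the token from the `HF_TOKEN` environment variable are illustrative assumptions, not part of the SpatialBot code. It does not cover the pretrained checkpoint, whose location is not given here.

```python
import os

REPO_ID = "RussRobin/SpatialBot-3B-LoRA"  # repo name as shown on this page


def build_download_kwargs(local_dir="checkpoints/SpatialBot-3B-LoRA", token=None):
    """Assemble keyword arguments for huggingface_hub.snapshot_download.

    Hypothetical helper: the default local_dir is an arbitrary choice.
    """
    # Fall back to the HF_TOKEN environment variable when no token is passed.
    token = token or os.environ.get("HF_TOKEN")
    return {"repo_id": REPO_ID, "local_dir": local_dir, "token": token}


def download_lora_checkpoint(**overrides):
    """Download the LoRA files and return the local folder path.

    Needs network access and an account that has accepted the
    access conditions of this repo.
    """
    # Imported lazily so build_download_kwargs works without the package.
    from huggingface_hub import snapshot_download

    return snapshot_download(**build_download_kwargs(**overrides))


# Example call (performs a real download):
# path = download_lora_checkpoint(local_dir="checkpoints/SpatialBot-3B-LoRA")
```

Because access requires accepting the conditions, the download will likely fail unless the token belongs to an account that has already accepted them on the model page.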

Paper for RussRobin/SpatialBot-3B-LoRA

SpatialBot: Precise Spatial Understanding with Vision Language Models

Paper • 2406.13642 • Published Jun 19, 2024

Paper:

GitHub repo:

SpatialBench, the benchmark:

Merged SpatialBot-3B:
