JonnyYu828/DepthVLM-4B

Depth Estimation

image-text-to-text

vision-language-model


JonnyYu828 committed 15 days ago

Commit 1b7c7cf (verified) · 1 Parent(s): b3b0bda

Update README.md

Files changed (1)

README.md +1 -1

@@ -49,7 +49,7 @@ DepthVLM serves as **a unified foundation model for both low-level dense geometr
 By attaching a lightweight depth head to the LLM backbone and adopting a two-stage supervision paradigm, DepthVLM transforms a single VLM into a native dense geometry predictor, while preserving its multimodal capabilities and enhancing its spatial reasoning.
-### Key Characteristics
+## 🧠 Key Characteristics
 - **Native dense metric depth estimation in VLMs**: Directly predicts geometry within the VLM framework.