rayruiyang committed
Commit b18ec54 · verified · 1 Parent(s): d69143e

Update README.md

Files changed (1): README.md +3 -2
README.md CHANGED
@@ -4,7 +4,7 @@ library_name: transformers
  pipeline_tag: image-text-to-text
  ---
 
- # VST-7B-RL
+ # Visual Spatial Tuning: VST-7B-RL
 
  <p align="left">
  <a href="https://yangr116.github.io/vst_project/">
@@ -27,8 +27,9 @@ pipeline_tag: image-text-to-text
  </a>
  </p>
 
+ This model is described in the paper [Visual Spatial Tuning](https://huggingface.co/papers/2511.05491).
 
- We introduce **Visual Spatial Tuning (VST)**, a comprehensive framework designed to cultivate Vision-Language Models (VLMs) with human-like visuospatial abilities—from spatial perception to advanced reasoning.
+ TL;DR: VST is a comprehensive framework designed to cultivate Vision-Language Models (VLMs) with human-like visuospatial abilities—from spatial perception to advanced reasoning.
 
 
  ## 💡 Key Highlights