prithivMLmods committed on
Commit 2a98b7c · verified · 1 Parent(s): 4353e3b

Update README.md

Files changed (1)
  1. README.md +3 -1
README.md CHANGED
@@ -13,6 +13,8 @@ tags:
  - vision-understanding
  ---
 
+ ![1](https://cdn-uploads.huggingface.co/production/uploads/65bb837dbfb878f46c77de4c/8x4nGmI8dCTi9m6z1BfCE.png)
+
  # **Spatial-VU**
 
  > The **Spatial-VU** model is a fine-tuned version of **Qwen2.5-VL-7B-Instruct**, tailored for **Spatial Reasoning and Vision Understanding**. This variant is designed to generate highly detailed and descriptive captions across a broad range of visual categories, including images with complex, sensitive, or nuanced content—across varying aspect ratios and resolutions.
@@ -93,4 +95,4 @@ print(output_text)
  * May produce explicit, sensitive, or offensive descriptions depending on image content and prompts.
  * Not suitable for deployment in production systems requiring content filtering or moderation.
  * Can exhibit variability in caption tone or style depending on input prompt phrasing.
- * Accuracy for unfamiliar or synthetic visual styles may vary.
+ * Accuracy for unfamiliar or synthetic visual styles may vary.