prithivMLmods/Spatial-VU


prithivMLmods committed on Oct 19, 2025

Commit 2a98b7c · verified · 1 Parent(s): 4353e3b

Update README.md

Files changed (1)

README.md +3 -1


@@ -13,6 +13,8 @@ tags:
 - vision-understanding
 ---
+![1](https://cdn-uploads.huggingface.co/production/uploads/65bb837dbfb878f46c77de4c/8x4nGmI8dCTi9m6z1BfCE.png)
 # **Spatial-VU**
 > The **Spatial-VU** model is a fine-tuned version of **Qwen2.5-VL-7B-Instruct**, tailored for **Spatial Reasoning and Vision Understanding**. This variant is designed to generate highly detailed and descriptive captions across a broad range of visual categories, including images with complex, sensitive, or nuanced content—across varying aspect ratios and resolutions.
@@ -93,4 +95,4 @@ print(output_text)
 * May produce explicit, sensitive, or offensive descriptions depending on image content and prompts.
 * Not suitable for deployment in production systems requiring content filtering or moderation.
 * Can exhibit variability in caption tone or style depending on input prompt phrasing.
-* Accuracy for unfamiliar or synthetic visual styles may vary.
+* Accuracy for unfamiliar or synthetic visual styles may vary.