prithivMLmods committed on
Commit 1992be9 · verified · 1 Parent(s): 53237a4

Update README.md

Files changed (1)
  1. README.md +3 -1
README.md CHANGED
@@ -18,6 +18,8 @@ pipeline_tag: image-text-to-text
 library_name: transformers
 ---
 
+![1](https://cdn-uploads.huggingface.co/production/uploads/65bb837dbfb878f46c77de4c/9IgGRwce_lx4HsDq8fGvW.png)
+
 # **Qwen3-VisionCaption-2B-Thinking**
 
 > **Qwen3-VisionCaption-2B-Thinking** is an abliterated v1.0 variant built upon **Qwen3-VL-2B-Instruct-abliterated-v1**, which originates from the **Qwen3-VL-2B-Instruct** architecture. It is specifically optimized for seamless, high precision image captioning and uncensored visual analysis. The model is engineered for robust caption generation, deep reasoning, and unrestricted descriptive understanding across diverse visual and multimodal contexts.
@@ -107,4 +109,4 @@ print(output_text)
 * May produce explicit, sensitive, or offensive descriptions depending on visual content.
 * Not recommended for production environments requiring strict safety controls.
 * Performance may vary for heavily abstract or synthetic content.
-* Output tone depends on prompt phrasing and detail level settings.
+* Output tone depends on prompt phrasing and detail level settings.