wangkanai committed
Commit 082f9f6 · verified · 1 Parent(s): e804216

Upload folder using huggingface_hub

Files changed (1)
  1. README.md +10 -10
README.md CHANGED
@@ -10,7 +10,7 @@ tags:
  - uncensored
  ---

- <!-- README Version: v1.0 -->
+ <!-- README Version: v1.1 -->

  # Qwen3-VL-2B-Instruct (Abliterated)

@@ -38,15 +38,15 @@ This model can perform tasks such as:

  ```
  qwen3-vl-2b-instruct/
- ├── qwen3-vl-2b-instruct-abliterated.gguf (3.21 GB)
- └── qwen3-vl-2b-instruct-abliterated.safetensors (3.96 GB)
+ ├── qwen3-vl-2b-instruct-abliterated-f16.gguf (3.3 GB)
+ └── qwen3-vl-2b-instruct-abliterated.safetensors (4.0 GB)
  ```

- **Total Repository Size**: ~7.17 GB
+ **Total Repository Size**: ~7.3 GB

  ### File Descriptions

- - **qwen3-vl-2b-instruct-abliterated.gguf** - Quantized GGUF format for efficient inference with llama.cpp and compatible frameworks
+ - **qwen3-vl-2b-instruct-abliterated-f16.gguf** - FP16 quantized GGUF format for efficient inference with llama.cpp and compatible frameworks
  - **qwen3-vl-2b-instruct-abliterated.safetensors** - Full-precision SafeTensors format for use with transformers library

  ## Hardware Requirements

@@ -103,7 +103,7 @@ print(response)
  ```bash
  # Run with llama.cpp
  ./llama.cpp \
- --model "E:\huggingface\qwen3-vl-2b-instruct\qwen3-vl-2b-instruct-abliterated.gguf" \
+ --model "E:\huggingface\qwen3-vl-2b-instruct\qwen3-vl-2b-instruct-abliterated-f16.gguf" \
  --image example.jpg \
  --prompt "What do you see in this image?" \
  --n-predict 256 \

@@ -192,7 +192,7 @@ print(generated_text)

  1. **Use Quantized GGUF for Speed**:
  - GGUF format provides faster inference
- - Lower memory usage (3.21 GB vs 3.96 GB)
+ - Lower memory usage (3.3 GB vs 4.0 GB)
  - Minimal quality loss for most tasks

  2. **GPU Acceleration**:

@@ -306,6 +306,6 @@ This abliterated model has had safety mechanisms removed and may generate content

  ---

- **Model Version**: v1.0
- **Last Updated**: 2025-10-28
- **Format Versions**: SafeTensors (3.96 GB), GGUF (3.21 GB)
+ **Model Version**: v1.1
+ **Last Updated**: 2025-10-30
+ **Format Versions**: SafeTensors (4.0 GB), GGUF FP16 (3.3 GB)
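
The diff above renames the GGUF file to an explicit `-f16` variant and updates every size figure to match. For anyone tracking the repo locally, here is a minimal sketch of fetching just the renamed file with `huggingface_hub` (the library named in the commit message) and sanity-checking its size against the ~3.3 GB quoted in README v1.1; the `repo_id` below is an assumption inferred from the committer and folder name, not something stated in this commit.

```python
# Sketch only: pull the renamed GGUF and compare its on-disk size with the
# ~3.3 GB figure quoted in the updated README. The repo_id is an assumption
# (committer name + folder name from this commit), not taken from the diff.
import os

from huggingface_hub import hf_hub_download

path = hf_hub_download(
    repo_id="wangkanai/qwen3-vl-2b-instruct",              # hypothetical repo id
    filename="qwen3-vl-2b-instruct-abliterated-f16.gguf",  # new name from this commit
)

size_gb = os.path.getsize(path) / 1e9
print(f"{path}\n{size_gb:.2f} GB")  # README v1.1 lists ~3.3 GB for this file
```

Since `hf_hub_download` caches files locally, rerunning this after the commit reuses the already-downloaded copy rather than fetching the full 3.3 GB again.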