Instructions to use prithivMLmods/Qwen3-VL-4B-Instruct-c_abliterated-v2 with libraries, inference providers, notebooks, and local apps. Follow these links to get started.

Libraries

How to use prithivMLmods/Qwen3-VL-4B-Instruct-c_abliterated-v2 with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("image-text-to-text", model="prithivMLmods/Qwen3-VL-4B-Instruct-c_abliterated-v2")
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"},
            {"type": "text", "text": "What animal is on the candy?"}
        ]
    },
]
# Run the pipeline and print the generated output
print(pipe(text=messages))

# Load model directly
from transformers import AutoProcessor, AutoModelForMultimodalLM

processor = AutoProcessor.from_pretrained("prithivMLmods/Qwen3-VL-4B-Instruct-c_abliterated-v2")
model = AutoModelForMultimodalLM.from_pretrained("prithivMLmods/Qwen3-VL-4B-Instruct-c_abliterated-v2", device_map="auto")
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"},
            {"type": "text", "text": "What animal is on the candy?"}
        ]
    },
]
inputs = processor.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(processor.decode(outputs[0][inputs["input_ids"].shape[-1]:]))

Local Apps

vLLM

How to use prithivMLmods/Qwen3-VL-4B-Instruct-c_abliterated-v2 with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "prithivMLmods/Qwen3-VL-4B-Instruct-c_abliterated-v2"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "prithivMLmods/Qwen3-VL-4B-Instruct-c_abliterated-v2",
		"messages": [
			{
				"role": "user",
				"content": [
					{
						"type": "text",
						"text": "Describe this image in one sentence."
					},
					{
						"type": "image_url",
						"image_url": {
							"url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"
						}
					}
				]
			}
		]
	}'

Use Docker

docker model run hf.co/prithivMLmods/Qwen3-VL-4B-Instruct-c_abliterated-v2

SGLang

How to use prithivMLmods/Qwen3-VL-4B-Instruct-c_abliterated-v2 with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "prithivMLmods/Qwen3-VL-4B-Instruct-c_abliterated-v2" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "prithivMLmods/Qwen3-VL-4B-Instruct-c_abliterated-v2",
		"messages": [
			{
				"role": "user",
				"content": [
					{
						"type": "text",
						"text": "Describe this image in one sentence."
					},
					{
						"type": "image_url",
						"image_url": {
							"url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"
						}
					}
				]
			}
		]
	}'
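
The curl requests above can also be issued from Python. The following is a minimal sketch using only the standard library; the helper names `build_payload` and `chat` are hypothetical, and the response parsing assumes the server returns the usual OpenAI-style body with the reply under `choices[0].message.content`.

```python
import json
import urllib.request

MODEL_ID = "prithivMLmods/Qwen3-VL-4B-Instruct-c_abliterated-v2"


def build_payload(prompt, image_url, model=MODEL_ID):
    """Build the same chat-completions body that the curl examples send."""
    return {
        "model": model,
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": prompt},
                    {"type": "image_url", "image_url": {"url": image_url}},
                ],
            }
        ],
    }


def chat(base_url, prompt, image_url, timeout=120):
    """POST the payload to <base_url>/v1/chat/completions and return the reply text.

    Assumption: the response follows the OpenAI chat-completions shape.
    """
    request = urllib.request.Request(
        f"{base_url.rstrip('/')}/v1/chat/completions",
        data=json.dumps(build_payload(prompt, image_url)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]


# Example call (requires a running server, so it is left commented out):
# print(chat("http://localhost:30000", "Describe this image in one sentence.",
#            "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"))
```

Pass `http://localhost:8000` as `base_url` for the vLLM server or `http://localhost:30000` for the SGLang server, matching the ports used in the curl examples.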

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "prithivMLmods/Qwen3-VL-4B-Instruct-c_abliterated-v2" \
        --host 0.0.0.0 \
        --port 30000
# Call the server with the same curl request as in the pip-based SGLang setup above.

Docker Model Runner
How to use prithivMLmods/Qwen3-VL-4B-Instruct-c_abliterated-v2 with Docker Model Runner:
```
docker model run hf.co/prithivMLmods/Qwen3-VL-4B-Instruct-c_abliterated-v2
```

You need to agree to share your contact information to access this model

This repository is publicly accessible, but you have to accept the conditions to access its files and content.

Qwen3-VL-4B-Instruct-c_abliterated-v2

Qwen3-VL-4B-Instruct-c_abliterated-v2 is a modified version of Qwen3-VL-4B-Instruct. This v2 release focuses on Continual Abliteration, a process that systematically removes refusal mechanisms through repeated training iterations. The resulting model is intended for detailed reasoning and captioning across complex, nuanced, or restricted visual content.

Key Highlights

Continual Abliteration (c_abliterated): Trained over repeated iterations to target and neutralize refusal vectors, so that the model answers directly where standard models might refuse.
High-Fidelity Reasoning: Goes beyond simple tagging to provide deep reasoning and context-aware descriptions for artistic, technical, and abstract imagery.
Unrestricted Multimodal Analysis: Optimized for research, red-teaming, and datasets where unfiltered visual interpretation is necessary for thorough analysis.
Flexible Aspect Ratios: Maintains spatial awareness and accuracy across wide, tall, square, and non-standard image dimensions.
Enhanced Instruction Following: Builds on the base Qwen3-VL-4B-Instruct model to handle complex, multi-step prompts involving visual data.

Base Model Signatures:

This model is derived from the base model https://huggingface.co/huihui-ai/Huihui-Qwen3-VL-4B-Instruct-abliterated and has been re-sharded and optimized for the latest Transformers version.

Quick Start with Transformers

from transformers import Qwen3VLForConditionalGeneration, AutoProcessor
from qwen_vl_utils import process_vision_info
import torch

# Load the v2 c_abliterated model
model = Qwen3VLForConditionalGeneration.from_pretrained(
    "prithivMLmods/Qwen3-VL-4B-Instruct-c_abliterated-v2", 
    torch_dtype="auto", 
    device_map="auto"
)

processor = AutoProcessor.from_pretrained("prithivMLmods/Qwen3-VL-4B-Instruct-c_abliterated-v2")

messages = [
    {
        "role": "user",
        "content": [
            {
                "type": "image",
                "image": "https://qianwen-res.oss-cn-beijing.aliyuncs.com/Qwen-VL/assets/demo.jpeg",
            },
            {"type": "text", "text": "Provide a detailed caption and reasoning for this image."},
        ],
    }
]

text = processor.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
image_inputs, video_inputs = process_vision_info(messages)
inputs = processor(
    text=[text],
    images=image_inputs,
    videos=video_inputs,
    padding=True,
    return_tensors="pt",
)
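# Move inputs to the device the model was loaded on (device_map="auto" may not select "cuda")
inputs = inputs.to(model.device)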

generated_ids = model.generate(**inputs, max_new_tokens=256)
generated_ids_trimmed = [
    out_ids[len(in_ids):] for in_ids, out_ids in zip(inputs.input_ids, generated_ids)
]
output_text = processor.batch_decode(
    generated_ids_trimmed, skip_special_tokens=True, clean_up_tokenization_spaces=False
)
print(output_text)
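
`model.generate` returns each prompt followed by its newly generated tokens, which is why the snippet above slices every output by the length of its input before decoding. As a small illustration with plain Python lists standing in for tensors (the token IDs and the helper name `trim_prompt_tokens` are made up for this sketch):

```python
def trim_prompt_tokens(input_ids, generated_ids):
    """Drop the echoed prompt from the front of each generated sequence."""
    return [out[len(inp):] for inp, out in zip(input_ids, generated_ids)]


# Two prompts of equal (padded) length and their generated sequences.
prompt_batch = [[101, 7, 8], [101, 9, 10]]
generated_batch = [[101, 7, 8, 42, 43], [101, 9, 10, 55]]

print(trim_prompt_tokens(prompt_batch, generated_batch))  # [[42, 43], [55]]
```

Only the trimmed IDs are passed to `batch_decode`, so the printed text contains the model's answer without the prompt.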

Intended Use

Refusal Research: Evaluating how vision-language models behave when standard guardrails are removed through iterative training.
Complex Dataset Captioning: Generating descriptive metadata for medical, forensic, or controversial historical archives.
Red-Teaming: Assisting security researchers in testing the limits of multimodal safety filters.
Creative Freedom: Enabling artists and writers to generate descriptions for "edge-case" visual concepts without model-imposed refusals.
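
For the dataset captioning use case, one possible workflow is to build one chat message per image and write the results to a JSONL file. The sketch below is an assumption about how this might be organized, not part of the model's tooling: `build_caption_messages` and `caption_folder` are hypothetical helpers, and `caption_fn` is any callable that takes a message list and returns a caption string (for example, a wrapper around the Quick Start generation code).

```python
import json
from pathlib import Path

# Assumed set of image extensions to pick up; adjust to the dataset.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".webp"}
DEFAULT_PROMPT = "Provide a detailed caption and reasoning for this image."


def build_caption_messages(image_path, prompt=DEFAULT_PROMPT):
    """Build a single-turn message list in the same shape as the Quick Start example."""
    return [
        {
            "role": "user",
            "content": [
                {"type": "image", "image": str(image_path)},
                {"type": "text", "text": prompt},
            ],
        }
    ]


def caption_folder(image_dir, output_path, caption_fn):
    """Caption every image in image_dir and write one JSON record per line.

    caption_fn(messages) -> str is supplied by the caller.
    Returns the number of images captioned.
    """
    count = 0
    with open(output_path, "w", encoding="utf-8") as out:
        for image_path in sorted(Path(image_dir).iterdir()):
            if image_path.suffix.lower() not in IMAGE_SUFFIXES:
                continue  # skip non-image files
            caption = caption_fn(build_caption_messages(image_path))
            record = {"image": image_path.name, "caption": caption}
            out.write(json.dumps(record) + "\n")
            count += 1
    return count
```

Keeping the model call behind `caption_fn` means the file handling can be checked with a stub before any GPU time is spent.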

Limitations & Ethics

Warning: As a c_abliterated model, this version will not refuse prompts based on typical safety guidelines.

Explicit Content: The model may generate graphic, explicit, or offensive text based on image input.
Non-Production Use: This model is intended for research and controlled environments, not for general-purpose public applications.
Factual Accuracy: While reasoning is enhanced, the model can still hallucinate or misinterpret highly abstract or synthetic visuals.