Instructions to use tiantiaf/CPRT-Qwen3-VL-4B-Instruct with libraries and local apps.

Libraries

How to use tiantiaf/CPRT-Qwen3-VL-4B-Instruct with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("image-text-to-text", model="tiantiaf/CPRT-Qwen3-VL-4B-Instruct")

# Load model directly
from transformers import AutoModel
model = AutoModel.from_pretrained("tiantiaf/CPRT-Qwen3-VL-4B-Instruct", dtype="auto")

Local Apps

vLLM

How to use tiantiaf/CPRT-Qwen3-VL-4B-Instruct with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "tiantiaf/CPRT-Qwen3-VL-4B-Instruct"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "tiantiaf/CPRT-Qwen3-VL-4B-Instruct",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'
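
The curl call above sends a text-only prompt, while this model scores images. As a minimal sketch, assuming the server also exposes the OpenAI-compatible `/v1/chat/completions` route with `image_url` content parts, the request could be built as shown below. The helper names `build_chat_payload` and `send` are hypothetical, the port is the one used above, and whether the server can load this adapter repository directly has not been verified here.

```python
import base64
import json
import urllib.request

MODEL_ID = "tiantiaf/CPRT-Qwen3-VL-4B-Instruct"


def build_chat_payload(image_bytes: bytes, instruction: str,
                       mime: str = "image/jpeg") -> dict:
    """Build a chat-completions request body with one image and one text part."""
    # Embed the image as a base64 data URL so no file server is needed.
    encoded = base64.b64encode(image_bytes).decode("ascii")
    data_url = f"data:{mime};base64,{encoded}"
    return {
        "model": MODEL_ID,
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "image_url", "image_url": {"url": data_url}},
                    {"type": "text", "text": instruction},
                ],
            }
        ],
        "max_tokens": 32,
        "temperature": 0.0,
    }


def send(payload: dict,
         url: str = "http://localhost:8000/v1/chat/completions") -> dict:
    """POST the payload to a running server and return the decoded JSON reply."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

With a server running, `send(build_chat_payload(open("photo.jpg", "rb").read(), instruction))` would submit the image together with the privacy prompt given in the model description; `photo.jpg` is a placeholder path.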

Use Docker

docker model run hf.co/tiantiaf/CPRT-Qwen3-VL-4B-Instruct

SGLang

How to use tiantiaf/CPRT-Qwen3-VL-4B-Instruct with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "tiantiaf/CPRT-Qwen3-VL-4B-Instruct" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "tiantiaf/CPRT-Qwen3-VL-4B-Instruct",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "tiantiaf/CPRT-Qwen3-VL-4B-Instruct" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "tiantiaf/CPRT-Qwen3-VL-4B-Instruct",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

CPRT-Qwen3-VL-4B-Instruct for Compositional Privacy Risk Evaluation

Model Description

This model implements the compositional privacy risk evaluation described in Rethinking Visual Privacy: A Compositional Privacy Risk Framework for Severity Assessment with VLMs (https://arxiv.org/pdf/2603.21573).

The prompt used to generate privacy scores (between 0 and 1):

instruction = """
    You are an expert privacy risk analyst. Analyze the image and assign a privacy score between 0.0 and 1.0.
    PRIVACY TAXONOMY (ordered by severity,):
    LEVEL 1 — Severe privacy risk (critical on its own):
    - Biometric data (identifiable faces, fingerprints, iris scans)
    - Government identifiers (passport, driver's license, SSN, national ID)
    LEVEL 2 — High privacy risk (sensitive alone, severe when combined):
    - Non-unique identifiers (names, usernames, workplace ID, date of birth)
    - Medical data (prescriptions, health records, medical imagery)
    - Financial data (credit cards, bank statements, account numbers)
    - Nudity or intimate content
    - Sensitive inferences (race/ethnicity, religion, politics, sexual orientation, disabilities, mental health)
    LEVEL 3 — Moderate privacy risk (becomes risky when combined):
    - Personal attributes (age, gender, physical appearance)
    - Location information (addresses, GPS, identifiable landmarks)
    - Activities & behaviors (habits, routines, consumption patterns)
    LEVEL 4 — Low privacy risk (context-dependent, minor concerns):
    - Property & assets (vehicles, homes without addresses)
    - Generic documents or digital artifacts
    - Metadata or background information
    - Crowds or distant unidentifiable individuals
    SCORING GUIDANCE:
    - Higher taxonomy levels should result in HIGHER scores
    - Level 1 elements indicate scores in the upper range (closer to 1.0)
    - Level 4 elements indicate scores in the lower range (closer to 0.0)
    - NO privacy elements detected  then  score should be 0.0
    - Multiple elements or combinations then consider cumulative risk
    - Use your judgment to assign the appropriate score based on severity
        If you detect ANY privacy-relevant element, the score MUST be > 0.0
    Output only valid JSON:
    {"privacy_score": <float between 0.0 and 1.0>}
"""

Load the Model

import torch
from peft import PeftModel, PeftConfig
from transformers import AutoProcessor, AutoTokenizer, Qwen3VLForConditionalGeneration

adapter_id = "tiantiaf/CPRT-Qwen3-VL-4B-Instruct"
peft_config = PeftConfig.from_pretrained(adapter_id)

# Assumption: the original snippet leaves device_map undefined;
# "auto" lets Accelerate place the weights on the available devices.
device_map = "auto"

# Load the base model first, then attach the CPRT adapter weights to it
base_model = "Qwen/Qwen3-VL-4B-Instruct"
model = Qwen3VLForConditionalGeneration.from_pretrained(
    base_model,
    low_cpu_mem_usage=True,
    device_map=device_map,
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
)
model = PeftModel.from_pretrained(model, adapter_id)

tokenizer = AutoTokenizer.from_pretrained(base_model)
processor = AutoProcessor.from_pretrained(
    base_model,
    trust_remote_code=True
)

# Reuse the EOS token for padding and bound the image resolution
processor.tokenizer.pad_token = processor.tokenizer.eos_token
processor.image_processor.max_pixels = 2048 * 16 * 16
processor.image_processor.min_pixels = 3136
tokenizer.pad_token = tokenizer.eos_token

# Token IDs that end generation
terminators = [
    processor.tokenizer.convert_tokens_to_ids("<|im_end|>"),
    processor.tokenizer.convert_tokens_to_ids("<|endoftext|>")
]
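
The two pixel bounds set above are plain arithmetic: `2048 * 16 * 16` is 524288 pixels and `3136` is `56 * 56`. The sketch below only illustrates what those bounds imply for an input image's area. The helper `area_scale` is hypothetical and is not the processor's actual resize routine, which may also round dimensions.

```python
import math

MAX_PIXELS = 2048 * 16 * 16  # 524288, the upper bound set above
MIN_PIXELS = 3136            # 56 * 56, the lower bound set above


def area_scale(width: int, height: int,
               min_pixels: int = MIN_PIXELS,
               max_pixels: int = MAX_PIXELS) -> float:
    """Return the linear scale factor that brings width * height within bounds."""
    area = width * height
    if area > max_pixels:
        # Shrink: scaling both sides by s changes the area by s squared.
        return math.sqrt(max_pixels / area)
    if area < min_pixels:
        # Enlarge very small images up to the minimum area.
        return math.sqrt(min_pixels / area)
    return 1.0


print(MAX_PIXELS)               # 524288
print(area_scale(2048, 1024))   # 0.5
```

Under these bounds a 2048x1024 image has four times the allowed area, so each side would be roughly halved, while a 512x512 image already fits and needs no scaling.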

Compositional Privacy Risk Evaluation

from PIL import Image

# Replace "YOUR PATH" with the path to the image to evaluate
img = Image.open("YOUR PATH").convert('RGB')
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image"},
            {"type": "text", "text": instruction}
        ]
    }
]

# Render the chat messages into the model's prompt format
prompt = processor.apply_chat_template(
    messages,
    add_generation_prompt=True
)

inputs = processor(
    text=prompt,
    images=img,
    return_tensors="pt"
)
inputs = inputs.to(model.device)

outputs = model.generate(
    **inputs,
    max_new_tokens=32,
    eos_token_id=terminators,
    pad_token_id=tokenizer.pad_token_id,
)

# Drop the prompt tokens so only the newly generated tokens are decoded
prompt_length = inputs["input_ids"].shape[-1]
response = outputs[0][prompt_length:]
privacy_prediction = tokenizer.decode(response, skip_special_tokens=True)
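
The prompt asks the model to return JSON of the form `{"privacy_score": <float>}`, so the decoded string still has to be parsed before the score can be used. The following is a minimal sketch of one way to do that. The helper name `parse_privacy_score`, the clamping to [0, 1], and the `None` fallback are assumptions for illustration, not part of the released code.

```python
import json
import re
from typing import Optional


def parse_privacy_score(text: str) -> Optional[float]:
    """Extract the privacy score from generated text, or None if it is absent."""
    # Take the first {...} span in case extra text surrounds the JSON object.
    match = re.search(r"\{.*?\}", text, flags=re.DOTALL)
    if match is None:
        return None
    try:
        payload = json.loads(match.group(0))
        score = float(payload["privacy_score"])
    except (json.JSONDecodeError, KeyError, TypeError, ValueError):
        return None
    # Clamp to the range the prompt asks for.
    return min(1.0, max(0.0, score))


print(parse_privacy_score('{"privacy_score": 0.85}'))  # 0.85
```

In the pipeline above this would be called as `parse_privacy_score(privacy_prediction)`; a `None` result signals output that did not follow the requested format.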

If you have any questions, please contact: Tiantian Feng (tiantiaf@usc.edu)

Kindly cite our paper if you use our model or find it useful in your work:

@misc{tsaprazlis2026rethinkingvisualprivacycompositional,
      title={Rethinking Visual Privacy: A Compositional Privacy Risk Framework for Severity Assessment with VLMs}, 
      author={Efthymios Tsaprazlis and Tiantian Feng and Anil Ramakrishna and Sai Praneeth Karimireddy and Rahul Gupta and Shrikanth Narayanan},
      year={2026},
      eprint={2603.21573},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2603.21573}, 
}

Responsible use of the Model: the Model is released under the Open RAIL license. Users should respect the privacy and consent of data subjects and adhere to the relevant laws and regulations in their jurisdictions when using the model.

❌ Out-of-Scope Use

Clinical or diagnostic applications
Surveillance
Privacy-invasive applications
Commercial use

Model tree for tiantiaf/CPRT-Qwen3-VL-4B-Instruct

Base model: Qwen/Qwen3-VL-4B-Instruct

Paper for tiantiaf/CPRT-Qwen3-VL-4B-Instruct

Rethinking Visual Privacy: A Compositional Privacy Risk Framework for Severity Assessment with VLMs (arXiv:2603.21573, published Mar 23)