CPRT-Qwen3-VL-4B-Instruct for Compositional Privacy Risk Evaluation

Model Description

This repository provides a PEFT adapter for Qwen/Qwen3-VL-4B-Instruct that implements the compositional privacy risk evaluation described in "Rethinking Visual Privacy: A Compositional Privacy Risk Framework for Severity Assessment with VLMs" (https://arxiv.org/pdf/2603.21573).

The prompt used to generate privacy scores (between 0.0 and 1.0):

instruction = """
    You are an expert privacy risk analyst. Analyze the image and assign a privacy score between 0.0 and 1.0.
    PRIVACY TAXONOMY (ordered by severity,):
    LEVEL 1 — Severe privacy risk (critical on its own):
    - Biometric data (identifiable faces, fingerprints, iris scans)
    - Government identifiers (passport, driver's license, SSN, national ID)
    LEVEL 2 — High privacy risk (sensitive alone, severe when combined):
    - Non-unique identifiers (names, usernames, workplace ID, date of birth)
    - Medical data (prescriptions, health records, medical imagery)
    - Financial data (credit cards, bank statements, account numbers)
    - Nudity or intimate content
    - Sensitive inferences (race/ethnicity, religion, politics, sexual orientation, disabilities, mental health)
    LEVEL 3 — Moderate privacy risk (becomes risky when combined):
    - Personal attributes (age, gender, physical appearance)
    - Location information (addresses, GPS, identifiable landmarks)
    - Activities & behaviors (habits, routines, consumption patterns)
    LEVEL 4 — Low privacy risk (context-dependent, minor concerns):
    - Property & assets (vehicles, homes without addresses)
    - Generic documents or digital artifacts
    - Metadata or background information
    - Crowds or distant unidentifiable individuals
    SCORING GUIDANCE:
    - Higher taxonomy levels should result in HIGHER scores
    - Level 1 elements indicate scores in the upper range (closer to 1.0)
    - Level 4 elements indicate scores in the lower range (closer to 0.0)
    - NO privacy elements detected  then  score should be 0.0
    - Multiple elements or combinations then consider cumulative risk
    - Use your judgment to assign the appropriate score based on severity
        If you detect ANY privacy-relevant element, the score MUST be > 0.0
    Output only valid JSON:
    {"privacy_score": <float between 0.0 and 1.0>}
"""

Load the Model

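import torch
from PIL import Image
from peft import PeftModel, PeftConfig
from transformers import AutoProcessor, AutoTokenizer, Qwen3VLForConditionalGeneration

# Assumption: let the library place the weights automatically; adjust for your hardware.
device_map = "auto"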

adapter_id = "tiantiaf/CPRT-Qwen3-VL-4B-Instruct"
peft_config = PeftConfig.from_pretrained(adapter_id)

base_model = "Qwen/Qwen3-VL-4B-Instruct"
model = Qwen3VLForConditionalGeneration.from_pretrained(
    base_model,
    low_cpu_mem_usage=True,
    device_map=device_map,
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
)
model = PeftModel.from_pretrained(model, adapter_id)

tokenizer = AutoTokenizer.from_pretrained(base_model)
processor = AutoProcessor.from_pretrained(
    base_model,
    trust_remote_code=True
)

processor.tokenizer.pad_token = processor.tokenizer.eos_token
processor.image_processor.max_pixels = 2048 * 16 * 16  
processor.image_processor.min_pixels = 3136
tokenizer.pad_token = tokenizer.eos_token

terminators = [
    processor.tokenizer.convert_tokens_to_ids("<|im_end|>"),
    processor.tokenizer.convert_tokens_to_ids("<|endoftext|>")
]
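For orientation, the two pixel limits set above work out to the totals shown below. This is a minimal sketch that only does the arithmetic and estimates a uniform downscale factor for an image whose area exceeds the upper limit. The helper `downscale_factor` is hypothetical and is not part of the processor; how the image processor actually rounds or resizes dimensions is not shown here and may differ.

```python
import math

MAX_PIXELS = 2048 * 16 * 16  # upper limit set on the image processor above
MIN_PIXELS = 3136            # lower limit set above (equals 56 * 56)


def downscale_factor(width: int, height: int, max_pixels: int = MAX_PIXELS) -> float:
    """Hypothetical helper: uniform scale factor (<= 1.0) that brings
    width * height at or below max_pixels. Returns 1.0 if already within budget."""
    area = width * height
    if area <= max_pixels:
        return 1.0
    # Scaling both sides by f scales the area by f**2.
    return math.sqrt(max_pixels / area)


print(MAX_PIXELS)                      # 524288
print(downscale_factor(1024, 512))     # 1.0 (area equals the limit)
print(downscale_factor(2048, 1024))    # 0.5 (area is 4x the limit)
```

Under this estimate, a 2048x1024 image would be reduced to roughly 1024x512 before being split into patches.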

Compositional Privacy Risk Evaluation

img = Image.open("YOUR PATH").convert('RGB')
messages = [
    {
        "role": "user", 
        "content": [
            {"type": "image"}, 
            {"type": "text", "text": instruction}
        ]
    }
]

prompt = processor.apply_chat_template(
    messages,
    add_generation_prompt=True
)

inputs = processor(
    text=prompt, 
    images=img, 
    return_tensors="pt"
)
inputs = inputs.to(model.device)

outputs = model.generate(
    **inputs,
    max_new_tokens=32,
    eos_token_id=terminators,
    pad_token_id=tokenizer.pad_token_id,
)

# Keep only the newly generated tokens by slicing off the prompt tokens.
response = outputs[0][inputs["input_ids"].shape[-1]:]
privacy_prediction = tokenizer.decode(response, skip_special_tokens=True)
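The prompt asks the model to return JSON of the form {"privacy_score": <float>}, so the decoded string still needs to be parsed into a number. The following is a minimal sketch of one way to do that. The helper name `parse_privacy_score`, the clamping to [0.0, 1.0], and the choice to return None on malformed output are assumptions for illustration, not part of the released code.

```python
import json
import re
from typing import Optional


def parse_privacy_score(text: str) -> Optional[float]:
    """Hypothetical helper: extract "privacy_score" from the decoded model
    output. Returns None if no usable JSON object or numeric score is found."""
    # Take the first {...} span so that stray text around the JSON is ignored.
    match = re.search(r"\{.*?\}", text, flags=re.DOTALL)
    if match is None:
        return None
    try:
        value = json.loads(match.group(0)).get("privacy_score")
    except json.JSONDecodeError:
        return None
    # Reject non-numeric values (bool is excluded because it subclasses int).
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        return None
    # Assumption: clamp to the range the prompt specifies.
    return min(1.0, max(0.0, float(value)))


print(parse_privacy_score('{"privacy_score": 0.85}'))  # 0.85
print(parse_privacy_score("no json here"))             # None
```

With the variables from the inference snippet, this would be called as `parse_privacy_score(privacy_prediction)`.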

If you have any questions, please contact: Tiantian Feng (tiantiaf@usc.edu)

Please cite our paper if you use our model or find it useful in your work:

@misc{tsaprazlis2026rethinkingvisualprivacycompositional,
      title={Rethinking Visual Privacy: A Compositional Privacy Risk Framework for Severity Assessment with VLMs}, 
      author={Efthymios Tsaprazlis and Tiantian Feng and Anil Ramakrishna and Sai Praneeth Karimireddy and Rahul Gupta and Shrikanth Narayanan},
      year={2026},
      eprint={2603.21573},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2603.21573}, 
}

Responsible use of the Model: the Model is released under an Open RAIL license. Users should respect the privacy and consent of data subjects and adhere to the relevant laws and regulations in their jurisdictions when using the Model.

Out-of-Scope Use

  • Clinical or diagnostic applications
  • Surveillance
  • Privacy-invasive applications
  • Commercial use