+---
+base_model:
+- Qwen/Qwen3-VL-2B-Instruct
+license: openrail
+pipeline_tag: image-text-to-text
+tags:
+- image
+- privacy
+library_name: transformers
+datasets:
+- timtsapras23/CPRT-Bench
+metrics:
+- mae
+---
+# CPRT-Qwen3-VL-2B-Instruct for Compositional Privacy Risk Evaluation
+## Model Description
+This model implements the compositional privacy risk evaluation described in *Rethinking Visual Privacy: A Compositional Privacy Risk Framework for Severity Assessment with VLMs* (https://arxiv.org/pdf/2603.21573). It is distributed as a PEFT adapter that is loaded on top of Qwen/Qwen3-VL-2B-Instruct.
+## Prompt Used to Generate Privacy Scores (0-1)
+```python
+instruction = """
+    You are an expert privacy risk analyst. Analyze the image and assign a privacy score between 0.0 and 1.0.
+    PRIVACY TAXONOMY (ordered by severity,):
+    LEVEL 1 — Severe privacy risk (critical on its own):
+    - Biometric data (identifiable faces, fingerprints, iris scans)
+    - Government identifiers (passport, driver's license, SSN, national ID)
+    LEVEL 2 — High privacy risk (sensitive alone, severe when combined):
+    - Non-unique identifiers (names, usernames, workplace ID, date of birth)
+    - Medical data (prescriptions, health records, medical imagery)
+    - Financial data (credit cards, bank statements, account numbers)
+    - Nudity or intimate content
+    - Sensitive inferences (race/ethnicity, religion, politics, sexual orientation, disabilities, mental health)
+    LEVEL 3 — Moderate privacy risk (becomes risky when combined):
+    - Personal attributes (age, gender, physical appearance)
+    - Location information (addresses, GPS, identifiable landmarks)
+    - Activities & behaviors (habits, routines, consumption patterns)
+    LEVEL 4 — Low privacy risk (context-dependent, minor concerns):
+    - Property & assets (vehicles, homes without addresses)
+    - Generic documents or digital artifacts
+    - Metadata or background information
+    - Crowds or distant unidentifiable individuals
+    SCORING GUIDANCE:
+    - Higher taxonomy levels should result in HIGHER scores
+    - Level 1 elements indicate scores in the upper range (closer to 1.0)
+    - Level 4 elements indicate scores in the lower range (closer to 0.0)
+    - NO privacy elements detected  then  score should be 0.0
+    - Multiple elements or combinations then consider cumulative risk
+    - Use your judgment to assign the appropriate score based on severity
+        If you detect ANY privacy-relevant element, the score MUST be > 0.0
+    Output only valid JSON:
+    {"privacy_score": <float between 0.0 and 1.0>}
+"""
+```
+## Load the Model
+```python
+import torch
+from peft import PeftModel, PeftConfig
+from transformers import AutoProcessor, AutoTokenizer, Qwen3VLForConditionalGeneration
+
+adapter_id = "tiantiaf/CPRT-Qwen3-VL-2B-Instruct"
+peft_config = PeftConfig.from_pretrained(adapter_id)
+base_model = "Qwen/Qwen3-VL-2B-Instruct"
+# Assumption: "auto" lets Accelerate pick device placement; adjust for your hardware
+device_map = "auto"
+
+# Load the base model, then attach the CPRT adapter weights
+model = Qwen3VLForConditionalGeneration.from_pretrained(
+    base_model,
+    low_cpu_mem_usage=True,
+    device_map=device_map,
+    torch_dtype=torch.bfloat16,
+    trust_remote_code=True,
+)
+model = PeftModel.from_pretrained(model, adapter_id)
+
+# The tokenizer and processor come from the base model
+tokenizer = AutoTokenizer.from_pretrained(base_model)
+processor = AutoProcessor.from_pretrained(
+    base_model,
+    trust_remote_code=True
+)
+# Reuse EOS as the padding token and bound the image resolution
+processor.tokenizer.pad_token = processor.tokenizer.eos_token
+processor.image_processor.max_pixels = 2048 * 16 * 16
+processor.image_processor.min_pixels = 3136
+tokenizer.pad_token = tokenizer.eos_token
+
+# Token IDs that end generation
+terminators = [
+    processor.tokenizer.convert_tokens_to_ids("<|im_end|>"),
+    processor.tokenizer.convert_tokens_to_ids("<|endoftext|>")
+]
+```
+## Compositional Privacy Risk Evaluation
+```python
+from PIL import Image
+
+# Replace "YOUR PATH" with the path to the image to evaluate
+img = Image.open("YOUR PATH").convert('RGB')
+messages = [
+    {
+        "role": "user",
+        "content": [
+            {"type": "image"},
+            {"type": "text", "text": instruction}
+        ]
+    }
+]
+prompt = processor.apply_chat_template(
+    messages,
+    add_generation_prompt=True
+)
+inputs = processor(
+    text=prompt,
+    images=img,
+    return_tensors="pt"
+)
+inputs = inputs.to(model.device)
+outputs = model.generate(
+    **inputs,
+    max_new_tokens=32,
+    eos_token_id=terminators,
+    pad_token_id=tokenizer.pad_token_id,
+)
+# Keep only the newly generated tokens (drop the prompt tokens)
+response = outputs[0][inputs["input_ids"].shape[-1]:]
+privacy_prediction = tokenizer.decode(response, skip_special_tokens=True)
+```
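The decoded `privacy_prediction` is a string that the prompt asks to be JSON of the form `{"privacy_score": <float>}`. The following is a minimal sketch of how that string might be turned into a number. The helper name `parse_privacy_score`, the regex fallback for truncated output, and the clamping to [0.0, 1.0] are assumptions for illustration and are not part of the released model or the paper.

```python
import json
import re
from typing import Optional


def parse_privacy_score(text: str) -> Optional[float]:
    """Extract a privacy score from the model's text output (hypothetical helper).

    Returns a float clamped to [0.0, 1.0], or None if no score is found.
    """
    # First attempt: locate a JSON object and read its "privacy_score" field.
    match = re.search(r"\{.*?\}", text, flags=re.DOTALL)
    if match:
        try:
            value = json.loads(match.group(0)).get("privacy_score")
            if isinstance(value, (int, float)) and not isinstance(value, bool):
                return min(max(float(value), 0.0), 1.0)
        except json.JSONDecodeError:
            pass
    # Fallback: the JSON may be malformed or cut off, so look for a bare number
    # following the key.
    match = re.search(r'"?privacy_score"?\s*[:=]\s*(-?\d+(?:\.\d+)?)', text)
    if match:
        return min(max(float(match.group(1)), 0.0), 1.0)
    return None
```

With the variables from the snippet above, usage would be `score = parse_privacy_score(privacy_prediction)`. A `None` result signals that the output did not contain a readable score and should be handled by the caller.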
+## Contact
+If you have any questions, please contact Tiantian Feng (tiantiaf@usc.edu).
+## Citation
+Please cite our paper if you use our model or find it useful in your work:
+```bibtex
+@misc{tsaprazlis2026rethinkingvisualprivacycompositional,
+      title={Rethinking Visual Privacy: A Compositional Privacy Risk Framework for Severity Assessment with VLMs},
+      author={Efthymios Tsaprazlis and Tiantian Feng and Anil Ramakrishna and Sai Praneeth Karimireddy and Rahul Gupta and Shrikanth Narayanan},
+      year={2026},
+      eprint={2603.21573},
+      archivePrefix={arXiv},
+      primaryClass={cs.CV},
+      url={https://arxiv.org/abs/2603.21573},
+}
+```
+## Responsible Use
+The model is released under the OpenRAIL license. Users should respect the privacy and consent of data subjects and adhere to the relevant laws and regulations in their jurisdictions when using the model.
+❌ **Out-of-Scope Use**
+- Clinical or diagnostic applications
+- Surveillance
+- Privacy-invasive applications
+- Commercial use