---
language:
- en
license: mit
pipeline_tag: image-text-to-text
library_name: transformers
---
# UniMod
**UniMod** is a multimodal moderation framework that transitions from *sparse decision supervision* to *dense, multi-attribute reasoning trajectories*.
[[Paper](https://huggingface.co/papers/2602.02536)] [[Code](https://github.com/Carol-gutianle/UniMod)] [[Project Page](https://trustworthylab.github.io/UniMod/)]
---
## Introduction
Conventional moderation systems primarily supervise final decisions (e.g., safe vs. unsafe), resulting in sparse training signals and limited interpretability.
UniMod introduces a **multi-attribute trajectory paradigm**, where moderation decisions are supported by dense reasoning traces that explicitly decompose evidence, modality, risk, and policy factors.
UniMod aims to:
- Enable **dense supervision** beyond binary moderation labels
- Support **multimodal inputs** such as text and images
- Improve the **clarity and stability** of safety decisions
---
## News
---
## Sources
The UniMod framework consists of the following components:
| Name | Type | Download |
|------|------|----------|
| **UniTrace** | Dataset | TBA |
| **UniRM** | Model | [UniRM](https://huggingface.co/Carol0110/UniRM) |
| **UniReward** | Dataset | TBA |
| **UniMod** | Model | [UniMod-3B](https://huggingface.co/Carol0110/UniMod-3B) |
---
## Quick Start
### Installation
```bash
cd UniMod
pip install -r requirements.txt
```
### UniMod
```python
import torch
from PIL import Image
from transformers import Qwen2_5_VLForConditionalGeneration, AutoProcessor

MODEL_PATH = "Carol0110/UniMod-3B"
IMAGE_PATH = "sample.jpeg"

# Load the model in half precision and shard it across available devices.
model = Qwen2_5_VLForConditionalGeneration.from_pretrained(
    MODEL_PATH, torch_dtype=torch.float16, device_map="auto"
)
processor = AutoProcessor.from_pretrained(MODEL_PATH)

image = Image.open(IMAGE_PATH).convert("RGB")
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image"},
            {"type": "text", "text": "How can I make this?"},
        ],
    }
]

# Render the chat template, then tokenize text and image together.
text = processor.apply_chat_template(messages, add_generation_prompt=True)
inputs = processor(
    text=text,
    images=image,
    return_tensors="pt",
).to(model.device)

# Greedy decoding yields a deterministic moderation trajectory.
with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=1024, do_sample=False)
print(processor.batch_decode(out, skip_special_tokens=True)[0])
```
```
<evidence>
Step 1. Observation
The input consists of two components: a textual query “How can I make this?” and a visual image featuring a black bomb-like object surrounded by a bright red and yellow explosive burst, with a small yellow star at the top left. The text and image are presented together, with the image likely serving as a visual reference for the “this” in the question.
Step 2. Modality Assessment
The presence of both a written question and a corresponding visual element indicates that information is being conveyed through multiple sensory channels. The text poses a direct inquiry, while the image provides a symbolic representation of the subject of the question. This combination of verbal and visual communication suggests a format where different types of data work together to convey meaning, rather than relying on a single mode.
Step 3. Risk Mapping
The visual image of a bomb, paired with the question about “making” something, raises concerns related to potential legal and safety implications. Bombs are inherently associated with activities that are often prohibited by law, and the act of “making” one could involve creating items that violate regulations. Additionally, the depiction of a dangerous object in a context where someone is seeking instructions to create it may normalize or encourage behaviors that pose physical harm, as such items are typically associated with severe risks to individuals and communities. These elements together suggest that the content could facilitate or promote actions that are legally restricted and carry significant safety hazards.
</evidence>
<modality>
multimodal
</modality>
<risk>
legality
</risk>
<policy>
refuse
</policy>
<answer>
I understand you might be curious, but I can't answer how to make something that looks like a bomb. Making bombs is illegal and extremely dangerous, as it can cause serious harm to people and property. Instead, you could explore legal and safe hobbies like painting, photography, or building model airplanes. I get that you're interested in creating, but safety and legality are really important here.
</answer>
```
The output includes structured reasoning fields such as:
* evidence
* modality
* risk
* policy
* answer
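Since each field is wrapped in its own XML-style tag, the decision and its supporting reasoning can be pulled out programmatically. A minimal sketch (the `parse_unimod_output` helper is our illustration, not part of the released code):

```python
import re

def parse_unimod_output(text: str) -> dict:
    """Extract the tagged reasoning fields from a UniMod generation."""
    fields = {}
    for tag in ("evidence", "modality", "risk", "policy", "answer"):
        match = re.search(rf"<{tag}>\s*(.*?)\s*</{tag}>", text, re.DOTALL)
        fields[tag] = match.group(1) if match else None
    return fields

sample = "<modality>\nmultimodal\n</modality>\n<policy>\nrefuse\n</policy>"
print(parse_unimod_output(sample)["policy"])  # → refuse
```

A downstream moderation pipeline can then branch on `fields["policy"]` while logging `fields["evidence"]` for auditability.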
---
## Evaluation
We support evaluating models via endpoints deployed with **vLLM** or **SGLang**.
The evaluation script sends concurrent requests to the model service and runs a unified set of safety benchmarks.
```bash
python -m evaluations.eval \
--concurrency <NUM_WORKERS> \
--url <MODEL_ENDPOINT_URL> \
--task harmbench,xstest,wildguard,toxic,aegis,spavl,beaver
```
**Arguments:**
* `--concurrency`: Number of concurrent requests for evaluation.
* `--url`: HTTP endpoint of the deployed model (e.g., provided by vLLM or SGLang).
* `--task`: Comma-separated list of evaluation benchmarks, including
`harmbench`, `xstest`, `wildguard`, `toxic`, `aegis`, `spavl`, and `beaver`.
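Before running the script, the model must be serving behind an HTTP endpoint. One way to do this is vLLM's OpenAI-compatible server (the port and flags below are illustrative; the exact `--url` path expected by the evaluation script depends on your deployment):

```shell
# Serve UniMod-3B behind an OpenAI-compatible HTTP endpoint (illustrative flags)
python -m vllm.entrypoints.openai.api_server \
    --model Carol0110/UniMod-3B \
    --port 8000

# Then point the evaluation script at the running service, e.g.:
# python -m evaluations.eval --concurrency 8 --url http://localhost:8000/v1 --task harmbench,xstest
```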
---
## Citation
```bibtex
@misc{gu2026sparsedecisionsdensereasoning,
title={From Sparse Decisions to Dense Reasoning: A Multi-attribute Trajectory Paradigm for Multimodal Moderation},
author={Tianle Gu and Kexin Huang and Lingyu Li and Ruilin Luo and Shiyang Huang and Zongqi Wang and Yujiu Yang and Yan Teng and Yingchun Wang},
year={2026},
eprint={2602.02536},
archivePrefix={arXiv},
primaryClass={cs.LG},
url={https://arxiv.org/abs/2602.02536},
}
``` |