---
base_model:
- lmms-lab/llava-onevision-qwen2-7b-ov
---

# 🛡️ LLaVAShield: Safeguarding Multimodal Multi-Turn Dialogues in Vision-Language Models

[📖 Paper](https://arxiv.org/abs/2509.25896) · [🤗 MMDS Dataset](https://huggingface.co/datasets/leost233/MMDS) · [💻 Code](https://github.com/leost123456/LLaVAShield)

## 📢 News

## 💎 About LLaVAShield

As Vision-Language Models (VLMs) move into interactive, multi-turn use, safety concerns intensify for multimodal multi-turn dialogues. Such dialogues are characterized by concealed malicious intent, contextual risk accumulation, and cross-modal joint risks, and they demand flexible policy adaptation.

To address these challenges, we propose **LLaVAShield**, a content moderation model designed specifically for multimodal multi-turn dialogues. It jointly leverages dialogue context and cross-modal signals to assess the safety of both user inputs and assistant responses under specified policy dimensions. LLaVAShield is initialized from [LLaVA-OV-7B](https://huggingface.co/lmms-lab/llava-onevision-qwen2-7b-ov) and fine-tuned on the [MMDS](https://huggingface.co/datasets/leost233/MMDS) training set. The model supports a context length of **16K** tokens.

* 💻 **Code Repository:** [https://github.com/leost123456/LLaVAShield](https://github.com/leost123456/LLaVAShield)
* 📖 **Paper:** [https://arxiv.org/abs/2509.25896](https://arxiv.org/abs/2509.25896)
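
The 16K context window bounds how much dialogue history fits into a single moderation call, so very long conversations may need trimming first. Below is a minimal sketch, assuming a simple list-of-dicts history format and using a character budget as a crude proxy for a real token count; `trim_history` is illustrative and not part of the released API:

```python
import json

def trim_history(turns, max_chars=48_000):
    """Drop the oldest turns until the serialized history fits a rough
    character budget (a crude stand-in for counting tokens against the
    16K-token context window)."""
    kept = list(turns)
    while len(kept) > 1 and len(json.dumps(kept)) > max_chars:
        kept.pop(0)  # discard the oldest turn first
    return kept
```

Dropping the oldest turns first keeps the most recent context, which matters most when assessing the safety of the latest turn.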

---

## 🚀 Quick Start (Usage)

```python
# pip install: see the code repository for installation instructions
from llavashield.llavashield_utils import load_shield

# ... (model loading and input construction omitted; see the code repository
# for the complete example)

response = processor.tokenizer.batch_decode(outputs, skip_special_tokens=True)[0]
print(response)
```

## 📝 Default Prompt Template

Below is the complete default prompt, containing all policy dimensions, used by LLaVAShield for comprehensive safety auditing:

~~~python
prompt = """
You are a **Senior Multimodal Conversation Safety Auditor**, specializing in evaluating the safety of multi-turn, image-and-text interactions between a User and a Vision-Language Model (Assistant).

The conversation history is in JSON format and is enclosed in square brackets [].

```
"""
~~~
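
The template above expects the conversation history as a JSON array enclosed in square brackets. A minimal sketch of assembling the final auditor input follows; the `history` structure and the plain string concatenation are illustrative assumptions, not the released API:

```python
import json

# Stand-in for the full auditor prompt defined above; use the complete
# template in practice.
prompt = "..."

# Hypothetical multi-turn dialogue; the roles and fields are assumptions.
history = [
    {"role": "user", "content": "<image> What does this picture show?"},
    {"role": "assistant", "content": "A photo of a crowded public square."},
]

# Serialize the dialogue as a JSON array and append it after the prompt.
audit_input = prompt + "\n" + json.dumps(history, ensure_ascii=False)
```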

## 📖 Citation

If you find our work useful for your research and applications, please kindly cite our work:

```bibtex
@misc{huang2025llavashield,
      title={LLaVAShield: Safeguarding Multimodal Multi-Turn Dialogues in Vision-Language Models},
      author={Guolei Huang and Qinzhi Peng and Gan Xu and Yuxuan Lu and Yongjun Shen},
      year={2025},
      eprint={2509.25896},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}
```
|