leost233 committed on
Commit 7ba5816 · verified · 1 Parent(s): bd15114

Update README.md

Files changed (1):
  1. README.md +27 -6
README.md CHANGED
@@ -6,7 +6,9 @@ base_model:
  - lmms-lab/llava-onevision-qwen2-7b-ov
  ---

- # LLaVAShield: Safeguarding Multimodal Multi-Turn Dialogues in Vision-Language Models

  ## 📢 News

@@ -14,13 +16,16 @@ base_model:

  ## 💎 About LLaVAShield

- As Vision-Language Models (VLMs) move into interactive, multi-turn use, safety concerns intensify for multimodal multi-turn dialogue. These dialogues are characterized by concealment of malicious intent, contextual risk accumulation, and cross-modal joint risk, while supporting flexible policy adaptation.

- To address these limitations, we propose LLaVAShield, a content moderation model for multimodal multi-turn dialogues. It jointly leverages dialogue context with cross-modal signals to assess the safety of both user inputs and assistant responses under specified policy dimensions. LLaVAShield is initialized from [LLaVA-OV-7B](https://huggingface.co/lmms-lab/llava-onevision-qwen2-7b-ov) and fine-tuned on the [MMDS](https://huggingface.co/datasets/leost233/MMDS) training set. The context length is 16K.

- Code site: Paper site:

- ## Usage
  ```python
  #pip install
  from llavashield.llavashield_utils import load_shield
@@ -60,7 +65,9 @@ response = processor.tokenizer.batch_decode(outputs, skip_special_tokens=True)[0
  print(response)
  ```

- The full default prompt is shown below:

  ~~~python
  prompt = """
  You are a **Senior Multimodal Conversation Safety Auditor**, specializing in evaluating the safety of multi-turn, image-and-text interactions between a User and a Vision-Language Model (Assistant).
@@ -209,3 +216,17 @@ The conversation history is in JSON format and is enclosed in square brackets []
  ```
  """
  ~~~
  - lmms-lab/llava-onevision-qwen2-7b-ov
  ---

+ # 🛡️ LLaVAShield: Safeguarding Multimodal Multi-Turn Dialogues in Vision-Language Models
+
+ [![Paper](https://img.shields.io/badge/Paper-arXiv-red)](https://arxiv.org/abs/2509.25896) [![Dataset](https://img.shields.io/badge/Dataset-MMDS-blue)](https://huggingface.co/datasets/leost233/MMDS) [![Code](https://img.shields.io/badge/Code-GitHub-black)](https://github.com/leost123456/LLaVAShield)

  ## 📢 News


  ## 💎 About LLaVAShield

+ As Vision-Language Models (VLMs) move into interactive, multi-turn use, safety concerns intensify for multimodal multi-turn dialogues. These dialogues are characterized by the concealment of malicious intent, contextual risk accumulation, and cross-modal joint risks, while requiring flexible policy adaptation.
+
+ To address these limitations, we propose **LLaVAShield**, a content moderation model specifically designed for multimodal multi-turn dialogues. It jointly leverages dialogue context with cross-modal signals to assess the safety of both user inputs and assistant responses under specified policy dimensions. LLaVAShield is initialized from [LLaVA-OV-7B](https://huggingface.co/lmms-lab/llava-onevision-qwen2-7b-ov) and fine-tuned on the [MMDS](https://huggingface.co/datasets/leost233/MMDS) training set. The model supports a context length of **16K**.

+ * 💻 **Code Repository:** [https://github.com/leost123456/LLaVAShield](https://github.com/leost123456/LLaVAShield)
+ * 📖 **Paper:** [https://arxiv.org/abs/2509.25896](https://arxiv.org/abs/2509.25896)

+ ---

+ ## 🚀 Quick Start (Usage)
  ```python
  #pip install
  from llavashield.llavashield_utils import load_shield

  print(response)
  ```

+ ## 📝 Default Prompt Template
+ Below is the complete default prompt containing all policy dimensions used by LLaVAShield for comprehensive safety auditing:
+
  ~~~python
  prompt = """
  You are a **Senior Multimodal Conversation Safety Auditor**, specializing in evaluating the safety of multi-turn, image-and-text interactions between a User and a Vision-Language Model (Assistant).

  ```
  """
  ~~~
+
+ # 📖 Citation
+ If you find our work useful for your research and applications, please kindly cite our work:
+
+ ```bibtex
+ @misc{huang2025llavashield,
+   title={LLaVAShield: Safeguarding Multimodal Multi-Turn Dialogues in Vision-Language Models},
+   author={Guolei Huang and Qinzhi Peng and Gan Xu and Yuxuan Lu and Yongjun Shen},
+   year={2025},
+   eprint={2509.25896},
+   archivePrefix={arXiv},
+   primaryClass={cs.CV}
+ }
+ ```
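The auditor prompt added in this README states that the conversation history is supplied as JSON enclosed in square brackets. As a minimal sketch of constructing such a payload — note the `role`/`content` field names are hypothetical illustrations, not taken from this diff:

```python
import json

# Hypothetical conversation-history payload: a JSON array of dialogue turns.
# The "role"/"content" field names are assumptions for illustration only.
history = [
    {"role": "user", "content": "What is shown in this image?"},
    {"role": "assistant", "content": "The image shows a city street at night."},
]

# Serialize to the bracketed JSON form described in the prompt template.
history_json = json.dumps(history, ensure_ascii=False, indent=2)

# Per the prompt template, the history is enclosed in square brackets [].
assert history_json.startswith("[") and history_json.endswith("]")

print(history_json)
```

The serialized string would then be interpolated into the prompt before passing it to the model, under the assumption that the template's history slot accepts any well-formed JSON array of turns.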