---
base_model:
- lmms-lab/llava-onevision-qwen2-7b-ov
---

# 🛡️ LLaVAShield: Safeguarding Multimodal Multi-Turn Dialogues in Vision-Language Models

[📖 Paper](https://arxiv.org/abs/2509.25896) · [🤗 MMDS Dataset](https://huggingface.co/datasets/leost233/MMDS) · [💻 Code](https://github.com/leost123456/LLaVAShield)

## 📢 News

## 💎 About LLaVAShield

As Vision-Language Models (VLMs) move into interactive, multi-turn use, safety concerns intensify for multimodal multi-turn dialogues. Such dialogues are characterized by concealed malicious intent, contextual risk accumulation, and cross-modal joint risks, and they demand flexible policy adaptation.

To address these challenges, we propose **LLaVAShield**, a content moderation model designed specifically for multimodal multi-turn dialogues. It jointly leverages dialogue context and cross-modal signals to assess the safety of both user inputs and assistant responses under specified policy dimensions. LLaVAShield is initialized from [LLaVA-OV-7B](https://huggingface.co/lmms-lab/llava-onevision-qwen2-7b-ov) and fine-tuned on the [MMDS](https://huggingface.co/datasets/leost233/MMDS) training set. The model supports a context length of **16K** tokens.

* 💻 **Code Repository:** [https://github.com/leost123456/LLaVAShield](https://github.com/leost123456/LLaVAShield)
* 📖 **Paper:** [https://arxiv.org/abs/2509.25896](https://arxiv.org/abs/2509.25896)
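
The 16K context window bounds how much dialogue history fits into a single moderation call, so very long conversations may need trimming first. Below is a minimal sketch, assuming a simple list-of-dicts history format and using a character budget as a crude proxy for a real token count; `trim_history` is illustrative and not part of the released API:

```python
import json

def trim_history(turns, max_chars=48_000):
    """Drop the oldest turns until the serialized history fits a rough
    character budget (a crude stand-in for counting tokens against the
    16K-token context window)."""
    kept = list(turns)
    while len(kept) > 1 and len(json.dumps(kept)) > max_chars:
        kept.pop(0)  # discard the oldest turn first
    return kept
```

Dropping the oldest turns first keeps the most recent context, which matters most when assessing the safety of the latest turn.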

---

## 🚀 Quick Start (Usage)

```python
# pip install: see the code repository for installation instructions
from llavashield.llavashield_utils import load_shield

# ... (model loading and input construction omitted; see the code repository
# for the complete example)

response = processor.tokenizer.batch_decode(outputs, skip_special_tokens=True)[0]
print(response)
```

## 📝 Default Prompt Template

Below is the complete default prompt, containing all policy dimensions, used by LLaVAShield for comprehensive safety auditing:

~~~python
prompt = """
You are a **Senior Multimodal Conversation Safety Auditor**, specializing in evaluating the safety of multi-turn, image-and-text interactions between a User and a Vision-Language Model (Assistant).

The conversation history is in JSON format and is enclosed in square brackets [].

```
"""
~~~
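
The template above expects the conversation history as a JSON array enclosed in square brackets. A minimal sketch of assembling the final auditor input follows; the `history` structure and the plain string concatenation are illustrative assumptions, not the released API:

```python
import json

# Stand-in for the full auditor prompt defined above; use the complete
# template in practice.
prompt = "..."

# Hypothetical multi-turn dialogue; the roles and fields are assumptions.
history = [
    {"role": "user", "content": "<image> What does this picture show?"},
    {"role": "assistant", "content": "A photo of a crowded public square."},
]

# Serialize the dialogue as a JSON array and append it after the prompt.
audit_input = prompt + "\n" + json.dumps(history, ensure_ascii=False)
```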

## 📖 Citation

If you find our work useful for your research and applications, please kindly cite our work:

```bibtex
@misc{huang2025llavashield,
      title={LLaVAShield: Safeguarding Multimodal Multi-Turn Dialogues in Vision-Language Models},
      author={Guolei Huang and Qinzhi Peng and Gan Xu and Yuxuan Lu and Yongjun Shen},
      year={2025},
      eprint={2509.25896},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}
```
|