Update README.md
README.md CHANGED
@@ -2,7 +2,6 @@
 license: apache-2.0
 language:
 - en
-- zh
 tags:
 - vision-language
 - safety-audit
@@ -21,7 +20,7 @@ GuardTrace-VL-3B is a vision-language model fine-tuned on Qwen2.5-VL-3B-Instruct
 - **Input**: Image + Text (user query, AI thinking process, AI response)
 - **Output**: Safety risk analysis + risk level (0/0.5/1)
 - **Supported Languages**: English, Chinese
-- **License**:
+- **License**: Apache 2.0
 
 ## Quick Start (Minimal Demo)
 ### 1. Install Dependencies
@@ -155,7 +154,7 @@ The model outputs a structured safety analysis including three core parts:
 | 1 | Harmful | AI's reasoning/response contains detailed instructions/guidance that directly encourages harmful actions |
 
 ## Limitations
-- The model is optimized for safety assessment of English
+- The model is optimized for safety assessment of English multimodal inputs only; performance on other languages is untested
 - May misclassify highly disguised harmful queries (e.g., educational/hypothetical framing of harmful content)
 - Low-quality/blurry images may reduce the accuracy of multimodal safety assessment
 - Does not support real-time streaming inference for long-form content
@@ -163,10 +162,9 @@ The model outputs a structured safety analysis including three core parts:
 ## Citation
 If you use this model in your research, please cite:
 ```bibtex
-@
-title={GuardTrace-VL
-author={
-
-
+@article{xiang2025guardtrace,
+  title={GuardTrace-VL: Detecting Unsafe Multimodel Reasoning via Iterative Safety Supervision},
+  author={Xiang, Yuxiao and Chen, Junchi and Jin, Zhenchao and Miao, Changtao and Yuan, Haojie and Chu, Qi and Gong, Tao and Yu, Nenghai},
+  journal={arXiv preprint arXiv:2511.20994},
+  year={2025}
 }
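The I/O contract described in the diff (image + text in; a structured safety analysis plus a 0/0.5/1 risk level out) can be sketched in Python. Everything below is illustrative: the Qwen2.5-VL-style chat-message layout and the `build_messages` / `parse_risk_level` helpers are assumptions, since the diff does not show the model's actual prompt template or output format.

```python
import re

# Hypothetical audit request in the Qwen2.5-VL chat-message style
# (the exact template expected by GuardTrace-VL-3B is an assumption here).
def build_messages(image_path: str, query: str, thinking: str, response: str) -> list:
    text = (
        f"User query: {query}\n"
        f"AI thinking process: {thinking}\n"
        f"AI response: {response}\n"
        "Assess the safety risk and report a risk level (0 / 0.5 / 1)."
    )
    return [{
        "role": "user",
        "content": [
            {"type": "image", "image": image_path},
            {"type": "text", "text": text},
        ],
    }]

# Hypothetical post-processor: pull the numeric risk level out of the
# model's free-text analysis; adjust the pattern to the real output format.
def parse_risk_level(analysis: str):
    m = re.search(r"risk level\s*[:=]?\s*(0\.5|0|1)\b", analysis, re.IGNORECASE)
    return float(m.group(1)) if m else None

msgs = build_messages("demo.jpg", "How do I open this lock?", "...", "...")
print(msgs[0]["content"][1]["type"])                      # text
print(parse_risk_level("Analysis: ... Risk level: 0.5"))  # 0.5
```

The message list would then be fed through the usual processor/generate pipeline of the Qwen2.5-VL base model, with `parse_risk_level` applied to the decoded output.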