inclusionAI/Sing-Guard-8b

@@ -1,10 +1,13 @@
 ---
-license: apache-2.0
-language:
-- en
 base_model:
 - Qwen/Qwen3-VL-8B-Instruct
 ---
 <p align="center">
   <img src="assets/s_icon.png" width="48" alt="SingGuard icon">
 </p>
@@ -15,6 +18,7 @@ base_model:
 <p align="center">
     <a href="https://huggingface.co/collections/inclusionAI/sing-guard">🤗 HuggingFace</a> &nbsp; | &nbsp;
     <a href="https://modelscope.cn/collections/inclusionAI/Sing-Guard">🤖 ModelScope</a> &nbsp; | &nbsp;
     <a href="https://arxiv.org/abs/2606.22873">📄 Paper</a>
 </p>
@@ -344,48 +348,25 @@ The first line is the binary judgment, and `<answer>` contains the final risk ca
 - Production systems should handle malformed outputs, such as an unparsable first line, missing `<answer>`, or a category outside the active policy.
 - For multimodal inputs, make sure image paths are accessible to the local inference environment.
-## Risk Categories
-The default full policy contains the following risk categories. When a dynamic policy is provided, the model judges only against the active `policy` instead of forcing every case into the default categories.
-### A. Sexual Content Risk
-- Content involving explicit sexual material, exploitation, or coercive sexual acts.
-### B. Real-World Crimes & Public Safety
-- Content involving violent crime, weapons, other crimes, or public-safety threats.
-### C. Unethical Behavior
-- Content involving hate, harassment, manipulation, self-harm, disturbing imagery, or harmful misinformation.
-### D. Cybersecurity & Information Manipulation
-- Content involving data leaks, hacking, surveillance abuse, platform abuse, or copyright abuse.
-### E. Agent Safety
-- Content attempting to expose system prompts, internal policies, or other model safeguards.
-### F. Politically Sensitive Content
-- Content involving political advocacy, rumors, unrest, historical distortion, or attacks on political figures.
-### G. Animal Abuse
-- Content involving cruelty to animals or the spread of animal abuse.
-### Safe
-- Content that does not match any active risk category.
 ## Citation
 ```bibtex
 @article{singguard2026,
   title={SingGuard: Policy-Adaptive Multimodal Safeguarding with Dynamic Reasoning},
-  author={Ant Group},
   year={2026}
 }
 ```

 ---
 base_model:
 - Qwen/Qwen3-VL-8B-Instruct
+language:
+- en
+license: apache-2.0
+library_name: transformers
+pipeline_tag: image-text-to-text
 ---
 <p align="center">
   <img src="assets/s_icon.png" width="48" alt="SingGuard icon">
 </p>
 <p align="center">
     <a href="https://huggingface.co/collections/inclusionAI/sing-guard">🤗 HuggingFace</a> &nbsp; | &nbsp;
     <a href="https://modelscope.cn/collections/inclusionAI/Sing-Guard">🤖 ModelScope</a> &nbsp; | &nbsp;
+    <a href="https://github.com/inclusionAI/Sing-Guard">💻 GitHub</a> &nbsp; | &nbsp;
     <a href="https://arxiv.org/abs/2606.22873">📄 Paper</a>
 </p>
 - Production systems should handle malformed outputs, such as an unparsable first line, missing `<answer>`, or a category outside the active policy.
 - For multimodal inputs, make sure image paths are accessible to the local inference environment.
+## Safety Policy
+SingGuard's default policy defines seven top-level risk categories (A through G) plus a `Safe` label. When a dynamic policy is provided, the model judges only against the active `policy` instead of forcing every case into the default categories.
+* **A. Sexual Content Risk:** Content involving explicit sexual material, exploitation, or coercive sexual acts.
+* **B. Real-World Crimes & Public Safety:** Content involving violent crime, weapons, other crimes, or public-safety threats.
+* **C. Unethical Behavior:** Content involving hate, harassment, manipulation, self-harm, disturbing imagery, or harmful misinformation.
+* **D. Cybersecurity & Information Manipulation:** Content involving data leaks, hacking, surveillance abuse, platform abuse, or copyright abuse.
+* **E. Agent Safety:** Content attempting to expose system prompts, internal policies, or other model safeguards.
+* **F. Politically Sensitive Content:** Content involving political advocacy, rumors, unrest, historical distortion, or attacks on political figures.
+* **G. Animal Abuse:** Content involving cruelty to animals or the spread of animal abuse.
+* **Safe:** Content that does not match any active risk category.
 ## Citation
 ```bibtex
 @article{singguard2026,
   title={SingGuard: Policy-Adaptive Multimodal Safeguarding with Dynamic Reasoning},
+  author={Li, Zongyi and Yin, Shenglin and Liao, Bingyan and Bai, Yichen and He, Liangbo and Xiu, Kedong and Li, Hongcheng and Lan, Jun and Cui, Shiwen and Xu, Tingting and Song, Chuanbiao and Yu, Zijian and Hong, Yan and Li, Siyuan and Xu, Chao and Zhu, Huijia and Meng, Changhua and Wang, Weiqiang},
   year={2026}
 }
 ```
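
The deployment note kept in this diff asks production systems to handle an unparsable first line, a missing `<answer>`, or a category outside the active policy. Below is a minimal sketch of such a check. It is not the project's own parser: `parse_guard_output` and `ParsedVerdict` are hypothetical names, the closing `</answer>` tag is an assumption, and the judgment labels and category strings are supplied by the caller because the exact strings the model emits are not shown here.

```python
import re
from dataclasses import dataclass
from typing import Iterable, Optional

# Assumption: the <answer> block is closed with a matching </answer> tag.
ANSWER_RE = re.compile(r"<answer>(.*?)</answer>", re.DOTALL)


@dataclass
class ParsedVerdict:
    judgment: Optional[str]  # normalized first line, or None if unrecognized
    category: Optional[str]  # text inside <answer>, or None if missing or rejected
    error: Optional[str]     # None when the output parsed cleanly


def parse_guard_output(
    text: str,
    judgment_labels: Iterable[str],
    active_categories: Iterable[str],
) -> ParsedVerdict:
    """Validate raw model output against caller-supplied labels and policy."""
    labels = {label.strip().lower() for label in judgment_labels}
    allowed = {category.strip() for category in active_categories}

    # 1. The first line must be one of the expected binary judgment labels.
    lines = text.strip().splitlines()
    first_line = lines[0].strip().lower() if lines else ""
    if first_line not in labels:
        return ParsedVerdict(None, None, "unparsable first line")

    # 2. The output must contain an <answer>...</answer> block.
    match = ANSWER_RE.search(text)
    if match is None:
        return ParsedVerdict(first_line, None, "missing <answer>")

    # 3. The category must belong to the active policy (exact match after trimming).
    category = match.group(1).strip()
    if category not in allowed:
        return ParsedVerdict(first_line, None, "category outside active policy")

    return ParsedVerdict(first_line, category, None)
```

Returning an explicit `error` string instead of raising lets the caller choose a fallback per failure mode, for example retrying generation or routing the item to manual review. When a dynamic policy is in use, pass only that policy's category strings as `active_categories`, so a default category that the active policy omits is rejected rather than silently accepted.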