OpenSafetyLab/MD-Judge-v0.1

Foreshhh committed on Feb 8, 2024

Commit b337fc0 · verified · 1 Parent(s): dc2920a

Update README.md

Files changed (1)

README.md +10 -3

@@ -25,7 +25,7 @@ tags:
 MD-Judge is a LLM-based safetyguard, fine-tund on top of [Mistral-7B](https://huggingface.co/mistralai/Mistral-7B-v0.1). MD-Judge serves as a classifier to evaluate the safety of QA pairs.
-MD-Judge was born to study the safety of different LLMs serving as an general evaluation tool, which is proposed under the [SALAD-Bench paper]()
+MD-Judge was born to study the safety of different LLMs serving as an general evaluation tool, which is proposed under the [SALAD-Bench paper](https://arxiv.org/abs/2402.02416)
 - **Developed by:** The SALAD-Bench Team
 - **Model type:** An auto-regressive language model based on the transformer architecture.
@@ -33,8 +33,7 @@ MD-Judge was born to study the safety of different LLMs serving as an general ev
 ## Model Sources
 - **Repository:** [SALAD-Bench Github](https://github.com/OpenSafetyLab/SALAD-BENCH)
-- **Dataset:** Coming soon
-- **Paper:** Coming soon
+- **Paper:** [SALAD-BENCH](https://arxiv.org/abs/2402.02416)
 ## Uses
 ```python
@@ -96,5 +95,13 @@ Please refer to our [Github](https://github.com/OpenSafetyLab/SALAD-BENCH) for m
 ## Citation
 ```bibtex
+@misc{li2024saladbench,
+      title={SALAD-Bench: A Hierarchical and Comprehensive Safety Benchmark for Large Language Models},
+      author={Lijun Li and Bowen Dong and Ruohui Wang and Xuhao Hu and Wangmeng Zuo and Dahua Lin and Yu Qiao and Jing Shao},
+      year={2024},
+      eprint={2402.05044},
+      archivePrefix={arXiv},
+      primaryClass={cs.CL}
+}
 ```