OpenSafetyLab/MD-Judge-v0.1

Foreshhh committed on Feb 7, 2024

Commit 5abb0c1 · verified · 1 Parent(s): b91f0a1

Update README.md

Files changed (1)

README.md +46 -0

@@ -1,3 +1,49 @@
 ---
 license: apache-2.0
+datasets:
+- lmsys/toxic-chat
+- PKU-Alignment/BeaverTails
+- lmsys/lmsys-chat-1m
+language:
+- en
+metrics:
+- f1
+- accuracy
+tags:
+- ai-safety
+- safetyguard
+- safety
+- benchmark
+- mistral
+- salad-bench
+- evaluation
 ---
+# MD-Judge for Salad-Bench
+## Model Details
+MD-Judge is an LLM-based safety guard, fine-tuned on top of [Mistral-7B](https://huggingface.co/mistralai/Mistral-7B-v0.1). It serves as a classifier that evaluates the safety of question-answer (QA) pairs.
+MD-Judge was developed as a general evaluation tool for studying the safety of different LLMs, and was proposed in the [SALAD-Bench paper]().
+- **Developed by:** The SALAD-Bench Team
+- **Model type:** An auto-regressive language model based on the transformer architecture.
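
The card's metadata lists F1 and accuracy as metrics for this QA-pair safety classifier. As a rough illustration of how such verdicts could be scored, here is a minimal sketch. It assumes each QA pair receives a `"safe"` or `"unsafe"` label and treats `"unsafe"` as the positive class. The function name `binary_metrics`, the label strings, and the toy data are all hypothetical and are not taken from the SALAD-Bench code or results.

```python
from typing import Dict, Sequence


def binary_metrics(
    gold: Sequence[str], pred: Sequence[str], positive: str = "unsafe"
) -> Dict[str, float]:
    """Compute accuracy, precision, recall, and F1 for two label sequences.

    `positive` names the class that precision/recall/F1 are measured on
    (an assumption here: flagging unsafe QA pairs is the positive outcome).
    """
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")

    pairs = list(zip(gold, pred))
    tp = sum(1 for g, p in pairs if g == positive and p == positive)
    fp = sum(1 for g, p in pairs if g != positive and p == positive)
    fn = sum(1 for g, p in pairs if g == positive and p != positive)
    correct = sum(1 for g, p in pairs if g == p)

    # Guard every division so empty or one-sided inputs return 0.0.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    accuracy = correct / len(pairs) if pairs else 0.0

    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Toy labels for illustration only; these are not benchmark results.
gold = ["unsafe", "safe", "unsafe", "safe", "unsafe"]
pred = ["unsafe", "safe", "safe", "safe", "unsafe"]
scores = binary_metrics(gold, pred)
print(scores)
```

Measuring F1 on the unsafe class is one reasonable choice when missed unsafe answers matter more than false alarms. A real evaluation may report other averages or per-class scores.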
+## Model Sources
+- **Repository:** [SALAD-Bench GitHub](https://github.com/OpenSafetyLab/SALAD-BENCH)
+- **Dataset:** Coming soon
+- **Paper:** Coming soon
+## Uses
+Please refer to our [GitHub](https://github.com/OpenSafetyLab/SALAD-BENCH) for more usage examples.
+```python
+```
+## Citation
+**BibTeX:**