declare-lab/starling-7B

Declare committed on Aug 21, 2023

Commit 837b74b · 1 Parent(s): 833aeeb

Update README.md

Files changed (1)

README.md +14 -1

@@ -40,4 +40,17 @@ We also release our **HarmfulQA** dataset with 1,960 harmful questions (converti
 <img src="https://declare-lab.net/assets/images/logos/data_gen.png" alt="Image" width="1000" height="1000">
-_Note: This model is referred to as Starling (Blue) in the paper. We shall soon release Starling (Blue-Red) which was trained on harmful data using an objective function that helps the model learn from the red (harmful) response data._
+_Note: This model is referred to as Starling (Blue) in the paper. We shall soon release Starling (Blue-Red) which was trained on harmful data using an objective function that helps the model learn from the red (harmful) response data._
+## Citation
+```bibtex
+@misc{bhardwaj2023redteaming,
+      title={Red-Teaming Large Language Models using Chain of Utterances for Safety-Alignment},
+      author={Rishabh Bhardwaj and Soujanya Poria},
+      year={2023},
+      eprint={2308.09662},
+      archivePrefix={arXiv},
+      primaryClass={cs.CL}
+}
+```