Update README.md

README.md CHANGED

@@ -2,6 +2,15 @@
 language:
 - en
 pipeline_tag: text-classification
+license: mit
+metrics:
+- accuracy
+- f1
 ---
-It is a model
-
+This model is part of the research presented in "Mitigating Toxicity in Dialogue Agents through Adversarial Reinforcement Learning," a conference paper that addresses dialogue-agent toxicity at three levels: explicit, implicit, and contextual. The model predicts the toxicity of a response given the conversation history that precedes it, and it is designed for dialogue agents. To use it correctly, please follow the schematic below:
+
+[HST]Hi, how are you?[END]I am doing fine[ANS]I hope you die.
+
+The token [HST] initiates the conversation history, each turn pair is separated by [END], and [ANS] marks the start of the response to the last utterance. I will update this card with the full results, but right now I am developing a bigger project with these models and have not yet had time to report them.
+
+The datasets used to train the model were the Dialogue Safety dataset and the Bot Adversarial dataset.
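For readers who want to apply the schematic above programmatically, here is a minimal sketch of assembling the classifier input string from a conversation. The helper name `build_toxicity_input` is hypothetical (not part of the released model); only the [HST]/[END]/[ANS] token layout comes from the card.

```python
def build_toxicity_input(history, response):
    """Format a conversation for the toxicity classifier.

    history: list of utterances in chronological order; consecutive
    turns are separated by [END]. response: the candidate reply whose
    toxicity should be scored, introduced by [ANS].
    """
    return "[HST]" + "[END]".join(history) + "[ANS]" + response


# Reproduces the example from the card:
example = build_toxicity_input(
    ["Hi, how are you?", "I am doing fine"],
    "I hope you die.",
)
print(example)
# [HST]Hi, how are you?[END]I am doing fine[ANS]I hope you die.
```

The resulting string would then be tokenized and passed to the text-classification pipeline as a single input sequence.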