Commit 4fa6ac6
1 Parent(s): 376faae
Update README.md
README.md
CHANGED
@@ -2,17 +2,22 @@
 language: fr # <-- my language
 widget:
 - text: "J'aime ta coiffure"
+  example_title: "NOT TOXIC 1"
 - text: "Va te faire foutre"
+  example_title: "TOXIC 1"
 - text: "Quel mauvais temps, n'est-ce pas ?"
+  example_title: "NOT TOXIC 2"
 - text: "J'espère que tu vas mourir, connard !"
+  example_title: "TOXIC 2"
 - text: "j'aime beaucoup ta veste"
+  example_title: "NOT TOXIC 3"
 
 license: other
 ---
 This model was trained for toxicity labeling. Label_1 means TOXIC, Label_0 means NOT TOXIC
 
-The model was fine-tuned based off the CamemBERT language model
+The model was fine-tuned based off [the CamemBERT language model](https://huggingface.co/camembert-base).
 
 The accuracy is 93% on the test split during training and 79% on a manually picked (and thus harder) sample of 200 sentences (100 label 1, 100 label 0) at the end of the training.
 
-The model was finetuned on 32k sentences. The train data was the translations of the
+The model was finetuned on 32k sentences. The train data was the translations of the English data (around 30k sentences) from [the multilingual_detox dataset](https://github.com/s-nlp/multilingual_detox) by [Skolkovo Institute](https://huggingface.co/SkolkovoInstitute) using [the opus-mt-en-fr translation model](https://huggingface.co/Helsinki-NLP/opus-mt-en-fr) by [Helsinki-NLP](https://huggingface.co/Helsinki-NLP) and the data from [the jigsaw dataset](https://www.kaggle.com/competitions/jigsaw-multilingual-toxic-comment-classification/data) on kaggle.
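The README states that Label_1 means TOXIC and Label_0 means NOT TOXIC. A minimal sketch of applying that mapping to classifier output might look like the following. It assumes predictions arrive as dictionaries with `label` and `score` keys and that raw labels are spelled `LABEL_0` / `LABEL_1`; the helper name `readable_labels` and the sample scores are hypothetical and are not part of this commit.

```python
from typing import Dict, List

# Mapping stated in the README: Label_1 is TOXIC, Label_0 is NOT TOXIC.
# The upper-case keys are an assumption about how raw labels are spelled;
# adjust them if the model reports different label strings.
LABEL_NAMES: Dict[str, str] = {"LABEL_0": "NOT TOXIC", "LABEL_1": "TOXIC"}


def readable_labels(predictions: List[dict]) -> List[dict]:
    """Replace raw label ids with the README's human-readable names."""
    result = []
    for pred in predictions:
        raw = str(pred["label"]).upper()  # tolerate "Label_1" or "label_1"
        if raw not in LABEL_NAMES:
            raise ValueError(f"unexpected label: {pred['label']!r}")
        result.append({"label": LABEL_NAMES[raw], "score": pred["score"]})
    return result


# Hypothetical predictions; the scores are made up for illustration
# and are not real model output.
example = [
    {"label": "LABEL_1", "score": 0.97},
    {"label": "LABEL_0", "score": 0.88},
]
print(readable_labels(example))
```

Raising on an unknown label, rather than passing it through, is a deliberate choice in this sketch: it surfaces a mismatch between the assumed label spelling and what the model actually returns.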