ankekat1000 committed on
Commit 281ae55 · 1 Parent(s): 1024a95

Update README.md

Files changed (1)
  1. README.md +3 -2
README.md CHANGED
@@ -25,17 +25,18 @@ print(pipeline('du bist blöd.'))
25
  ```
26
 
27
 
28
- ## Training data
29
 
30
  The pre-trained model [bert-base-german-cased model by deepset](https://huggingface.co/bert-base-german-cased) was fine-tuned on a crowd-annotated data set of over 14,000 user comments that has been labeled for toxicity in a binary classification task.
31
 
32
  As toxic, we defined comments that are inappropriate in whole or in part. By inappropriate, we mean comments that are rude, insulting, hateful, or otherwise make users feel disrespected.
33
 
34
- ## Training procedure
35
 
36
  **Language model:** bert-base-german-cased (~ 12GB)
37
  **Language:** German
 
38
  **Training data:** User comments posted to websites and Facebook pages of German news media, user comments posted to online participation platforms (~ 14,000)
 
39
  **Batch size:** 32
40
  **Epochs:** 4
41
  **Max. tokens length:** 512
 
25
  ```
26
 
27
 
28
+ ## Training
29
 
30
  The pre-trained model [bert-base-german-cased model by deepset](https://huggingface.co/bert-base-german-cased) was fine-tuned on a crowd-annotated data set of over 14,000 user comments that has been labeled for toxicity in a binary classification task.
31
 
32
  As toxic, we defined comments that are inappropriate in whole or in part. By inappropriate, we mean comments that are rude, insulting, hateful, or otherwise make users feel disrespected.
33
 
 
34
 
35
  **Language model:** bert-base-german-cased (~ 12GB)
36
  **Language:** German
37
+ **Labels:** Toxicity (binary classification)
38
  **Training data:** User comments posted to websites and Facebook pages of German news media, user comments posted to online participation platforms (~ 14,000)
39
+ **Labeling procedure:** Crowd annotation
40
  **Batch size:** 32
41
  **Epochs:** 4
42
  **Max. tokens length:** 512
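
Review note: the hyperparameters listed in the new README text can be collected into a small configuration object to get a rough sense of the training budget. This is only a sketch, not the author's training script. The names `FineTuneConfig` and `optimizer_steps` are hypothetical, and it assumes that all of the roughly 14,000 comments were used for training (the README does not state a train/validation split) and that one optimizer step is taken per batch, with no gradient accumulation.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class FineTuneConfig:
    """Hyperparameters as quoted in the README (hypothetical container)."""
    base_model: str = "bert-base-german-cased"
    num_labels: int = 2           # binary task: toxic vs. not toxic
    batch_size: int = 32
    epochs: int = 4
    max_token_length: int = 512


def optimizer_steps(num_examples: int, cfg: FineTuneConfig) -> int:
    """Total optimizer steps, assuming one step per batch and a final partial batch."""
    steps_per_epoch = math.ceil(num_examples / cfg.batch_size)
    return steps_per_epoch * cfg.epochs


cfg = FineTuneConfig()

# ASSUMPTION: the full ~14,000 comments form the training set.
approx_train_size = 14_000
total_steps = optimizer_steps(approx_train_size, cfg)
print(f"{total_steps} optimizer steps over {cfg.epochs} epochs")
```

Under these assumptions, each epoch takes 438 batches, so the full run is on the order of 1,750 optimizer steps. A held-out validation split would lower that figure proportionally.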