duddaladeepak committed on
Commit
1f508f2
·
1 Parent(s): cea6513

Update README.md

Files changed (1)
  1. README.md +10 -329
README.md CHANGED
@@ -1,329 +1,10 @@
1
-
2
-
3
-
4
- <div align="center">
5
-
6
- **⚠️ Disclaimer:**
7
- The huggingface models currently give different results to the detoxify library (see issue [here](https://github.com/unitaryai/detoxify/issues/15)). For the most up to date models we recommend using the models from https://github.com/unitaryai/detoxify
8
-
9
- # 🙊 Detoxify
10
- ## Toxic Comment Classification with ⚡ Pytorch Lightning and 🤗 Transformers
11
-
12
- ![CI testing](https://github.com/unitaryai/detoxify/workflows/CI%20testing/badge.svg)
13
- ![Lint](https://github.com/unitaryai/detoxify/workflows/Lint/badge.svg)
14
-
15
- </div>
16
-
17
- ![Examples image](examples.png)
18
-
19
- ## Description
20
-
21
- Trained models & code to predict toxic comments on 3 Jigsaw challenges: Toxic comment classification, Unintended Bias in Toxic comments, Multilingual toxic comment classification.
22
-
23
- Built by [Laura Hanu](https://laurahanu.github.io/) at [Unitary](https://www.unitary.ai/), where we are working to stop harmful content online by interpreting visual content in context.
24
-
25
- Dependencies:
26
- - For inference:
27
- - 🤗 Transformers
28
- - ⚡ Pytorch lightning
29
- - For training will also need:
30
- - Kaggle API (to download data)
31
-
32
-
33
- | Challenge | Year | Goal | Original Data Source | Detoxify Model Name | Top Kaggle Leaderboard Score | Detoxify Score
34
- |-|-|-|-|-|-|-|
35
- | [Toxic Comment Classification Challenge](https://www.kaggle.com/c/jigsaw-toxic-comment-classification-challenge) | 2018 | build a multi-headed model that’s capable of detecting different types of toxicity like threats, obscenity, insults, and identity-based hate. | Wikipedia Comments | `original` | 0.98856 | 0.98636
36
- | [Jigsaw Unintended Bias in Toxicity Classification](https://www.kaggle.com/c/jigsaw-unintended-bias-in-toxicity-classification) | 2019 | build a model that recognizes toxicity and minimizes this type of unintended bias with respect to mentions of identities. You'll be using a dataset labeled for identity mentions and optimizing a metric designed to measure unintended bias. | Civil Comments | `unbiased` | 0.94734 | 0.93639
37
- | [Jigsaw Multilingual Toxic Comment Classification](https://www.kaggle.com/c/jigsaw-multilingual-toxic-comment-classification) | 2020 | build effective multilingual models | Wikipedia Comments + Civil Comments | `multilingual` | 0.9536 | 0.91655*
38
-
39
- *Score not directly comparable since it is obtained on the validation set provided and not on the test set. To update when the test labels are made available.
40
-
41
- It is also worth noting that the top leaderboard scores have been achieved using model ensembles. The purpose of this library was to build something user-friendly and straightforward to use.
42
-
43
- ## Limitations and ethical considerations
44
-
45
- If words that are associated with swearing, insults or profanity are present in a comment, it is likely that it will be classified as toxic, regardless of the tone or the intent of the author e.g. humorous/self-deprecating. This could present some biases towards already vulnerable minority groups.
46
-
47
- The intended use of this library is for research purposes, fine-tuning on carefully constructed datasets that reflect real world demographics and/or to aid content moderators in flagging out harmful content quicker.
48
-
49
- Some useful resources about the risk of different biases in toxicity or hate speech detection are:
50
- - [The Risk of Racial Bias in Hate Speech Detection](https://homes.cs.washington.edu/~msap/pdfs/sap2019risk.pdf)
51
- - [Automated Hate Speech Detection and the Problem of Offensive Language](https://arxiv.org/pdf/1703.04009.pdf%201.pdf)
52
- - [Racial Bias in Hate Speech and Abusive Language Detection Datasets](https://arxiv.org/pdf/1905.12516.pdf)
53
-
54
- ## Quick prediction
55
-
56
-
57
- The `multilingual` model has been trained on 7 different languages so it should only be tested on: `english`, `french`, `spanish`, `italian`, `portuguese`, `turkish` or `russian`.
58
-
59
- ```bash
60
- # install detoxify
61
-
62
- pip install detoxify
63
-
64
- ```
65
- ```python
66
-
67
- from detoxify import Detoxify
68
-
69
- # each model takes in either a string or a list of strings
70
-
71
- results = Detoxify('original').predict('example text')
72
-
73
- results = Detoxify('unbiased').predict(['example text 1','example text 2'])
74
-
75
- results = Detoxify('multilingual').predict(['example text','exemple de texte','texto de ejemplo','testo di esempio','texto de exemplo','örnek metin','пример текста'])
76
-
77
- # optional to display results nicely (will need to pip install pandas)
78
-
79
- import pandas as pd
80
-
81
- print(pd.DataFrame(results, index=input_text).round(5))
82
-
83
- ```
84
- For more details check the Prediction section.
85
-
86
-
87
- ## Labels
88
- All challenges have a toxicity label. The toxicity labels represent the aggregate ratings of up to 10 annotators according to the following schema:
89
- - **Very Toxic** (a very hateful, aggressive, or disrespectful comment that is very likely to make you leave a discussion or give up on sharing your perspective)
90
- - **Toxic** (a rude, disrespectful, or unreasonable comment that is somewhat likely to make you leave a discussion or give up on sharing your perspective)
91
- - **Hard to Say**
92
- - **Not Toxic**
93
-
94
- More information about the labelling schema can be found [here](https://www.kaggle.com/c/jigsaw-unintended-bias-in-toxicity-classification/data).
95
-
96
- ### Toxic Comment Classification Challenge
97
- This challenge includes the following labels:
98
-
99
- - `toxic`
100
- - `severe_toxic`
101
- - `obscene`
102
- - `threat`
103
- - `insult`
104
- - `identity_hate`
105
-
106
- ### Jigsaw Unintended Bias in Toxicity Classification
107
- This challenge has 2 types of labels: the main toxicity labels and some additional identity labels that represent the identities mentioned in the comments.
108
-
109
- Only identities with more than 500 examples in the test set (combined public and private) are included during training as additional labels and in the evaluation calculation.
110
-
111
- - `toxicity`
112
- - `severe_toxicity`
113
- - `obscene`
114
- - `threat`
115
- - `insult`
116
- - `identity_attack`
117
- - `sexual_explicit`
118
-
119
- Identity labels used:
120
- - `male`
121
- - `female`
122
- - `homosexual_gay_or_lesbian`
123
- - `christian`
124
- - `jewish`
125
- - `muslim`
126
- - `black`
127
- - `white`
128
- - `psychiatric_or_mental_illness`
129
-
130
- A complete list of all the identity labels available can be found [here](https://www.kaggle.com/c/jigsaw-unintended-bias-in-toxicity-classification/data).
131
-
132
-
133
- ### Jigsaw Multilingual Toxic Comment Classification
134
-
135
- Since this challenge combines the data from the previous 2 challenges, it includes all labels from above, however the final evaluation is only on:
136
-
137
- - `toxicity`
138
-
139
- ## How to run
140
-
141
- First, install dependencies
142
- ```bash
143
- # clone project
144
-
145
- git clone https://github.com/unitaryai/detoxify
146
-
147
- # create virtual env
148
-
149
- python3 -m venv toxic-env
150
- source toxic-env/bin/activate
151
-
152
- # install project
153
-
154
- pip install -e detoxify
155
- cd detoxify
156
-
157
- # for training
158
- pip install -r requirements.txt
159
-
160
- ```
161
-
162
- ## Prediction
163
-
164
- Trained models summary:
165
-
166
- |Model name| Transformer type| Data from
167
- |:--:|:--:|:--:|
168
- |`original`| `bert-base-uncased` | Toxic Comment Classification Challenge
169
- |`unbiased`| `roberta-base`| Unintended Bias in Toxicity Classification
170
- |`multilingual`| `xlm-roberta-base`| Multilingual Toxic Comment Classification
171
-
172
- For a quick prediction can run the example script on a comment directly or from a txt containing a list of comments.
173
- ```bash
174
-
175
- # load model via torch.hub
176
-
177
- python run_prediction.py --input 'example' --model_name original
178
-
179
- # load model from checkpoint path
180
-
181
- python run_prediction.py --input 'example' --from_ckpt_path model_path
182
-
183
- # save results to a .csv file
184
-
185
- python run_prediction.py --input test_set.txt --model_name original --save_to results.csv
186
-
187
- # to see usage
188
-
189
- python run_prediction.py --help
190
-
191
- ```
192
-
193
- Checkpoints can be downloaded from the latest release or via the Pytorch hub API with the following names:
194
- - `toxic_bert`
195
- - `unbiased_toxic_roberta`
196
- - `multilingual_toxic_xlm_r`
197
- ```bash
198
- model = torch.hub.load('unitaryai/detoxify','toxic_bert')
199
- ```
200
-
201
- Importing detoxify in python:
202
-
203
- ```python
204
-
205
- from detoxify import Detoxify
206
-
207
- results = Detoxify('original').predict('some text')
208
-
209
- results = Detoxify('unbiased').predict(['example text 1','example text 2'])
210
-
211
- results = Detoxify('multilingual').predict(['example text','exemple de texte','texto de ejemplo','testo di esempio','texto de exemplo','örnek metin','пример текста'])
212
-
213
- # to display results nicely
214
-
215
- import pandas as pd
216
-
217
- print(pd.DataFrame(results,index=input_text).round(5))
218
-
219
- ```
220
-
221
-
222
- ## Training
223
-
224
- If you do not already have a Kaggle account:
225
- - you need to create one to be able to download the data
226
-
227
- - go to My Account and click on Create New API Token - this will download a kaggle.json file
228
-
229
- - make sure this file is located in ~/.kaggle
230
-
231
- ```bash
232
-
233
- # create data directory
234
-
235
- mkdir jigsaw_data
236
- cd jigsaw_data
237
-
238
- # download data
239
-
240
- kaggle competitions download -c jigsaw-toxic-comment-classification-challenge
241
-
242
- kaggle competitions download -c jigsaw-unintended-bias-in-toxicity-classification
243
-
244
- kaggle competitions download -c jigsaw-multilingual-toxic-comment-classification
245
-
246
- ```
247
- ## Start Training
248
- ### Toxic Comment Classification Challenge
249
-
250
- ```bash
251
-
252
- python create_val_set.py
253
-
254
- python train.py --config configs/Toxic_comment_classification_BERT.json
255
- ```
256
- ### Unintended Bias in Toxicity Challenge
257
-
258
- ```bash
259
-
260
- python train.py --config configs/Unintended_bias_toxic_comment_classification_RoBERTa.json
261
-
262
- ```
263
- ### Multilingual Toxic Comment Classification
264
-
265
- This is trained in 2 stages. First, train on all available data, and second, train only on the translated versions of the first challenge.
266
-
267
- The [translated data](https://www.kaggle.com/miklgr500/jigsaw-train-multilingual-coments-google-api) can be downloaded from Kaggle in french, spanish, italian, portuguese, turkish, and russian (the languages available in the test set).
268
-
269
- ```bash
270
-
271
- # stage 1
272
-
273
- python train.py --config configs/Multilingual_toxic_comment_classification_XLMR.json
274
-
275
- # stage 2
276
-
277
- python train.py --config configs/Multilingual_toxic_comment_classification_XLMR_stage2.json
278
-
279
- ```
280
- ### Monitor progress with tensorboard
281
-
282
- ```bash
283
-
284
- tensorboard --logdir=./saved
285
-
286
- ```
287
- ## Model Evaluation
288
-
289
- ### Toxic Comment Classification Challenge
290
-
291
- This challenge is evaluated on the mean AUC score of all the labels.
292
-
293
- ```bash
294
-
295
- python evaluate.py --checkpoint saved/lightning_logs/checkpoints/example_checkpoint.pth --test_csv test.csv
296
-
297
- ```
298
- ### Unintended Bias in Toxicity Challenge
299
-
300
- This challenge is evaluated on a novel bias metric that combines different AUC scores to balance overall performance. More information on this metric [here](https://www.kaggle.com/c/jigsaw-unintended-bias-in-toxicity-classification/overview/evaluation).
301
-
302
- ```bash
303
-
304
- python evaluate.py --checkpoint saved/lightning_logs/checkpoints/example_checkpoint.pth --test_csv test.csv
305
-
306
- # to get the final bias metric
307
- python model_eval/compute_bias_metric.py
308
-
309
- ```
310
- ### Multilingual Toxic Comment Classification
311
-
312
- This challenge is evaluated on the AUC score of the main toxic label.
313
-
314
- ```bash
315
-
316
- python evaluate.py --checkpoint saved/lightning_logs/checkpoints/example_checkpoint.pth --test_csv test.csv
317
-
318
- ```
319
-
320
- ### Citation
321
- ```
322
- @misc{Detoxify,
323
- title={Detoxify},
324
- author={Hanu, Laura and {Unitary team}},
325
- howpublished={Github. https://github.com/unitaryai/detoxify},
326
- year={2020}
327
- }
328
- ```
329
-
 
1
+ ---
2
+ license: apache-2.0
3
+ pipeline_tag: "text-classification"
4
+ language: en
5
+ tags:
6
+ - Bert
7
+ widget:
8
+ - text: ["Is this review positive or negative? Review: Best cast iron skillet you will every buy."]
9
+
10
+ ---
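
The added lines form a YAML front-matter block delimited by `---` fences. As a minimal sketch, assuming the block sits at the very top of the file, the snippet below separates that metadata from the rest of a README using only the standard library. The helper name `split_front_matter` and the abbreviated sample string are illustrative and not part of the commit.

```python
def split_front_matter(text: str):
    """Split README text into (front_matter, body) on the leading '---' fences."""
    lines = text.splitlines()
    # Front matter must start on the first line of the file.
    if not lines or lines[0].strip() != "---":
        return "", text
    # Look for the closing fence.
    for i in range(1, len(lines)):
        if lines[i].strip() == "---":
            return "\n".join(lines[1:i]), "\n".join(lines[i + 1:])
    # Opening fence without a closing one: treat everything as body.
    return "", text


# Abbreviated copy of the metadata added in this commit (widget entry omitted).
readme = """---
license: apache-2.0
pipeline_tag: "text-classification"
language: en
tags:
- Bert
---
"""

front_matter, body = split_front_matter(readme)
print(front_matter)
```

The returned `front_matter` string can then be passed to any YAML parser for validation; no particular parser is assumed here.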