Commit b23ba8b · Update README.md
Parent(s): 3c336c7

README.md · CHANGED
@@ -24,6 +24,16 @@ pipeline = TextClassificationPipeline(model=model, tokenizer=tokenizer)
 print(pipeline('du bist blöd.'))
 ```
 
+You can also apply the pipeline to a whole data set, for example a pandas DataFrame of comments:
+
+```python
+# Truncate each comment to 512 characters, a crude guard against the model's maximum input length of 512 tokens.
+df['result'] = df['comment_text'].apply(lambda x: pipeline(x[:512]))
+# Split the "result" column into two new columns: one with the label, one with the score.
+df['toxic_label'] = df['result'].str[0].str['label']
+df['score'] = df['result'].str[0].str['score']
+```
+
 
 ## Training
 