Commit b23ba8b · Update README.md
Parent(s): 3c336c7

README.md · CHANGED
@@ -24,6 +24,16 @@ pipeline = TextClassificationPipeline(model=model, tokenizer=tokenizer)
 print(pipeline('du bist blöd.'))
 ```
 
+You can also apply the pipeline to a whole data set, for example a pandas DataFrame of comments:
+
+```python
+# Truncate each comment to 512 characters, a crude guard against the model's maximum input length of 512 tokens.
+df['result'] = df['comment_text'].apply(lambda x: pipeline(x[:512]))
+# Split the "result" column into two new columns: one with the label, one with the score.
+df['toxic_label'] = df['result'].str[0].str['label']
+df['score'] = df['result'].str[0].str['score']
+```
+
 
 ## Training
 