JungleLee/bert-toxic-comment-classification

JungleLee committed on Mar 10, 2023

Commit c5769ef · 1 Parent(s): 3f0efc6

Create README.md

Files changed (1)

README.md +36 -0

README.md ADDED

	@@ -0,0 +1,36 @@

+---
+license: afl-3.0
+datasets:
+- jigsaw_toxicity_pred
+language:
+- en
+metrics:
+- accuracy
+library_name: transformers
+pipeline_tag: text-classification
+---
+## Model description
+This model is a fine-tuned version of [bert-base-uncased](https://huggingface.co/transformers/model_doc/bert.html) that classifies toxic comments.
+## How to use
+You can use the model with the following code.
+```python
+from transformers import AutoModelForSequenceClassification, AutoTokenizer, TextClassificationPipeline
+model_path = "JungleLee/bert-toxic-comment-classification"
+tokenizer = AutoTokenizer.from_pretrained(model_path)
+model = AutoModelForSequenceClassification.from_pretrained(model_path)
+pipeline = TextClassificationPipeline(model=model, tokenizer=tokenizer)
+print(pipeline("You're a fucking nerd."))
+```
+## Training data
+The training data comes from this [Kaggle competition](https://www.kaggle.com/c/jigsaw-unintended-bias-in-toxicity-classification/data). We use 90% of the `train.csv` data to train the model.
+## Evaluation results
+The model achieves 0.95 AUC on a held-out test set of 1,500 rows.
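
The README does not say how the AUC was computed, so the following is only a minimal sketch of what the metric measures, assuming binary toxic/non-toxic labels and one toxicity score per comment. The function name `roc_auc` and the toy labels and scores are hypothetical and are not taken from the model or its test set.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve, computed by pairwise comparison.

    This equals the probability that a randomly chosen positive example
    gets a higher score than a randomly chosen negative one, with ties
    counted as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative label")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical toy data: 1 = toxic, 0 = non-toxic.
toy_labels = [1, 0, 1, 0, 0]
toy_scores = [0.9, 0.2, 0.4, 0.6, 0.1]
toy_auc = roc_auc(toy_labels, toy_scores)
print(toy_auc)
```

The pairwise form is quadratic in the number of rows, which is still cheap for a test set of about 1,500 comments. A rank-based implementation from an existing metrics library would be the more usual choice on larger sets.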