| license: mit | |
| language: | |
| - en | |
| metrics: | |
| - accuracy | |
| # Toxic post classification using DistilBERT | |
| This model fine-tunes a pretrained DistilBERT as a classifier on the Jigsaw Toxic Comment dataset: https://www.kaggle.com/c/jigsaw-toxic-comment-classification-challenge | |
| The goal is to classify whether a comment is toxic or not. Note that the labels in the original dataset are more fine-grained (i.e. they distinguish different types of toxicity). | |
| The model here obtains a test accuracy of 95% on a balanced split. |
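
The card does not say how the fine-grained labels were reduced to a single toxic/non-toxic label, or how the balanced split was built. The sketch below shows one plausible approach, not necessarily the one used for this model. It assumes a comment counts as toxic if any of the six label columns in the Kaggle `train.csv` is set, and that balancing is done by downsampling the larger class. The helpers `make_row`, `to_binary_label`, and `balanced_sample` are hypothetical names introduced for illustration, and the rows are toy data.

```python
import random

# Label columns in the Jigsaw Toxic Comment train.csv.
LABEL_COLUMNS = ["toxic", "severe_toxic", "obscene", "threat", "insult", "identity_hate"]


def make_row(text, **flags):
    """Build a toy row shaped like one record of train.csv (hypothetical helper)."""
    row = {"comment_text": text}
    row.update({col: 0 for col in LABEL_COLUMNS})
    row.update(flags)
    return row


def to_binary_label(row):
    """Assumption: a comment is toxic (1) if any fine-grained flag is set, else 0."""
    return int(any(int(row[col]) for col in LABEL_COLUMNS))


def balanced_sample(rows, seed=0):
    """Assumption: balance classes by downsampling the larger class."""
    rng = random.Random(seed)
    toxic = [r for r in rows if to_binary_label(r) == 1]
    clean = [r for r in rows if to_binary_label(r) == 0]
    n = min(len(toxic), len(clean))
    sample = rng.sample(toxic, n) + rng.sample(clean, n)
    rng.shuffle(sample)
    return sample


# Toy data standing in for the real dataset.
rows = [
    make_row("thanks for the edit"),
    make_row("nice article"),
    make_row("see the talk page"),
    make_row("you are an idiot", toxic=1, insult=1),
    make_row("placeholder abusive comment", obscene=1),
]

balanced = balanced_sample(rows)
labels = [to_binary_label(r) for r in balanced]
print(len(balanced), sum(labels))
```

On a balanced split like this, a classifier that always predicts one class scores 50% accuracy, which is the baseline against which the reported 95% should be read.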