---
language: en
license: apache-2.0
tags:
- toxicity
- text-classification
- transformers
- distilbert
datasets:
- fizzbuzz/cleaned-toxic-comments
metrics:
- accuracy
- f1
- precision
- recall
model-index:
- name: distilbert-toxic-comments
  results:
  - task:
      type: text-classification
      name: Toxicity Detection
    dataset:
      name: Cleaned Toxic Comments (Kaggle)
      type: fizzbuzz/cleaned-toxic-comments
      split: test
    metrics:
    - type: accuracy
      value: 0.94
    - type: f1
      value: 0.93
    - type: precision
      value: 0.93
    - type: recall
      value: 0.93
---

# DistilBERT Toxic Comment Classifier 🛡️

This is a **DistilBERT-based binary classifier** fine-tuned to detect **toxic vs. non-toxic comments** using the [Cleaned Toxic Comments dataset](https://www.kaggle.com/datasets/fizzbuzz/cleaned-toxic-comments).

---

## Model Performance

- **Overall accuracy:** ~94%

Per-class metrics on the test split:

| Class | Precision | Recall | F1 |
|---------------|-----------|--------|------|
| Non-toxic (0) | 0.96 | 0.95 | 0.95 |
| Toxic (1) | 0.90 | 0.91 | 0.91 |
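
Per-class numbers like these can be reproduced with scikit-learn's `classification_report` once you have predictions for the test split; a minimal sketch (the arrays below are placeholders, not real model outputs):

```python
from sklearn.metrics import classification_report

# Placeholder arrays -- in practice, y_true comes from the test split's labels
# and y_pred from running the classifier over its texts.
y_true = [0, 0, 1, 1, 0, 1]
y_pred = [0, 0, 1, 0, 0, 1]

print(classification_report(y_true, y_pred, target_names=["non-toxic (0)", "toxic (1)"]))
```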

---
|
## Dataset

- **Name:** Cleaned Toxic Comments (FizzBuzz @ Kaggle)
- **Language:** English
- **Classes:**
  - `0` = Non-toxic
  - `1` = Toxic
- **Balancing:** To reduce class imbalance, undersampling was applied to the majority (non-toxic) class (see the sketch after this list).
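
A minimal sketch of the undersampling step with pandas, assuming hypothetical file and column names (the exact preprocessing script is not published with this card):

```python
import pandas as pd

# Hypothetical file and column names -- adjust to the Kaggle CSV's actual schema.
df = pd.read_csv("cleaned_toxic_comments.csv")

toxic = df[df["toxic"] == 1]
non_toxic = df[df["toxic"] == 0]

# Undersample the majority (non-toxic) class down to the minority class size,
# then shuffle the combined frame.
balanced = pd.concat(
    [toxic, non_toxic.sample(n=len(toxic), random_state=42)]
).sample(frac=1, random_state=42)

print(balanced["toxic"].value_counts())
```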
|
---
|
## Training Details

| Hyperparameter | Value |
|----------------|-------|
| Base model | `distilbert-base-uncased` |
| Epochs | 3 |
| Batch size | 32 |
| Learning rate | 2e-5 |
| Loss function | CrossEntropyLoss (with undersampling) |

- **Optimizer:** AdamW
- **Framework:** Hugging Face Transformers
- **Hardware:** Google Colab GPU
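
A sketch of what fine-tuning with these hyperparameters could look like using `Trainer` (a reconstruction, not the original training script; `Trainer` defaults to AdamW and cross-entropy loss for sequence classification):

```python
from datasets import Dataset
from transformers import (
    AutoTokenizer,
    AutoModelForSequenceClassification,
    Trainer,
    TrainingArguments,
)

tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "distilbert-base-uncased", num_labels=2
)

# Toy stand-in for the balanced train split (see the Dataset section).
train_dataset = Dataset.from_dict(
    {"text": ["you are the worst", "have a nice day"], "label": [1, 0]}
).map(lambda ex: tokenizer(ex["text"], truncation=True, padding="max_length", max_length=128))

# Hyperparameters from the table above.
args = TrainingArguments(
    output_dir="distilbert-toxic-comments",
    num_train_epochs=3,
    per_device_train_batch_size=32,
    learning_rate=2e-5,
)

Trainer(model=model, args=args, train_dataset=train_dataset).train()
```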
|
---
|
## How to Use

Load with the Hugging Face `pipeline`:

```python
from transformers import pipeline

classifier = pipeline("text-classification", model="YamenRM/distilbert-toxic-comments")

print(classifier("I hate everyone, you're the worst!"))
# [{'label': 'toxic', 'score': 0.97}]
```
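
If you need raw class probabilities or batched inference, you can also load the tokenizer and model directly (the label names come from the model's config, matching the pipeline output above):

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_id = "YamenRM/distilbert-toxic-comments"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

inputs = tokenizer("I hate everyone, you're the worst!", return_tensors="pt", truncation=True)
with torch.no_grad():
    probs = torch.softmax(model(**inputs).logits, dim=-1)[0]

pred = int(probs.argmax())
print(model.config.id2label[pred], round(float(probs[pred]), 2))
```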

## Considerations

- Because the non-toxic class was undersampled during training, the model may be less robust on very large, highly imbalanced datasets in real-world settings.
- If toxic content is very rare in your target domain, the model may produce more false positives or false negatives than expected.
- The model was trained only on English text, so performance may drop on non-English or mixed-language inputs.

## Acknowledgements & License

Thanks to the Kaggle community for sharing the Cleaned Toxic Comments dataset.

Built using Hugging Face's `transformers` and `datasets` libraries.

License: [Apache-2.0](https://www.apache.org/licenses/LICENSE-2.0)

## Contact & Feedback

If you find issues, want improvements (e.g., support for other languages or finer-grained toxicity categories), or want to collaborate, feel free to open an issue or contact me at yamenrafat132@gmail.com.