---
language: en
license: apache-2.0
tags:
- toxicity
- text-classification
- transformers
- distilbert
datasets:
- fizzbuzz/cleaned-toxic-comments
metrics:
- accuracy
- f1
- precision
- recall
model-index:
- name: distilbert-toxic-comments
  results:
  - task:
      type: text-classification
      name: Toxicity Detection
    dataset:
      name: Cleaned Toxic Comments (Kaggle)
      type: fizzbuzz/cleaned-toxic-comments
      split: test
    metrics:
    - type: accuracy
      value: 0.94
    - type: f1
      value: 0.93
    - type: precision
      value: 0.93
    - type: recall
      value: 0.93
---

# DistilBERT Toxic Comment Classifier 🛡️

This is a **DistilBERT-based binary classifier** fine-tuned to detect **toxic vs. non-toxic comments** using the [Cleaned Toxic Comments dataset](https://www.kaggle.com/datasets/fizzbuzz/cleaned-toxic-comments).

---

## Model Performance

- **Accuracy:** ~94%
- **Class metrics:**
  - **Non-toxic (0):** Precision 0.96 | Recall 0.95 | F1 0.95
  - **Toxic (1):** Precision 0.90 | Recall 0.91 | F1 0.91

---

## Dataset

- **Name:** Cleaned Toxic Comments (FizzBuzz @ Kaggle)
- **Language:** English
- **Classes:**
  - `0` = Non-toxic
  - `1` = Toxic
- **Balancing:** To reduce class imbalance, the majority (non-toxic) class was undersampled.

---

## Training Details

| Hyperparameter | Value |
|----------------|-------|
| Base model | `distilbert-base-uncased` |
| Epochs | 3 |
| Batch size | 32 |
| Learning rate | 2e-5 |
| Loss function | CrossEntropyLoss (with undersampling) |

- **Optimizer:** AdamW
- **Framework:** Hugging Face Transformers
- **Hardware:** Google Colab GPU

---

## How to Use

Load the model with the Hugging Face `pipeline`:

```python
from transformers import pipeline

classifier = pipeline("text-classification", model="YamenRM/distilbert-toxic-comments")
print(classifier("I hate everyone, you're the worst!"))
# [{'label': 'toxic', 'score': 0.97}]
```

## Considerations

Because the non-toxic class was undersampled during training, the model might be less robust on very large, unbalanced datasets in real-world settings.
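If false positives are costly in your deployment, one mitigation is to apply your own decision threshold to the raw scores rather than taking the pipeline's top label. Below is a minimal sketch assuming the `toxic` label name shown in the usage example and the per-label output of `pipeline(..., top_k=None)`; the `is_toxic` helper and the 0.8 threshold are illustrative, not part of the released model:

```python
def is_toxic(label_scores, threshold=0.8):
    """Decide toxicity from the pipeline output for one text.

    `label_scores` is the list of {'label': ..., 'score': ...} dicts that
    pipeline(..., top_k=None) returns for a single input. The comment is
    flagged as toxic only when the toxic score clears `threshold`.
    """
    scores = {item["label"]: item["score"] for item in label_scores}
    return scores.get("toxic", 0.0) >= threshold

# Example usage (requires network access to download the model):
# from transformers import pipeline
# classifier = pipeline("text-classification",
#                       model="YamenRM/distilbert-toxic-comments",
#                       top_k=None)
# is_toxic(classifier("I hate everyone, you're the worst!")[0], threshold=0.8)
```

Raising the threshold trades recall for precision on the toxic class; tune it against a validation sample drawn from your own domain.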
If toxic content is very rare in your target domain, the model might produce more false positives or false negatives than expected.

This model was trained only on English text, so performance may drop on non-English or mixed-language input.

## Acknowledgements & License

- Thanks to the Kaggle community for sharing the Cleaned Toxic Comments dataset.
- Built with Hugging Face's `transformers` and `datasets` libraries.
- License: Apache-2.0

## Contact & Feedback

If you find issues, want improvements (e.g. support for other languages or finer-grained toxicity categories), or want to collaborate, feel free to open an issue or contact me at yamenrafat132@gmail.com.