---
language: en
license: apache-2.0
tags:
- toxicity
- text-classification
- transformers
- distilbert
datasets:
- fizzbuzz/cleaned-toxic-comments
metrics:
- accuracy
- f1
- precision
- recall
model-index:
- name: distilbert-toxic-comments
  results:
  - task:
      type: text-classification
      name: Toxicity Detection
    dataset:
      name: Cleaned Toxic Comments (Kaggle)
      type: fizzbuzz/cleaned-toxic-comments
      split: test
    metrics:
    - type: accuracy
      value: 0.94
    - type: f1
      value: 0.93
    - type: precision
      value: 0.93
    - type: recall
      value: 0.93
---

# DistilBERT Toxic Comment Classifier 🛡️

This is a **DistilBERT-based binary classifier** fine-tuned to detect **toxic vs. non-toxic comments** using the [Cleaned Toxic Comments dataset](https://www.kaggle.com/datasets/fizzbuzz/cleaned-toxic-comments).

---

## Model Performance

- **Accuracy:** ~94%
- **Class metrics:**
  - **Non-toxic (0):** Precision 0.96 | Recall 0.95 | F1 0.95
  - **Toxic (1):** Precision 0.90 | Recall 0.91 | F1 0.91

---

## Dataset

- **Name:** Cleaned Toxic Comments (FizzBuzz @ Kaggle)
- **Language:** English
- **Classes:**
  - `0` = Non-toxic
  - `1` = Toxic
- **Balancing:** To reduce class imbalance, the majority (non-toxic) class was undersampled.

---

## Training Details

| Hyperparameter | Value |
|----------------|-------|
| Base model | `distilbert-base-uncased` |
| Epochs | 3 |
| Batch size | 32 |
| Learning rate | 2e-5 |
| Loss function | CrossEntropyLoss (with undersampling) |

- **Optimizer:** AdamW
- **Framework:** Hugging Face Transformers
- **Hardware:** Google Colab GPU

---

## How to Use

Load the model with the Hugging Face `pipeline`:

```python
from transformers import pipeline

classifier = pipeline("text-classification", model="YamenRM/distilbert-toxic-comments")
print(classifier("I hate everyone, you're the worst!"))
# [{'label': 'toxic', 'score': 0.97}]
```

## Considerations

Because the non-toxic class was undersampled during training, the model might be less robust on very large, unbalanced datasets in real-world settings.
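If false positives are costly in your deployment, one mitigation is to apply your own decision threshold to the raw scores rather than taking the pipeline's top label. Below is a minimal sketch assuming the `toxic` label name shown in the usage example and the per-label output of `pipeline(..., top_k=None)`; the `is_toxic` helper and the 0.8 threshold are illustrative, not part of the released model:

```python
def is_toxic(label_scores, threshold=0.8):
    """Decide toxicity from the pipeline output for one text.

    `label_scores` is the list of {'label': ..., 'score': ...} dicts that
    pipeline(..., top_k=None) returns for a single input. The comment is
    flagged as toxic only when the toxic score clears `threshold`.
    """
    scores = {item["label"]: item["score"] for item in label_scores}
    return scores.get("toxic", 0.0) >= threshold

# Example usage (requires network access to download the model):
# from transformers import pipeline
# classifier = pipeline("text-classification",
#                       model="YamenRM/distilbert-toxic-comments",
#                       top_k=None)
# is_toxic(classifier("I hate everyone, you're the worst!")[0], threshold=0.8)
```

Raising the threshold trades recall for precision on the toxic class; tune it against a validation sample drawn from your own domain.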
If toxic content is very rare in your target domain, the model might produce more false positives or false negatives than expected.

This model was trained only on English text, so performance may drop on non-English or mixed-language input.

## Acknowledgements & License

- Thanks to the Kaggle community for sharing the Cleaned Toxic Comments dataset.
- Built with Hugging Face's `transformers` and `datasets` libraries.
- License: Apache-2.0

## Contact & Feedback

If you find issues, want improvements (e.g. support for other languages or finer-grained toxicity categories), or want to collaborate, feel free to open an issue or contact me at yamenrafat132@gmail.com.