---
language: en
license: apache-2.0
tags:
- toxicity
- text-classification
- transformers
- distilbert
datasets:
- fizzbuzz/cleaned-toxic-comments
metrics:
- accuracy
- f1
- precision
- recall
model-index:
- name: distilbert-toxic-comments
  results:
  - task:
      type: text-classification
      name: Toxicity Detection
    dataset:
      name: Cleaned Toxic Comments (Kaggle)
      type: fizzbuzz/cleaned-toxic-comments
      split: test
    metrics:
    - type: accuracy
      value: 0.94
    - type: f1
      value: 0.93
    - type: precision
      value: 0.93
    - type: recall
      value: 0.93
---

# DistilBERT Toxic Comment Classifier 🛡️

This is a **DistilBERT-based binary classifier** fine-tuned to detect **toxic vs. non-toxic comments** using the [Cleaned Toxic Comments dataset](https://www.kaggle.com/datasets/fizzbuzz/cleaned-toxic-comments).

---

## Model Performance

- **Accuracy:** ~94%  
- **Class metrics:**
  - **Non-toxic (0):** Precision 0.96 | Recall 0.95 | F1 0.95  
  - **Toxic (1):** Precision 0.90 | Recall 0.91 | F1 0.91  

---

## Dataset

- **Name:** Cleaned Toxic Comments (FizzBuzz @ Kaggle)  
- **Language:** English  
- **Classes:**  
  - `0` = Non-toxic  
  - `1` = Toxic  
- **Balancing:** To reduce class imbalance, undersampling was applied to the majority (non-toxic) class.  
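
The undersampling step can be sketched in plain Python. This is a minimal illustration, not the exact preprocessing script; the `toxic` label key and row layout are assumptions about the dataset's schema:

```python
import random

def undersample_majority(rows, label_key="toxic", seed=42):
    """Downsample the majority class so both classes have equal counts.

    `rows` is a list of dicts, each carrying a binary label under `label_key`.
    """
    rng = random.Random(seed)
    # Group rows by class label.
    by_label = {}
    for row in rows:
        by_label.setdefault(row[label_key], []).append(row)
    # Every class is sampled down to the size of the smallest class.
    n = min(len(group) for group in by_label.values())
    balanced = []
    for group in by_label.values():
        balanced.extend(rng.sample(group, n))
    rng.shuffle(balanced)
    return balanced
```

With 8 non-toxic and 2 toxic rows in, you get 2 of each back, which is the balancing behavior described above.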

---

## Training Details

| Hyperparameter | Value |
|----------------|-------|
| Base model | `distilbert-base-uncased` |
| Epochs | 3 |
| Batch size | 32 |
| Learning rate | 2e-5 |
| Loss function | CrossEntropyLoss (with undersampling) |

- **Optimizer:** AdamW  
- **Framework:** Hugging Face Transformers  
- **Hardware:** Google Colab GPU   
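
The table above maps onto a `Trainer` setup roughly as follows. This is a sketch under the stated hyperparameters, not the exact training script; `train_dataset` and `eval_dataset` are assumed to be pre-tokenized `datasets` splits:

```python
from transformers import (
    AutoModelForSequenceClassification,
    Trainer,
    TrainingArguments,
)

# Hyperparameters from the table above.
HPARAMS = {"epochs": 3, "batch_size": 32, "learning_rate": 2e-5}

def build_trainer(train_dataset, eval_dataset, output_dir="distilbert-toxic-comments"):
    """Assemble a Trainer matching the reported setup."""
    model = AutoModelForSequenceClassification.from_pretrained(
        "distilbert-base-uncased", num_labels=2
    )
    args = TrainingArguments(
        output_dir=output_dir,
        num_train_epochs=HPARAMS["epochs"],
        per_device_train_batch_size=HPARAMS["batch_size"],
        learning_rate=HPARAMS["learning_rate"],
        # AdamW is the Trainer's default optimizer, matching the setup above.
    )
    return Trainer(
        model=model,
        args=args,
        train_dataset=train_dataset,
        eval_dataset=eval_dataset,
    )
```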

---

## How to Use

Load with the Hugging Face `pipeline`:

```python
from transformers import pipeline

classifier = pipeline("text-classification", model="YamenRM/distilbert-toxic-comments")

print(classifier("I hate everyone, you're the worst!"))
# [{'label': 'toxic', 'score': 0.97}]
```

## Considerations

Because the non-toxic class was undersampled during training, the model may be less reliable on the heavily imbalanced distributions typical of real-world comment streams.

If toxic content is very rare in your target domain, the model may produce more false positives or negatives than expected; consider tuning the decision threshold for your data.
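
Raising the decision threshold for the toxic class is one such mitigation. A hypothetical post-processing helper over the pipeline's top-1 output (the label names follow the usage example above):

```python
def apply_threshold(results, threshold=0.8, toxic_label="toxic"):
    """Re-label text-classification pipeline outputs with a custom threshold.

    `results` is a list of {'label': ..., 'score': ...} dicts, where `score`
    is the probability of the *predicted* label.
    """
    relabeled = []
    for r in results:
        # Recover the toxic-class probability from the top-1 output
        # (binary classifier, so the two class probabilities sum to 1).
        p_toxic = r["score"] if r["label"] == toxic_label else 1.0 - r["score"]
        label = toxic_label if p_toxic >= threshold else "non-toxic"
        relabeled.append({"label": label, "score": p_toxic})
    return relabeled
```

For example, with `threshold=0.8` a comment predicted toxic at score 0.6 is relabeled non-toxic, trading recall for precision.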

This model was trained only on English text; performance may drop for non-English or mixed-language inputs.

## Acknowledgements & License

Thanks to the Kaggle community for sharing the Cleaned Toxic Comments dataset.

Built using Hugging Face’s transformers & datasets libraries.

License: Apache-2.0

## Contact & Feedback

If you find issues, want improvements (e.g. support for other languages, finer toxicity categories), or want to collaborate, feel free to open an issue or contact me at yamenrafat132@gmail.com.