---
language: en
license: apache-2.0
tags:
  - toxicity
  - text-classification
  - transformers
  - distilbert
datasets:
  - fizzbuzz/cleaned-toxic-comments
metrics:
  - accuracy
  - f1
  - precision
  - recall
model-index:
  - name: distilbert-toxic-comments
    results:
      - task:
          type: text-classification
          name: Toxicity Detection
        dataset:
          name: Cleaned Toxic Comments (Kaggle)
          type: fizzbuzz/cleaned-toxic-comments
          split: test
        metrics:
          - type: accuracy
            value: 0.94
          - type: f1
            value: 0.93
          - type: precision
            value: 0.93
          - type: recall
            value: 0.93
---

# DistilBERT Toxic Comment Classifier 🛡️

This is a DistilBERT-based binary classifier fine-tuned to detect toxic vs. non-toxic comments using the Cleaned Toxic Comments dataset.


## Model Performance

- Accuracy: ~94% (test split)
- Per-class metrics:

| Class | Precision | Recall | F1 |
|---|---|---|---|
| Non-toxic (0) | 0.96 | 0.95 | 0.95 |
| Toxic (1) | 0.90 | 0.91 | 0.91 |

## Dataset

- Name: Cleaned Toxic Comments (FizzBuzz @ Kaggle)
- Language: English
- Classes:
  - 0 = Non-toxic
  - 1 = Toxic
- Balancing: to reduce class imbalance, the majority (non-toxic) class was undersampled; see the sketch below.
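
Undersampling here means randomly discarding majority-class rows until the two classes are the same size. A minimal sketch with pandas; the file name and the `toxic` column name are assumptions about the Kaggle CSV layout, not details from this card:

```python
import pandas as pd

# Load the Kaggle CSV; file name and "toxic" label column are assumed.
df = pd.read_csv("cleaned_toxic_comments.csv")

toxic = df[df["toxic"] == 1]
# Randomly drop non-toxic rows until both classes have the same count.
non_toxic = df[df["toxic"] == 0].sample(n=len(toxic), random_state=42)

# Concatenate and shuffle the balanced dataset.
balanced = pd.concat([toxic, non_toxic]).sample(frac=1, random_state=42)
print(balanced["toxic"].value_counts())
```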

## Training Details

| Hyperparameter | Value |
|---|---|
| Base model | distilbert-base-uncased |
| Epochs | 3 |
| Batch size | 32 |
| Learning rate | 2e-5 |
| Loss function | CrossEntropyLoss (with undersampling) |

- Optimizer: AdamW
- Framework: Hugging Face Transformers
- Hardware: Google Colab GPU
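
For reference, the table above corresponds roughly to a `Trainer` setup like the following. This is an illustrative reconstruction, not the exact training script; `train_ds` is a placeholder for a tokenized, undersampled dataset prepared as in the previous section:

```python
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

base = "distilbert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(base)
# num_labels=2 gives a binary head trained with CrossEntropyLoss internally.
model = AutoModelForSequenceClassification.from_pretrained(base, num_labels=2)

args = TrainingArguments(
    output_dir="distilbert-toxic-comments",
    num_train_epochs=3,
    per_device_train_batch_size=32,
    learning_rate=2e-5,  # Trainer's default optimizer is AdamW
)

# train_ds: a tokenized, label-balanced datasets.Dataset (placeholder here).
trainer = Trainer(model=model, args=args, train_dataset=train_ds, tokenizer=tokenizer)
trainer.train()
```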

## How to Use

Load the model with the Hugging Face `pipeline`:

```python
from transformers import pipeline

classifier = pipeline("text-classification", model="YamenRM/distilbert-toxic-comments")

print(classifier("I hate everyone, you're the worst!"))
# [{'label': 'toxic', 'score': 0.97}]
```
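
For batch scoring, or when you want raw class probabilities rather than a single label, the model can also be loaded directly. A sketch; the assumption that column 1 of the logits is the toxic class follows the class mapping in the Dataset section:

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_id = "YamenRM/distilbert-toxic-comments"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)
model.eval()

texts = ["Have a great day!", "I hate everyone, you're the worst!"]
inputs = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits

# Softmax over the two classes; column 1 = toxic (per the Dataset section).
probs = torch.softmax(logits, dim=-1)
for text, p_toxic in zip(texts, probs[:, 1].tolist()):
    print(f"{p_toxic:.3f}  {text}")
```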

## Considerations

- Because the non-toxic class was undersampled during training, the model may be less robust on very large, heavily imbalanced datasets in real-world settings.
- If toxic content is very rare in your target domain, the model may produce more false positives or false negatives than the metrics above suggest; one mitigation is adjusting the decision threshold, as sketched below.
- The model was trained only on English text, so performance may drop on non-English or mixed-language input.
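
To trade recall for precision, you can score both labels and flag a comment as toxic only above a cutoff higher than the default argmax behaviour. A sketch; the `toxic` label string is taken from the pipeline example above, and the 0.8 cutoff is an illustrative value to tune on a validation sample from your own domain:

```python
from transformers import pipeline

classifier = pipeline(
    "text-classification",
    model="YamenRM/distilbert-toxic-comments",
    top_k=None,  # return scores for every label, not just the top one
)

THRESHOLD = 0.8  # illustrative; tune on your own validation data

def is_toxic(text: str) -> bool:
    # Map label names to scores, then apply the stricter cutoff.
    scores = {r["label"]: r["score"] for r in classifier(text)}
    return scores.get("toxic", 0.0) >= THRESHOLD

print(is_toxic("I hate everyone, you're the worst!"))
```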

## Acknowledgements & License

- Thanks to the Kaggle community for sharing the Cleaned Toxic Comments dataset.
- Built with Hugging Face's `transformers` and `datasets` libraries.
- License: Apache-2.0

## Contact & Feedback

If you find issues, want improvements (e.g. support for other languages or finer-grained toxicity categories), or want to collaborate, feel free to open an issue or contact me at yamenrafat132@gmail.com.