---
language: en
license: apache-2.0
tags:
- toxicity
- text-classification
- transformers
- distilbert
datasets:
- fizzbuzz/cleaned-toxic-comments
metrics:
- accuracy
- f1
- precision
- recall
model-index:
- name: distilbert-toxic-comments
  results:
  - task:
      type: text-classification
      name: Toxicity Detection
    dataset:
      name: Cleaned Toxic Comments (Kaggle)
      type: fizzbuzz/cleaned-toxic-comments
      split: test
    metrics:
    - type: accuracy
      value: 0.94
    - type: f1
      value: 0.93
    - type: precision
      value: 0.93
    - type: recall
      value: 0.93
---

# DistilBERT Toxic Comment Classifier 🛡️

This is a **DistilBERT-based binary classifier** fine-tuned to detect **toxic vs. non-toxic comments** using the [Cleaned Toxic Comments dataset](https://www.kaggle.com/datasets/fizzbuzz/cleaned-toxic-comments).

---

## Model Performance

- **Overall accuracy:** ~94%

Per-class metrics on the test split:

| Class | Precision | Recall | F1 |
|---------------|-----------|--------|------|
| Non-toxic (0) | 0.96 | 0.95 | 0.95 |
| Toxic (1) | 0.90 | 0.91 | 0.91 |
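
Per-class numbers like these can be reproduced with scikit-learn's `classification_report` once you have predictions for the test split; a minimal sketch (the arrays below are placeholders, not real model outputs):

```python
from sklearn.metrics import classification_report

# Placeholder arrays -- in practice, y_true comes from the test split's labels
# and y_pred from running the classifier over its texts.
y_true = [0, 0, 1, 1, 0, 1]
y_pred = [0, 0, 1, 0, 0, 1]

print(classification_report(y_true, y_pred, target_names=["non-toxic (0)", "toxic (1)"]))
```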

---
|
## Dataset

- **Name:** Cleaned Toxic Comments (FizzBuzz @ Kaggle)
- **Language:** English
- **Classes:**
  - `0` = Non-toxic
  - `1` = Toxic
- **Balancing:** To reduce class imbalance, undersampling was applied to the majority (non-toxic) class (see the sketch after this list).
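
A minimal sketch of the undersampling step with pandas, assuming hypothetical file and column names (the exact preprocessing script is not published with this card):

```python
import pandas as pd

# Hypothetical file and column names -- adjust to the Kaggle CSV's actual schema.
df = pd.read_csv("cleaned_toxic_comments.csv")

toxic = df[df["toxic"] == 1]
non_toxic = df[df["toxic"] == 0]

# Undersample the majority (non-toxic) class down to the minority class size,
# then shuffle the combined frame.
balanced = pd.concat(
    [toxic, non_toxic.sample(n=len(toxic), random_state=42)]
).sample(frac=1, random_state=42)

print(balanced["toxic"].value_counts())
```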
|
---
|
## Training Details

| Hyperparameter | Value |
|----------------|-------|
| Base model | `distilbert-base-uncased` |
| Epochs | 3 |
| Batch size | 32 |
| Learning rate | 2e-5 |
| Loss function | CrossEntropyLoss (with undersampling) |

- **Optimizer:** AdamW
- **Framework:** Hugging Face Transformers
- **Hardware:** Google Colab GPU
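
A sketch of what fine-tuning with these hyperparameters could look like using `Trainer` (a reconstruction, not the original training script; `Trainer` defaults to AdamW and cross-entropy loss for sequence classification):

```python
from datasets import Dataset
from transformers import (
    AutoTokenizer,
    AutoModelForSequenceClassification,
    Trainer,
    TrainingArguments,
)

tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "distilbert-base-uncased", num_labels=2
)

# Toy stand-in for the balanced train split (see the Dataset section).
train_dataset = Dataset.from_dict(
    {"text": ["you are the worst", "have a nice day"], "label": [1, 0]}
).map(lambda ex: tokenizer(ex["text"], truncation=True, padding="max_length", max_length=128))

# Hyperparameters from the table above.
args = TrainingArguments(
    output_dir="distilbert-toxic-comments",
    num_train_epochs=3,
    per_device_train_batch_size=32,
    learning_rate=2e-5,
)

Trainer(model=model, args=args, train_dataset=train_dataset).train()
```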
|
---
|
## How to Use

Load with the Hugging Face `pipeline`:

```python
from transformers import pipeline

classifier = pipeline("text-classification", model="YamenRM/distilbert-toxic-comments")

print(classifier("I hate everyone, you're the worst!"))
# [{'label': 'toxic', 'score': 0.97}]
```
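
If you need raw class probabilities or batched inference, you can also load the tokenizer and model directly (the label names come from the model's config, matching the pipeline output above):

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_id = "YamenRM/distilbert-toxic-comments"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

inputs = tokenizer("I hate everyone, you're the worst!", return_tensors="pt", truncation=True)
with torch.no_grad():
    probs = torch.softmax(model(**inputs).logits, dim=-1)[0]

pred = int(probs.argmax())
print(model.config.id2label[pred], round(float(probs[pred]), 2))
```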

## Considerations

- Because the non-toxic class was undersampled during training, the model may be less robust on very large, highly imbalanced datasets in real-world settings.
- If toxic content is very rare in your target domain, the model may produce more false positives or false negatives than expected.
- The model was trained only on English text, so performance may drop on non-English or mixed-language inputs.

## Acknowledgements & License

Thanks to the Kaggle community for sharing the Cleaned Toxic Comments dataset.

Built using Hugging Face's `transformers` and `datasets` libraries.

License: [Apache-2.0](https://www.apache.org/licenses/LICENSE-2.0)

## Contact & Feedback

If you find issues, want improvements (e.g., support for other languages or finer-grained toxicity categories), or want to collaborate, feel free to open an issue or contact me at yamenrafat132@gmail.com.