---
license: mit
datasets:
- thesofakillers/jigsaw-toxic-comment-classification-challenge
language:
- en
metrics:
- accuracy
- f1
tags:
- text-classification
- toxic_comment
- nlp
- transformers
- distilbert
pipeline_tag: text-classification
---

# Toxic Comment Classifier (DistilBERT, uncased)

This model is a fine-tuned **DistilBERT (uncased)** model for **toxic comment classification**. It classifies comments as either **toxic** or **non-toxic**.

## Training

The model was fine-tuned with the Hugging Face `Trainer` API on the labeled Jigsaw Toxic Comment Classification Challenge dataset listed above.

Evaluation metrics:

- **Accuracy:** ~97%
- **F1 score:** ~83%
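For reference, the sketch below shows what such a `Trainer` fine-tuning run could look like. It is a minimal illustration under stated assumptions (the base checkpoint name, dataset column names, and hyperparameters are assumed), not the exact training script.

```python
# Minimal fine-tuning sketch. The base checkpoint, dataset column names,
# and hyperparameters here are illustrative assumptions, not the exact
# configuration used to train this model.
from datasets import load_dataset
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

base_model = "distilbert-base-uncased"  # assumed base checkpoint
tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForSequenceClassification.from_pretrained(base_model, num_labels=2)

# Assumed schema: a "text" column and a binary "label" column (1 = toxic).
dataset = load_dataset("thesofakillers/jigsaw-toxic-comment-classification-challenge")

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, padding="max_length", max_length=256)

tokenized = dataset.map(tokenize, batched=True)

args = TrainingArguments(
    output_dir="toxic-comment-classifier",
    num_train_epochs=2,
    per_device_train_batch_size=16,
    learning_rate=2e-5,
)

trainer = Trainer(model=model, args=args, train_dataset=tokenized["train"])
trainer.train()
```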

## Intended Use

- Detecting toxic or harmful language in text.
- Content moderation in forums, social media, and chat systems (a minimal moderation sketch follows this list).
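For instance, a moderation hook could hold messages the classifier labels toxic with high confidence. This is a sketch only: the `0.9` threshold and the `moderate` helper are hypothetical, while the label mapping mirrors the Usage section below.

```python
# Hypothetical moderation helper; the threshold and the review action are
# illustrative assumptions, not part of the model itself.
from transformers import AutoModelForSequenceClassification, AutoTokenizer, pipeline

model_id = "Youssef-El-SaYed/toxic-comment-classifier"
id2label = {0: "Non-Toxic", 1: "Toxic"}
label2id = {"Non-Toxic": 0, "Toxic": 1}

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(
    model_id, id2label=id2label, label2id=label2id
)
classifier = pipeline("text-classification", model=model, tokenizer=tokenizer)

def moderate(message: str, threshold: float = 0.9) -> bool:
    """Return True if the message should be held for human review."""
    result = classifier(message)[0]  # e.g. {"label": "Toxic", "score": 0.98}
    return result["label"] == "Toxic" and result["score"] >= threshold

if moderate("You are so stupid and annoying!"):
    print("Message held for review.")
```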

## Limitations

- May not capture sarcasm or subtle toxicity.
- Biases in the training dataset may affect predictions.

## Usage

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification, pipeline

model_id = "Youssef-El-SaYed/toxic-comment-classifier"

# Map class indices to human-readable labels.
id2label = {0: "Non-Toxic", 1: "Toxic"}
label2id = {"Non-Toxic": 0, "Toxic": 1}

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(
    model_id,
    id2label=id2label,
    label2id=label2id,
)

# Build an inference pipeline and classify a couple of examples.
nlp = pipeline("text-classification", model=model, tokenizer=tokenizer)

print(nlp("You are so stupid and annoying!"))
print(nlp("I really like your work, keep it up!"))
```
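Each call returns a list containing the predicted label and its confidence score, e.g. `[{'label': 'Toxic', 'score': 0.99}]` (the score shown here is illustrative).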