Text Classification
Transformers
PyTorch
English
bert
hate-speech
hate-speech-detection
text-embeddings-inference
# Load model directly
from transformers import AutoTokenizer, AutoModelForSequenceClassification
tokenizer = AutoTokenizer.from_pretrained("ctoraman/hate-speech-bert")
model = AutoModelForSequenceClassification.from_pretrained("ctoraman/hate-speech-bert")
hate-speech-bert (base-uncased)
A BERT (base-uncased) model fine-tuned for English hate speech detection on the Toraman22 v2 dataset, published at https://github.com/metunlp/hate-speech
Class labels:
"0": Neutral
"1": Offensive
"2": Hate
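The label ids above can be turned into readable predictions with a small decoding step. The following is a minimal sketch, not the authors' code: the names `ID2LABEL`, `softmax`, and `decode_prediction` are hypothetical helpers, and the logits in the example are made up for illustration. It assumes the classification head emits one score per class in the order listed above.

```python
import math

# Label mapping as listed in the model card.
ID2LABEL = {0: "Neutral", 1: "Offensive", 2: "Hate"}

def softmax(logits):
    """Convert raw class scores to probabilities (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def decode_prediction(logits):
    """Return (label, probability) for the highest-scoring class."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return ID2LABEL[best], probs[best]

# Hypothetical logits for illustration only (not real model output).
label, prob = decode_prediction([0.2, 2.5, -1.0])
print(label, round(prob, 3))
```

With the tokenizer and model loaded as shown earlier, the input list would typically come from something like `model(**tokenizer(text, return_tensors="pt")).logits[0].tolist()`. Whether the model's own config already carries these label names is not stated in the card, so the explicit mapping is kept here.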
The dataset was randomly split into 80% training and 20% test data.
After 5 epochs of training, train_loss is 0.0948.
Evaluation gives eval_f1 of 0.9426 and eval_accuracy of 0.9430.
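A random 80-20 split like the one described can be sketched as follows. This is an illustrative sketch only: the function name `train_test_split_80_20`, the seed value, and the toy data are assumptions, and the card does not say which tool or seed was actually used.

```python
import random

def train_test_split_80_20(examples, seed=42):
    """Shuffle a copy of the examples and split them 80% train / 20% test."""
    rng = random.Random(seed)  # seed is an assumption, chosen for repeatability
    shuffled = list(examples)
    rng.shuffle(shuffled)
    cut = len(shuffled) * 8 // 10  # integer arithmetic avoids float rounding
    return shuffled[:cut], shuffled[cut:]

# Toy (text, label) pairs for illustration only; not the Toraman22 data.
data = [(f"text {i}", i % 3) for i in range(10)]
train, test = train_test_split_80_20(data)
print(len(train), len(test))
```

Fixing the seed keeps the split reproducible across runs, which matters when comparing scores between training runs.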
BibTeX entry and citation info
@InProceedings{toraman2022large,
author = {Toraman, Cagri and \c{S}ahinu\c{c}, Furkan and Yilmaz, Eyup Halit},
title = {Large-Scale Hate Speech Detection with Cross-Domain Transfer},
booktitle = {Proceedings of the Language Resources and Evaluation Conference},
month = {June},
year = {2022},
address = {Marseille, France},
publisher = {European Language Resources Association},
pages = {2215--2225},
url = {https://aclanthology.org/2022.lrec-1.238}
}
# Use a pipeline as a high-level helper
from transformers import pipeline
pipe = pipeline("text-classification", model="ctoraman/hate-speech-bert")