How to use from the Transformers library
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-classification", model="SinaLab/Offensive-Hebrew")

# Load the model directly, including its sequence-classification head
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("SinaLab/Offensive-Hebrew")
model = AutoModelForSequenceClassification.from_pretrained("SinaLab/Offensive-Hebrew")
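The snippets above only load the model. As a minimal sketch of how predictions might then be obtained, the helper below runs the pipeline on a list of texts and keeps the highest-scoring label for each. The names `MODEL_ID`, `top_label`, and `classify` are hypothetical, introduced here for illustration, and the label strings the model returns are not documented on this page, so check them against the model's configuration before relying on them.

```python
from typing import Dict, List

MODEL_ID = "SinaLab/Offensive-Hebrew"


def top_label(scores: List[Dict]) -> Dict:
    """Return the highest-scoring entry from a list of {"label", "score"} dicts."""
    return max(scores, key=lambda item: item["score"])


def classify(texts: List[str]) -> List[Dict]:
    """Classify each text and keep only its best-scoring label (hypothetical helper)."""
    # Imported lazily so top_label can be used without transformers installed.
    from transformers import pipeline

    pipe = pipeline("text-classification", model=MODEL_ID)
    # top_k=None asks the pipeline to return a score for every label,
    # giving one list of {"label", "score"} dicts per input text.
    all_scores = pipe(texts, top_k=None)
    return [top_label(scores) for scores in all_scores]
```

Calling `classify(["..."])` downloads the model on first use. If several classes should be reported per tweet, thresholding the per-label scores instead of taking the maximum may be more appropriate.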
Quick Links

Hebrew Corpus

This corpus contains manually annotated offensive language in Hebrew. It comprises 15,881 tweets, each labeled with one or more of five classes (abusive, hate, violence, pornographic, or non-offensive). The annotation was carried out manually by Arabic-Hebrew bilingual speakers.
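Because a tweet can carry more than one class, a multi-hot vector is one common way to represent its labels. The sketch below illustrates that encoding using the five class names from the description above. The exact label strings and file layout of the released data are assumptions, and `CLASSES` and `to_multi_hot` are hypothetical names, not part of the corpus tooling.

```python
from typing import Iterable, List

# Class names as listed in the corpus description; the exact strings
# used in the released data files are an assumption.
CLASSES = ["abusive", "hate", "violence", "pornographic", "non-offensive"]


def to_multi_hot(labels: Iterable[str]) -> List[int]:
    """Encode a tweet's labels as a 0/1 vector ordered like CLASSES."""
    label_set = set(labels)
    unknown = label_set - set(CLASSES)
    if unknown:
        raise ValueError(f"Unknown labels: {sorted(unknown)}")
    return [1 if name in label_set else 0 for name in CLASSES]
```

For example, a tweet annotated as both hate and violence maps to a vector with ones in the second and third positions.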

https://arxiv.org/abs/2309.02724

Models

AlephBERT (https://huggingface.co/imvladikon/sentence-transformers-alephbert)

GitHub Repository

git clone https://github.com/SinaLab/OffensiveHebrew

You can download the data from the following GitHub link:

https://github.com/SinaLab/OffensiveHebrew/tree/main/data
