SinaLab/Offensive-Hebrew

Text Classification

feature-extraction

text-embeddings-inference


Hebrew Corpus

This corpus contains manually annotated offensive language in Hebrew. The data includes 15,881 tweets, each labeled with one or more of five classes (abusive, hate, violence, pornographic, or non-offensive). The corpus was annotated manually by Arabic-Hebrew bilingual speakers.

https://arxiv.org/abs/2309.02724
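
Because a tweet can carry more than one of the five classes, a multi-hot vector is one common way to represent its labels for training. The sketch below is illustrative only: the class strings are taken from the description above, while the function names and the assumption that labels arrive as plain strings are hypothetical and may not match the actual column layout of the released data.

```python
from typing import Iterable, List

# Class names as listed in the corpus description. The exact spelling used
# in the released data files is an assumption.
CLASSES = ["abusive", "hate", "violence", "pornographic", "non-offensive"]


def encode_labels(labels: Iterable[str]) -> List[int]:
    """Turn a collection of class names into a multi-hot vector over CLASSES."""
    label_set = set(labels)
    unknown = label_set - set(CLASSES)
    if unknown:
        raise ValueError(f"Unknown label(s): {sorted(unknown)}")
    return [1 if name in label_set else 0 for name in CLASSES]


def decode_labels(vector: List[int]) -> List[str]:
    """Turn a multi-hot vector back into class names, in CLASSES order."""
    if len(vector) != len(CLASSES):
        raise ValueError("Vector length does not match the number of classes")
    return [name for name, flag in zip(CLASSES, vector) if flag]


# Hypothetical example: a tweet annotated as both hate and violence.
vec = encode_labels(["hate", "violence"])
print(vec)
print(decode_labels(vec))
```

A vector of this shape pairs naturally with a per-class sigmoid output, so each class is predicted independently rather than forcing a single label per tweet.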

Models

AlephBERT (https://huggingface.co/imvladikon/sentence-transformers-alephbert)
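
As a rough sketch, the linked AlephBERT checkpoint can probably be loaded with the `sentence-transformers` library to obtain sentence embeddings for Hebrew text. The model ID below is read off the URL above; the `embed` helper is a hypothetical name, and the snippet assumes `sentence-transformers` is installed and the checkpoint can be downloaded. It produces embeddings only, not offensive-language predictions; a classifier would still need to be trained on top using the corpus labels.

```python
from typing import List

# Model ID taken from the Hugging Face URL listed above.
MODEL_ID = "imvladikon/sentence-transformers-alephbert"


def embed(texts: List[str]):
    """Return one embedding vector per input text (hypothetical helper)."""
    # Imported lazily so the module can be loaded without the dependency;
    # requires `pip install sentence-transformers`.
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer(MODEL_ID)  # downloads weights on first use
    return model.encode(texts)


if __name__ == "__main__":
    vectors = embed(["שלום עולם"])  # "Hello world" in Hebrew
    print(vectors.shape)
```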

GitHub Repository

git clone https://github.com/SinaLab/OffensiveHebrew

You can download the data from the following GitHub link:

https://github.com/SinaLab/OffensiveHebrew/tree/main/data


Paper

Offensive Hebrew Corpus and Detection using BERT (arXiv:2309.02724, published Sep 6, 2023)