arxiv:2104.01443

Sexism detection: The first corpus in Algerian dialect with a code-switching in Arabic/ French and English

Published on Apr 3, 2021

Authors:

Abstract

A deep learning approach for detecting hate speech against women in Arabic social media content is presented, with a new annotated corpus and comparative analysis of CNN, LSTM, and Bi-LSTM architectures.

AI-generated summary

In this paper, an approach for hate speech detection against women in Arabic community on social media (e.g. Youtube) is proposed. In the literature, similar works have been presented for other languages such as English. However, to the best of our knowledge, not much work has been conducted in the Arabic language. A new hate speech corpus (Arabic\_fr\_en) is developed using three different annotators. For corpus validation, three different machine learning algorithms are used, including deep Convolutional Neural Network (CNN), long short-term memory (LSTM) network and Bi-directional LSTM (Bi-LSTM) network. Simulation results demonstrate the best performance of the CNN model, which achieved F1-score up to 86\% for the unbalanced corpus as compared to LSTM and Bi-LSTM.

View arXiv page View PDF Add to collection

Community

Upload images, audio, and videos by dragging in the text input, pasting, or clicking here.

Tap or paste here to upload images

· Sign up or log in to comment

Upvote

Models citing this paper 0

No model linking this paper

Cite arxiv.org/abs/2104.01443 in a model README.md to link it from this page.

Datasets citing this paper 0

No dataset linking this paper

Cite arxiv.org/abs/2104.01443 in a dataset README.md to link it from this page.

Spaces citing this paper 0

No Space linking this paper

Cite arxiv.org/abs/2104.01443 in a Space README.md to link it from this page.

Collections including this paper 0

No Collection including this paper

Add this paper to a collection to link it from this page.