archich/reddit-hate-speech-detector


	---
	language:
	- en
	- hi
	license: apache-2.0
	tags:
	- hate-speech-detection
	- reddit
	- xlm-roberta
	- hindi
	- english
	datasets:
	- HASOC2019
	metrics:
	- accuracy
	- f1
	model-index:
	- name: reddit-hate-speech-detector
	  results:
	  - task:
	      type: text-classification
	    metrics:
	    - type: accuracy
	      value: 0.8293
	    - type: f1
	      value: 0.8278
	---

	# Reddit Hate Speech Detector (Hindi + English)

	This model detects hate speech in Reddit comments written in Hindi or English.

	## Model Description

	- Base Model: XLM-RoBERTa
	- Languages: Hindi, English
	- Task: Multi-task classification (hate speech detection, hate speech type, and target)
	- Accuracy: 82.93%
	- F1 Score: 0.8278
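
For readers who want to reproduce numbers of this kind, the sketch below shows how accuracy and F1 are computed from labels and predictions. The card does not say which F1 averaging was used, so macro averaging here is an assumption, and the labels are toy values rather than the model's evaluation data.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def f1_for_label(y_true, y_pred, label):
    """F1 score for one label, treated as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == label and p == label for t, p in pairs)
    fp = sum(t != label and p == label for t, p in pairs)
    fn = sum(t == label and p != label for t, p in pairs)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-label F1 scores (assumed averaging)."""
    labels = sorted(set(y_true) | set(y_pred))
    return sum(f1_for_label(y_true, y_pred, lab) for lab in labels) / len(labels)


# Toy labels for illustration only: 1 = hate, 0 = not hate
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

print(f"accuracy = {accuracy(y_true, y_pred):.4f}")
print(f"macro F1 = {macro_f1(y_true, y_pred):.4f}")
```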

	## Intended Use

	This model is designed for:
	- Content moderation on Reddit
	- Automated hate speech detection
	- Research purposes

	⚠️ Important: This model should assist human moderators, not replace them.
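
One way to keep a human in the loop is to let the model's score only decide whether a comment is queued for review, never whether it is removed. The sketch below illustrates that idea; the function name `route_comment`, the action strings, and the 0.5 threshold are hypothetical and not part of this model.

```python
def route_comment(hate_probability, review_threshold=0.5):
    """Decide what to do with a comment given the model's hate probability.

    The model never removes content on its own: a high score only
    places the comment in a queue for a human moderator.
    The threshold is a hypothetical default, not a tuned value.
    """
    if not 0.0 <= hate_probability <= 1.0:
        raise ValueError("hate_probability must be between 0 and 1")
    if hate_probability >= review_threshold:
        return "send_to_human_review"
    return "no_action"


# Example scores (made up for illustration)
for score in (0.12, 0.50, 0.93):
    print(score, route_comment(score))
```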

	## Usage

	```python
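	import torch  # backend for the "pt" tensors returned by the tokenizer
	from transformers import XLMRobertaTokenizer

	# Load the tokenizer of the base model (requires the sentencepiece package)
	tokenizer = XLMRobertaTokenizer.from_pretrained('xlm-roberta-base')

	# Tokenize a comment into PyTorch tensors (input_ids, attention_mask)
	inputs = tokenizer("Example comment", return_tensors="pt", truncation=True)

	# Your model loading code here
	# (See inference script)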
	```
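
The inference script is not part of this card. As a rough sketch only, a model with the three outputs listed under Model Description might share one encoder across three linear heads. Everything below is an assumption: the class name `MultiTaskClassifier`, the head names, the label counts (`num_types=3`, `num_targets=2`), and `ToyEncoder`, a stand-in used in place of XLM-RoBERTa so the example runs offline.

```python
from types import SimpleNamespace

import torch
from torch import nn


class MultiTaskClassifier(nn.Module):
    """Three linear heads on top of a shared encoder (hypothetical layout)."""

    def __init__(self, encoder, hidden_size, num_types, num_targets):
        super().__init__()
        self.encoder = encoder
        self.hate_head = nn.Linear(hidden_size, 2)              # hate / not hate
        self.type_head = nn.Linear(hidden_size, num_types)      # assumed label count
        self.target_head = nn.Linear(hidden_size, num_targets)  # assumed label count

    def forward(self, input_ids, attention_mask):
        outputs = self.encoder(input_ids=input_ids, attention_mask=attention_mask)
        pooled = outputs.last_hidden_state[:, 0]  # first-token representation
        return {
            "hate": self.hate_head(pooled),
            "type": self.type_head(pooled),
            "target": self.target_head(pooled),
        }


class ToyEncoder(nn.Module):
    """Stand-in encoder so the sketch runs offline; it is not XLM-RoBERTa."""

    def __init__(self, vocab_size=100, hidden_size=16):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden_size)

    def forward(self, input_ids, attention_mask):
        return SimpleNamespace(last_hidden_state=self.embed(input_ids))


model = MultiTaskClassifier(ToyEncoder(), hidden_size=16, num_types=3, num_targets=2)
input_ids = torch.randint(0, 100, (4, 10))  # batch of 4 sequences, 10 tokens each
logits = model(input_ids, torch.ones_like(input_ids))
print({name: tuple(t.shape) for name, t in logits.items()})
```

To try the same layout with the real base model, `XLMRobertaModel.from_pretrained('xlm-roberta-base')` could be passed as `encoder` with `hidden_size=768`. The trained weights and the actual label sets would still have to come from the author's inference script.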

	## Training Data

	- HASOC 2019 Hindi Dataset
	- HASOC 2019 English Dataset
	- Both datasets combined for training, with class balancing
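
The card does not state which class balancing method was used. One common option is to weight each class inversely to its frequency when computing the loss; the sketch below illustrates that approach under this assumption. The helper name `balanced_class_weights` and the toy label counts are hypothetical.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count).

    Rare classes get weights above 1 and frequent classes below 1,
    so each class contributes equally to a weighted loss.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {label: n_samples / (n_classes * count) for label, count in counts.items()}


# Toy label distribution for illustration, not the real dataset statistics
labels = ["NOT"] * 6 + ["HOF"] * 2
weights = balanced_class_weights(labels)
print(weights)
```

Weights of this kind could then be passed to a weighted loss such as `torch.nn.CrossEntropyLoss(weight=...)` in the order of the class indices.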

	## Limitations

	- May reproduce biases present in the training data
	- Accurate detection often depends on context that a single comment may not provide
	- Cultural nuances may not be fully captured

	## Ethical Considerations

	- Use the model transparently
	- Allow users to appeal decisions
	- Monitor regularly for fairness
	- Consider cultural context
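
As one possible way to monitor fairness, the sketch below compares the false positive rate (non-hateful comments flagged as hate) between groups, for example Hindi and English comments. The grouping, the helper name `false_positive_rate_by_group`, and the records are illustrative assumptions, not results from this model.

```python
from collections import defaultdict


def false_positive_rate_by_group(records):
    """Compute the false positive rate per group.

    records: iterable of (group, true_label, predicted_label),
    where 1 = hate and 0 = not hate.
    Groups without any non-hateful example are omitted.
    """
    false_positives = defaultdict(int)
    negatives = defaultdict(int)
    for group, y_true, y_pred in records:
        if y_true == 0:
            negatives[group] += 1
            if y_pred == 1:
                false_positives[group] += 1
    return {group: false_positives[group] / n for group, n in negatives.items()}


# Made-up audit sample: (language, true label, predicted label)
records = [
    ("en", 0, 0), ("en", 0, 1), ("en", 1, 1), ("en", 0, 0),
    ("hi", 0, 1), ("hi", 0, 1), ("hi", 1, 1), ("hi", 0, 0),
]
rates = false_positive_rate_by_group(records)
for group, rate in sorted(rates.items()):
    print(f"{group}: false positive rate = {rate:.2f}")
```

A large gap between groups in such an audit would be a signal to review the training data or the decision threshold for the affected group.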