---
license: mit
datasets:
- tdavidson/hate_speech_offensive
base_model:
- FacebookAI/roberta-large
pipeline_tag: text-classification
library_name: transformers
---
# Davidson RoBERTa Hate Speech Classifier
|
- Model: roberta-large fine-tuned for 3-way classification (toxic, neutral, non-toxic).
- Dataset: tdavidson/hate_speech_offensive (Twitter), split locally into train/val/test.
- Metrics (test): reported in metrics.json, included in this repository.
- Intended use: content moderation research and demos; not for deployment without a bias/fairness review.
- Limitations/risks: inherits social biases from the training data; the dataset is dated and Twitter-specific, so domain mismatch is likely; errors are possible on slang and irony.
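
The card mentions a local train/val/test split of the dataset. The exact split used for this model is not specified; below is a minimal sketch of one common 80/10/10 index-splitting scheme (the seed and fractions are illustrative assumptions, not the values used in training):

```python
import random

def split_indices(n, seed=42, val_frac=0.1, test_frac=0.1):
    """Shuffle indices and carve out validation and test slices."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_val = int(n * val_frac)
    n_test = int(n * test_frac)
    train = idx[n_val + n_test:]
    val = idx[:n_val]
    test = idx[n_val:n_val + n_test]
    return train, val, test

train, val, test = split_indices(1000)
print(len(train), len(val), len(test))  # 800 100 100
```

Fixing the seed keeps the split reproducible across runs, which matters when comparing checkpoints against the same held-out test set.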
## Usage

```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer, pipeline

mid = "Yash22CSU192/davidson-roberta-hatespeech"

# Load the tokenizer and fine-tuned model
tok = AutoTokenizer.from_pretrained(mid)
mdl = AutoModelForSequenceClassification.from_pretrained(mid)

# Create a text-classification pipeline; top_k=None returns scores for all labels
# (return_all_scores=True is deprecated in recent transformers releases)
clf = pipeline("text-classification", model=mdl, tokenizer=tok, top_k=None)

# Classify a sample sentence
print(clf("Have a nice day."))
```
## Files
- model.safetensors, config.json, tokenizer.json, tokenizer_config.json, vocab.json, merges.txt, special_tokens_map.json
- training_args.bin (Trainer settings), metrics.json (evaluation summary)