---
license: mit
datasets:
- haipradana/indonesian-twitter-hate-speech-cleaned
language:
- id
tags:
- bert
- RoBERTa
- tweet
- hate
- twitter
base_model:
- cardiffnlp/twitter-roberta-base-sentiment-latest
---

# Fine-tuned RoBERTa model for classifying Indonesian hate tweets

The full code and a Google Colab notebook are available on GitHub: https://github.com/haipradana/RoBERTa-Indonesian-Hate-Tweet-Classification/tree/main

This project fine-tunes a RoBERTa model from [cardiffnlp/twitter-roberta-base-sentiment-latest](https://huggingface.co/cardiffnlp/twitter-roberta-base-sentiment-latest) to classify Indonesian tweets as either **neutral** or **hate speech**.

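For reference, the fine-tune can be reproduced along these lines. This is a minimal sketch, not the exact training code from the repository; the `text`/`label` column names, the `train` split, and the hyperparameters are assumptions:

```python
from datasets import load_dataset
from transformers import (AutoTokenizer, AutoModelForSequenceClassification,
                          Trainer, TrainingArguments)

base = "cardiffnlp/twitter-roberta-base-sentiment-latest"
tokenizer = AutoTokenizer.from_pretrained(base)
# Re-initialise the classification head for 2 labels (0 = neutral, 1 = hate);
# the base model ships with 3 sentiment labels, hence ignore_mismatched_sizes
model = AutoModelForSequenceClassification.from_pretrained(
    base, num_labels=2, ignore_mismatched_sizes=True
)

# Column names and split are assumptions about the dataset layout
dataset = load_dataset("haipradana/indonesian-twitter-hate-speech-cleaned")

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

tokenized = dataset.map(tokenize, batched=True)

args = TrainingArguments(output_dir="./model", num_train_epochs=3,
                         per_device_train_batch_size=16)
trainer = Trainer(model=model, args=args, train_dataset=tokenized["train"],
                  tokenizer=tokenizer)  # passing the tokenizer enables dynamic padding
trainer.train()
trainer.save_model("./model")
```
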
## How to use this model?

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# Load the fine-tuned model (replace './model' with your local path or Hub model ID)
tokenizer = AutoTokenizer.from_pretrained('./model')
model = AutoModelForSequenceClassification.from_pretrained('./model')

# Predict a single tweet: class 1 -> hate, class 0 -> neutral
def predict(text):
    inputs = tokenizer(text, return_tensors="pt", truncation=True, padding=True, max_length=512)
    with torch.no_grad():
        outputs = model(**inputs)
    prediction = torch.argmax(outputs.logits, dim=1).item()
    return 'hate' if prediction == 1 else 'neutral'

# Example ("Are your lungs made of stone? You're this sick and still smoking!")
result = predict("Paru-parumu terbuat dari batu ya? udah sakit gini masih aja merokok!")
print(result)  # Output: hate
```
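
Alternatively, the Transformers `text-classification` pipeline wraps the same steps. This is a minimal sketch assuming the fine-tuned weights live in `./model`; the returned label names depend on the saved config:

```python
from transformers import pipeline

# Load tokenizer + model from the local directory (or a Hub model ID)
classifier = pipeline("text-classification", model="./model")

print(classifier("Paru-parumu terbuat dari batu ya? udah sakit gini masih aja merokok!"))
# e.g. [{'label': 'LABEL_1', 'score': ...}] -- label names and scores depend on the saved config
```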

### Or use the prediction script from the GitHub repo

```bash
cd scripts
python predict.py
```

## Performance Metrics

| Metric    | Value  |
|-----------|--------|
| Accuracy  | 82.01% |
| Precision | 82.68% |
| Recall    | 81.72% |
| F1-score  | 82.19% |
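
For reference, metrics like these can be computed on a held-out test split with scikit-learn. This is a sketch, not the repository's evaluation code; `y_true` and `y_pred` below are placeholder arrays of 0 (neutral) / 1 (hate) labels:

```python
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

# Placeholder labels -- in practice these come from the test split and model predictions
y_true = [0, 1, 1, 0, 1]
y_pred = [0, 1, 0, 0, 1]

accuracy = accuracy_score(y_true, y_pred)
precision, recall, f1, _ = precision_recall_fscore_support(y_true, y_pred, average="binary")
print(f"Accuracy: {accuracy:.2%}  Precision: {precision:.2%}  Recall: {recall:.2%}  F1: {f1:.2%}")
```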