Exqrch/IndoBERTweet-Profanity

Text Classification

Exqrch committed on Aug 16, 2024

Commit c1c4b6e · verified · 1 parent: dbad3d6

Update README.md

Files changed (1)

README.md +58 -3

@@ -1,3 +1,58 @@
----
-license: cc-by-sa-4.0
----

+---
+license: cc-by-sa-4.0
+---
+# IndoBERTweet-Profanity
+## Model Description
+IndoBERTweet fine-tuned on the IndoToxic2024 dataset. It reaches an accuracy of 0.81 and a macro-F1 of 0.70, both measured with stratified 10-fold cross-validation.
+## Supported Tokenizer
+- **indolem/indobertweet-base-uncased**
+## Example Code
+```python
+import torch
+from transformers import AutoModelForSequenceClassification, AutoTokenizer
+# Specify the model and tokenizer name
+model_name = "Exqrch/IndoBERTweet-Profanity"
+tokenizer_name = "indolem/indobertweet-base-uncased"
+# Load the fine-tuned model
+model = AutoModelForSequenceClassification.from_pretrained(model_name)
+# Load the tokenizer
+tokenizer = AutoTokenizer.from_pretrained(tokenizer_name)
+# Example input ("good morning everyone!")
+text = "selamat pagi semua!"
+inputs = tokenizer(text, return_tensors="pt")
+# Run a forward pass without tracking gradients
+with torch.no_grad():
+    logits = model(**inputs).logits
+# Get the predicted class label
+predicted_class = torch.argmax(logits, dim=-1).item()
+print(predicted_class)
+# Output:
+# 0
+```
+## Limitations
+Trained only on Indonesian text. Performance on code-switched text is unknown.
+## Sample Output
+```
+Model name: Exqrch/IndoBERTweet-Profanity
+Text 1: aku butuh bantuan nih buat belajar, pc yang ingin bantu
+Prediction: 0
+Text 2: sumpah, tolol banget dah anjing ini matkul
+Prediction: 1
+```
+## Citation
+If you use this model, please cite:
+```
+@article{susanto2024indotoxic2024,
+      title={IndoToxic2024: A Demographically-Enriched Dataset of Hate Speech and Toxicity Types for Indonesian Language},
+      author={Lucky Susanto and Musa Izzanardi Wijanarko and Prasetia Anugrah Pratama and Traci Hong and Ika Idris and Alham Fikri Aji and Derry Wijaya},
+      year={2024},
+      eprint={2406.19349},
+      archivePrefix={arXiv},
+      primaryClass={cs.CL},
+      url={https://arxiv.org/abs/2406.19349},
+}
+```
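
Reviewer note on the metrics in the Model Description: the README reports accuracy and macro-F1 from stratified 10-fold cross-validation but does not show how they are computed. The sketch below is a minimal, pure-Python illustration of those three ideas. It is not the authors' evaluation code; the helper names (`accuracy`, `macro_f1`, `stratified_folds`) and the toy labels are hypothetical, and the fold assignment is a simple deterministic round-robin rather than whatever splitter was actually used.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never appears
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def stratified_folds(labels, k):
    """Split example indices into k folds with similar class ratios.

    Indices of each class are dealt round-robin across the folds
    (no shuffling, so the result is deterministic).
    """
    folds = [[] for _ in range(k)]
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)
    for indices in by_class.values():
        for position, idx in enumerate(indices):
            folds[position % k].append(idx)
    return folds


# Toy, made-up labels (hypothetical data, not from IndoToxic2024)
toy_true = [0, 0, 0, 0, 1, 1]
toy_pred = [0, 0, 0, 1, 1, 0]

print(round(accuracy(toy_true, toy_pred), 3))  # 0.667
print(macro_f1(toy_true, toy_pred))            # 0.625
print(stratified_folds([0] * 8 + [1] * 2, 2))  # [[0, 2, 4, 6, 8], [1, 3, 5, 7, 9]]
```

On the toy labels the minority class is predicted less reliably, so macro-F1 (0.625) comes out below accuracy (0.667). The same effect can separate the two numbers whenever classes are imbalanced, which is why it helps to report both.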