---
language:
- th
library_name: transformers
pipeline_tag: text-classification
tags:
- thai
- toxicity-detection
- hate-speech
- nlp
- text-classification
datasets:
- SEACrowd/thai_toxicity_tweet
metrics:
- accuracy
- f1
model-index:
- name: thai-toxic-classifier
  results: []
---
# Thai Toxic Classifier 🇹🇭
A Thai-language toxicity detection model that classifies a Thai sentence as **toxic** or **non-toxic**.
The model is intended for research and experimentation in **Thai NLP safety, moderation systems, and toxicity analysis**.
Repository:
https://huggingface.co/mashironotdev/thai-toxic-classifier
---
# Model Details
## Model Description
This model performs **binary text classification** on Thai text:
| Label | Meaning |
|-----|-----|
| 0 | non-toxic |
| 1 | toxic |
Example:
| Text | Prediction |
|-----|-----|
| สวัสดีครับ | non-toxic |
| ขอบคุณมากครับ | non-toxic |
| มึงโง่หรือไง | toxic |
| ไอ้ควาย | toxic |
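
The label table above can be applied to raw classifier output with a small helper. This is a minimal sketch: this card does not state whether the checkpoint reports generic ids such as `LABEL_0` / `LABEL_1` or readable names, so the hypothetical `readable_label` function below accepts both forms. The name `ID_TO_NAME` is also an illustration, not part of the model.

```python
# Mapping taken from the label table in this card.
ID_TO_NAME = {0: "non-toxic", 1: "toxic"}


def readable_label(raw_label: str) -> str:
    """Map a label string to 'toxic' or 'non-toxic'.

    Assumption: the label is either a readable name, a bare id ("0", "1"),
    or a generic id of the form "LABEL_<n>".
    """
    cleaned = raw_label.strip().lower()
    if cleaned in ID_TO_NAME.values():
        return cleaned
    # Strip the generic "label_" prefix if present, then parse the id.
    suffix = cleaned[len("label_"):] if cleaned.startswith("label_") else cleaned
    if suffix.isdigit() and int(suffix) in ID_TO_NAME:
        return ID_TO_NAME[int(suffix)]
    raise ValueError(f"unrecognized label: {raw_label!r}")
```

For example, `readable_label("LABEL_1")` returns `"toxic"`, and an unknown label raises `ValueError` instead of being silently mapped.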
---
## Intended Use
This model is designed for:
- Thai toxicity detection research
- content moderation experiments
- NLP benchmarking
- Thai language safety evaluation
Possible downstream uses:
- chat moderation
- comment filtering
- social media toxicity analysis
---
## Out-of-Scope Use
This model **should not be used for:**
- legal moderation decisions
- automated punishment systems
- sensitive content governance without human oversight
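
One way to keep a human in the loop is to let the model decide only what can be skipped, and send everything else to a reviewer. The sketch below assumes a result dictionary with `label` and `score` keys, as returned by a text-classification pipeline. The `triage` function, the outcome names, and the `0.9` threshold are hypothetical and should be tuned on real data.

```python
# Hypothetical threshold; this card does not specify one.
ALLOW_THRESHOLD = 0.9

# Label strings treated as non-toxic. Assumption: the checkpoint reports
# either a readable name or the generic id for label 0.
NON_TOXIC_LABELS = ("non-toxic", "label_0")


def triage(result: dict, allow_threshold: float = ALLOW_THRESHOLD) -> str:
    """Return 'allow' or 'needs_human_review'; never an automatic penalty."""
    label = result["label"].strip().lower()
    confident_clean = label in NON_TOXIC_LABELS and result["score"] >= allow_threshold
    return "allow" if confident_clean else "needs_human_review"
```

With this design, a toxic prediction never triggers an action on its own: it only places the text in a review queue, and so does any low-confidence non-toxic prediction.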
---
# Training Data
The model was trained on Thai toxicity datasets including:
- Thai Toxicity Tweet dataset
- synthetic toxic Thai sentences
- Thai profanity word lists
Each training example is a Thai sentence labeled **toxic** or **non-toxic**.
---
# Training Procedure
## Preprocessing
Typical preprocessing steps:
- Thai text normalization
- tokenization using the model tokenizer
- padding and truncation
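
The steps above can be sketched as follows. The exact normalization used in training is not documented in this card, so `normalize_thai` is a hypothetical stand-in: it applies Unicode NFC (NFKC is avoided here because it decomposes the Thai vowel SARA AM), removes zero-width characters, and collapses whitespace. The `encode` helper and its `max_length=128` default are likewise assumptions.

```python
import re
import unicodedata

# Zero-width characters that often appear in copied text (assumed list).
_ZERO_WIDTH = re.compile("[\u200b\u200c\u200d\ufeff]")
_WHITESPACE = re.compile(r"\s+")


def normalize_thai(text: str) -> str:
    """Hypothetical normalization: NFC, drop zero-width chars, collapse spaces."""
    text = unicodedata.normalize("NFC", text)
    text = _ZERO_WIDTH.sub("", text)
    return _WHITESPACE.sub(" ", text).strip()


def encode(texts, tokenizer, max_length=128):
    """Tokenize normalized texts with padding and truncation.

    `tokenizer` is expected to be the model tokenizer, for example one loaded
    with AutoTokenizer.from_pretrained("mashironotdev/thai-toxic-classifier").
    """
    cleaned = [normalize_thai(t) for t in texts]
    return tokenizer(
        cleaned,
        padding=True,         # pad to the longest sequence in the batch
        truncation=True,      # cut sequences longer than max_length
        max_length=max_length,
        return_tensors="pt",  # PyTorch tensors
    )
```

Passing the tokenizer in as an argument keeps the normalization step usable on its own, without loading the model.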
---
## Training Configuration
Example configuration:
## Quick Usage
```python
# install dependencies
# pip install transformers torch
from transformers import pipeline
# load model from Hugging Face
classifier = pipeline(
"text-classification",
model="mashironotdev/thai-toxic-classifier"
)
# example inputs
texts = [
"สวัสดีครับ",
"ขอบคุณมากครับ",
"มึงโง่หรือไง",
"ไอ้ควาย"
]
# run inference
results = classifier(texts)
# print results
for text, result in zip(texts, results):
print(text, "->", result)
```