---
language:
- th
library_name: transformers
pipeline_tag: text-classification
tags:
- thai
- toxicity-detection
- hate-speech
- nlp
- text-classification
datasets:
- SEACrowd/thai_toxicity_tweet
metrics:
- accuracy
- f1
model-index:
- name: thai-toxic-classifier
  results: []
---

# Thai Toxic Classifier 🇹🇭

A Thai-language toxicity detection model that classifies a Thai sentence as **toxic** or **non-toxic**.

The model is intended for research and experimentation in **Thai NLP safety, moderation systems, and toxicity analysis**.

Repository:  
https://huggingface.co/mashironotdev/thai-toxic-classifier

---

# Model Details

## Model Description

This model performs **binary text classification** on Thai text:

| Label | Meaning |
|-----|-----|
| 0 | non-toxic |
| 1 | toxic |
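
The label scheme in the table can be sketched as a small lookup. This is a minimal illustration, not part of the released code: `ID2LABEL` and `readable_label` are hypothetical names, and the sketch assumes the pipeline may return the generic `LABEL_<id>` form that `transformers` uses when a model config defines no custom label names.

```python
# Label ids and names as listed in the table above.
ID2LABEL = {0: "non-toxic", 1: "toxic"}


def readable_label(raw_label: str) -> str:
    """Map a pipeline label such as 'LABEL_1' to 'toxic'.

    Assumption: labels of the form 'LABEL_<id>' are the generic defaults;
    any other label is treated as already readable and returned unchanged.
    """
    if raw_label.startswith("LABEL_"):
        label_id = int(raw_label.split("_", 1)[1])
        return ID2LABEL[label_id]
    return raw_label


print(readable_label("LABEL_0"))
print(readable_label("LABEL_1"))
```

If the model config already stores `non-toxic` and `toxic` as label names, the helper simply passes them through.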

Example:

| Text | Prediction |
|-----|-----|
| สวัสดีครับ | non-toxic |
| ขอบคุณมากครับ | non-toxic |
| มึงโง่หรือไง | toxic |
| ไอ้ควาย | toxic |

---

## Intended Use

This model is designed for:

- Thai toxicity detection research
- content moderation experiments
- NLP benchmarking
- Thai language safety evaluation

Possible downstream uses:

- chat moderation
- comment filtering
- social media toxicity analysis

---

## Out-of-Scope Use

This model **should not be used for:**

- legal moderation decisions
- automated punishment systems
- sensitive content governance without human oversight
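
One way to keep a human in the loop is to treat the model output as a triage signal rather than a verdict. The sketch below is illustrative only: the `triage` function, the `min_confidence` value of 0.8, and the returned action names are assumptions, not something this model card specifies.

```python
def triage(label: str, score: float, min_confidence: float = 0.8) -> str:
    """Turn one prediction into a moderation action that never punishes automatically.

    Assumptions: `label` is "toxic" or "non-toxic", `score` is the model's
    confidence in that label, and `min_confidence` is an arbitrary example value.
    """
    if score < min_confidence:
        # The model is unsure either way, so a person should look at it.
        return "review_uncertain"
    if label == "toxic":
        # Confident toxic predictions are prioritized for review, not auto-actioned.
        return "review_priority"
    # Confident non-toxic predictions pass through.
    return "allow"


print(triage("toxic", 0.97))
print(triage("non-toxic", 0.55))
print(triage("non-toxic", 0.99))
```

Only confident non-toxic predictions bypass review here; everything else is queued for a human decision.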

---

# Training Data

The model was trained on Thai toxicity datasets including:

- Thai Toxicity Tweet dataset (`SEACrowd/thai_toxicity_tweet`)
- synthetic toxic Thai sentences
- Thai profanity word lists

The combined data consists of Thai sentences labeled as **toxic** or **non-toxic**.

---

# Training Procedure

## Preprocessing

Typical preprocessing steps:

- Thai text normalization
- tokenization using the model tokenizer
- padding and truncation
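
The card does not say which normalization rules were applied, so the following is only a generic sketch of what a Thai text normalization step might look like. The function name `normalize_thai` and the specific rules (Unicode NFC, zero-width character removal, capping repeated characters at three, whitespace collapsing) are assumptions chosen for illustration. Tokenization, padding, and truncation are left to the model tokenizer.

```python
import re
import unicodedata

# Zero-width characters that often appear in copied Thai text.
_ZERO_WIDTH = re.compile("[\u200b\u200c\u200d\ufeff]")
# Runs of four or more identical characters, e.g. elongated "มากกกกกก".
_LONG_RUN = re.compile(r"(.)\1{3,}")


def normalize_thai(text: str) -> str:
    """Apply a few generic cleanup rules before tokenization (illustrative only)."""
    text = unicodedata.normalize("NFC", text)  # canonical Unicode form
    text = _ZERO_WIDTH.sub("", text)           # drop invisible characters
    text = _LONG_RUN.sub(r"\1\1\1", text)      # cap character runs at three
    return " ".join(text.split())              # collapse and trim whitespace


print(normalize_thai("  สวัสดี\u200b   ครับ "))
print(normalize_thai("มากกกกกก"))
```

The cleaned strings can then be passed to the tokenizer or pipeline in place of the raw input.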

---

## Training Configuration

Example configuration:

---

# Quick Usage

```python
# install dependencies
# pip install transformers torch

from transformers import pipeline

# load model from Hugging Face
classifier = pipeline(
    "text-classification",
    model="mashironotdev/thai-toxic-classifier"
)

# example inputs
texts = [
    "สวัสดีครับ",
    "ขอบคุณมากครับ",
    "มึงโง่หรือไง",
    "ไอ้ควาย"
]

# run inference
results = classifier(texts)

# print results
for text, result in zip(texts, results):
    print(text, "->", result)
```