---
language:
- vi
tags:
- hate-speech-detection
- vietnamese-nlp
- text-classification
- offensive-language-detection
license: mit
datasets:
- vihsd
base_model: vinai/bartpho-syllable-base
---
# BARTpho
BARTpho fine-tuned for Vietnamese hate speech classification
## Model Details
### Model Type
BARTpho (Bidirectional and Auto-Regressive Transformer for Vietnamese)
### Base Model
This model is fine-tuned from [vinai/bartpho-syllable-base](https://huggingface.co/vinai/bartpho-syllable-base).
### Training Info
- **Task**: Hate Speech Classification
- **Language**: Vietnamese
- **Labels**:
- `0`: CLEAN (Normal content)
- `1`: OFFENSIVE (Mildly offensive content)
- `2`: HATE (Hate speech)
## 📊 Model Performance
| Metric | Score |
|--------|-------|
| Accuracy | 0.8985 |
| F1 Macro | 0.6791 |
| F1 Weighted | 0.8886 |
## Model Description
This model has been fine-tuned on the ViHSD (Vietnamese Hate Speech Dataset) to classify Vietnamese text into three categories: CLEAN, OFFENSIVE, and HATE.
### Architecture
BARTpho (Bidirectional and Auto-Regressive Transformer for Vietnamese)

The model combines BARTpho's pretrained Vietnamese representations with task-specific fine-tuning for effective hate speech detection in Vietnamese social media content.
## How to Use
### 1. Using Transformers Pipeline
```python
from transformers import pipeline

# Initialize the hate speech classifier
classifier = pipeline(
    "text-classification",
    model="visolex/hate-speech-bartpho",
    tokenizer="visolex/hate-speech-bartpho",
    top_k=None,  # return scores for all labels (replaces the deprecated return_all_scores=True)
)

# Classify text
results = classifier("Văn bản tiếng Việt cần kiểm tra")
print(results)
```
### 2. Using AutoModel
```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# Load model and tokenizer
model_name = "visolex/hate-speech-bartpho"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

# Prepare text
text = "Văn bản tiếng Việt cần kiểm tra"
inputs = tokenizer(text, return_tensors="pt", padding=True, truncation=True, max_length=256)

# Get predictions
with torch.no_grad():
    outputs = model(**inputs)
    logits = outputs.logits

# Convert logits to probabilities
probabilities = torch.nn.functional.softmax(logits, dim=-1)

# Get predicted label and confidence
predicted_label = torch.argmax(probabilities, dim=-1).item()
confidence = probabilities[0][predicted_label].item()

# Label mapping
label_mapping = {0: "CLEAN", 1: "OFFENSIVE", 2: "HATE"}

print(f"Predicted: {label_mapping[predicted_label]} (Confidence: {confidence:.2%})")
```
### 3. Batch Processing
```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

model_name = "visolex/hate-speech-bartpho"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

# List of texts to classify
texts = [
    "Bài viết rất hay và bổ ích",
    "Đồ ngu người ta nói đúng mà",
    "Cút đi đồ chó",
]

# Tokenize and predict
inputs = tokenizer(texts, return_tensors="pt", padding=True, truncation=True, max_length=256)
with torch.no_grad():
    outputs = model(**inputs)
predictions = torch.argmax(outputs.logits, dim=-1)

labels = ["CLEAN", "OFFENSIVE", "HATE"]
for text, pred in zip(texts, predictions):
    print(f"{text[:50]} -> {labels[pred.item()]}")
```
## Training Details
### Training Data
- **Dataset**: ViHSD (Vietnamese Hate Speech Detection Dataset)
- **Total samples**: ~10,000 Vietnamese comments from social media
- **Training split**: ~70%
- **Validation split**: ~15%
- **Test split**: ~15%
### Training Configuration
- **Framework**: PyTorch + HuggingFace Transformers
- **Optimizer**: AdamW
- **Learning Rate**: 2e-5
- **Batch Size**: 32
- **Max Length**: 256 tokens
- **Epochs**: Optimized via early stopping
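The "optimized via early stopping" criterion above can be sketched as a simple patience loop: keep the checkpoint with the best validation metric and stop once it fails to improve for several consecutive epochs. The patience value of 3 and the use of macro-F1 as the monitored metric are illustrative assumptions; the card only states that early stopping was used.

```python
def train_with_early_stopping(eval_scores, patience=3):
    """Return the (1-based) epoch whose checkpoint would be kept.

    eval_scores: per-epoch validation metric (higher is better),
    e.g. macro-F1 on the ViHSD validation split.
    patience: stop after this many epochs without improvement (assumed value).
    """
    best_score = float("-inf")
    best_epoch = 0
    bad_epochs = 0
    for epoch, score in enumerate(eval_scores, start=1):
        if score > best_score:
            best_score, best_epoch = score, epoch
            bad_epochs = 0
        else:
            bad_epochs += 1
            if bad_epochs >= patience:
                break  # no improvement for `patience` epochs: stop training
    return best_epoch

# Validation F1 stalls after epoch 4, so that checkpoint is kept
print(train_with_early_stopping([0.55, 0.61, 0.64, 0.66, 0.65, 0.66, 0.64]))  # -> 4
```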
### Preprocessing
- Text normalization for Vietnamese
- Special character handling
- Emoji and slang processing
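The normalization steps above might look roughly like the sketch below. The exact preprocessing pipeline is not published, so every rule here (URL stripping, collapsing elongated characters, whitespace squeezing) is an illustrative assumption rather than the model's actual code; emoji and slang handling is omitted for brevity.

```python
import re

def normalize_comment(text: str) -> str:
    """Hypothetical Vietnamese comment normalizer (not the actual pipeline)."""
    text = text.lower().strip()
    text = re.sub(r"https?://\S+", " ", text)    # drop URLs
    text = re.sub(r"(.)\1{2,}", r"\1\1", text)   # collapse elongations: "hayyyy" -> "hayy"
    text = re.sub(r"\s+", " ", text).strip()     # squeeze whitespace
    return text

print(normalize_comment("Bài viết hayyyy  quá https://example.com"))
# -> "bài viết hayy quá"
```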
## Evaluation Results
Evaluation metrics on the ViHSD test set are listed in the Model Performance section above.
### Label Distribution
- **CLEAN (0)**: Normal content without offensive language
- **OFFENSIVE (1)**: Mildly offensive or inappropriate content
- **HATE (2)**: Hate speech, extremist language, severe threats
## Use Cases
- **Social Media Moderation**: Automatic detection of hate speech in Vietnamese social media platforms
- **Content Filtering**: Filtering offensive content in Vietnamese text
- **Research**: Studying hate speech patterns in Vietnamese online communities
## Limitations and Considerations
⚠️ **Important Limitations**:
- The model was trained primarily on social media data and may not generalize to formal text
- Performance may vary with slang, code-switching, or regional dialects
- The model reflects biases present in its training data
- It should be used as one component of a larger moderation system, not as the sole decision-maker
## Citation
If you use this model in your research, please cite:
```bibtex
@software{vihsd_bartpho,
  title  = {BARTpho for Vietnamese Hate Speech Detection},
  author = {ViSoLex Team},
  year   = {2024},
  url    = {https://huggingface.co/visolex/hate-speech-bartpho},
  note   = {Fine-tuned from vinai/bartpho-syllable-base}
}
```
## Contact & Support
- **GitHub**: [ViSoLex Hate Speech Detection](https://github.com/visolex/hate-speech-detection)
- **Issues**: [Report Issues](https://github.com/visolex/hate-speech-detection/issues)
- **Questions**: Open a discussion on the model's Hugging Face page
## License
This model is distributed under the MIT License.
## Acknowledgments
- Base model: [vinai/bartpho-syllable-base](https://huggingface.co/vinai/bartpho-syllable-base), trained by VinAI Research
- Dataset: ViHSD (Vietnamese Hate Speech Detection Dataset)
- Framework: [Hugging Face Transformers](https://huggingface.co/transformers)