MM Sentiment Intensity Model v1
This is a fine-tuned XLM-RoBERTa model for Myanmar sentiment analysis. It classifies text into five intensity levels, ranging from Very Negative to Very Positive.
Model Description
The model was trained on a custom Myanmar dataset curated for sentiment detection. A custom syllable-breaking preprocessing step segments Myanmar Unicode text before it is passed to the model.
- Developed by: Thuta Sann
- Model Type: Text Classification
- Language: Myanmar (Burmese)
- Base Model: FacebookAI/xlm-roberta-base
Classification Labels
| Label | Sentiment | Emoji |
|---|---|---|
| LABEL_0 | Very Negative | 🔴 |
| LABEL_1 | Negative | 🟠 |
| LABEL_2 | Neutral | 🟡 |
| LABEL_3 | Positive | 🟢 |
| LABEL_4 | Very Positive | 🔵 |
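The pipeline returns the raw `LABEL_n` identifiers, so a small lookup can turn a prediction into the sentiment name from the table above. The following is a minimal sketch: the names `LABEL_TO_SENTIMENT` and `describe_prediction` are hypothetical helpers and not part of the model or of `transformers`, and the input is assumed to be a single prediction dict with `label` and `score` keys, as returned by a text-classification pipeline.

```python
from typing import Dict, Union

# Mapping taken from the Classification Labels table.
LABEL_TO_SENTIMENT: Dict[str, str] = {
    "LABEL_0": "Very Negative",
    "LABEL_1": "Negative",
    "LABEL_2": "Neutral",
    "LABEL_3": "Positive",
    "LABEL_4": "Very Positive",
}


def describe_prediction(prediction: Dict[str, Union[str, float]]) -> str:
    """Return the sentiment name for one prediction dict.

    Unknown labels are returned unchanged rather than raising, so the
    helper stays usable if the label set ever differs from the table.
    """
    label = str(prediction["label"])
    return LABEL_TO_SENTIMENT.get(label, label)
```

For example, `describe_prediction({"label": "LABEL_3", "score": 0.91})` returns `"Positive"`.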
How to Use
The model expects syllable-segmented input, so apply the syllable-breaking step to raw text before passing it to the classifier.
```python
import re
from transformers import pipeline

def myanmar_sylbreak(line):
    # Insert a space before every syllable-initial consonant, and before each
    # Latin letter, digit, standalone Myanmar symbol, punctuation mark, or
    # whitespace character.
    pat = re.compile(r"((?<!္)[က-အ](?![်္])|[a-zA-Z0-9ဣဤဥဦဧဩဪဿ၌၍၏၀-၉၊။!-/:-@[-`{-~\s])")
    return pat.sub(r" \1", line).strip()

classifier = pipeline("text-classification", model="thutasann/mm_sentiment_model_v1")

raw_text = "ဒီနေ့ ရာသီဥတု အရမ်းသာယာတယ်"  # "The weather is very pleasant today"
segmented_text = myanmar_sylbreak(raw_text)
result = classifier(segmented_text)
print(result)
```
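When several texts need to be classified, the segmentation step can be wrapped so it is never skipped. The sketch below is one way to do that, under stated assumptions: `classify_batch` is a hypothetical helper, the `classifier` argument is assumed to be any callable that accepts a list of strings and returns one result dict per string (a `transformers` text-classification pipeline behaves this way), and the pattern is intended to be the same sylbreak-style pattern written with `\u` escapes so the source file stays ASCII-only.

```python
import re
from typing import Callable, Dict, List

# Sylbreak-style pattern written with \u escapes (assumed equivalent to the
# literal-character form):
#   - a consonant U+1000..U+1021 that is not preceded by the stacking sign
#     U+1039 and not followed by asat U+103A or U+1039 starts a new syllable;
#   - Latin letters, digits, independent vowels, Myanmar digits and
#     punctuation, ASCII punctuation, and whitespace are split off as well.
_SYL_PATTERN = re.compile(
    r"((?<!\u1039)[\u1000-\u1021](?![\u103a\u1039])"
    r"|[a-zA-Z0-9\u1023-\u1027\u1029\u102a\u103f"
    r"\u104c\u104d\u104f\u1040-\u1049\u104a\u104b"
    r"!-/:-@\[-`{-~\s])"
)


def myanmar_sylbreak(line: str) -> str:
    """Insert a space before each matched syllable start or symbol."""
    return _SYL_PATTERN.sub(r" \1", line).strip()


def classify_batch(
    texts: List[str],
    classifier: Callable[[List[str]], List[Dict]],
) -> List[Dict]:
    """Segment every text, then classify the whole batch in one call."""
    segmented = [myanmar_sylbreak(text) for text in texts]
    return classifier(segmented)


# Usage with the real model (downloads weights, so it is left commented out):
# from transformers import pipeline
# clf = pipeline("text-classification", model="thutasann/mm_sentiment_model_v1")
# results = classify_batch(["...", "..."], clf)
```

Passing the classifier in as an argument keeps the helper easy to test with a stub and avoids loading the model at import time.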
Training Data
This model was trained on the Myanmar Sentiment Intensity Dataset v1, which is based on research from the Myanmar NLP community.
Acknowledgments
- Base model by FacebookAI (XLM-RoBERTa).
- Syllable breaking logic based on sylbreak.
- Dataset inspiration from chuuhtetnaing/myanmar-text-segmentation-dataset.