MM Sentiment Intensity Model v1

This is a fine-tuned XLM-RoBERTa model for Myanmar sentiment analysis. It classifies text into 5 intensity levels, ranging from Very Negative to Very Positive.

Model Description

The model was trained on a custom Myanmar dataset curated specifically for sentiment detection. A custom syllable-breaking preprocessing step segments Myanmar Unicode text into syllables before it reaches the tokenizer.

  • Developed by: Thuta Sann
  • Model Type: Text Classification
  • Language: Myanmar (Burmese)
  • Base Model: FacebookAI/xlm-roberta-base

Classification Labels

Label     Sentiment       Emoji
LABEL_0   Very Negative   🔴
LABEL_1   Negative        🟠
LABEL_2   Neutral         🟡
LABEL_3   Positive        🟢
LABEL_4   Very Positive   🔵
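
The classifier reports the raw ids from the table above, so a small lookup helps when showing results to people. The following is a minimal sketch: `LABEL_NAMES` and `describe` are hypothetical helper names, not part of the model or of transformers, and the example prediction is hand-written rather than real model output.

```python
# Mapping copied from the label table above; the helper names are illustrative.
LABEL_NAMES = {
    "LABEL_0": "Very Negative",
    "LABEL_1": "Negative",
    "LABEL_2": "Neutral",
    "LABEL_3": "Positive",
    "LABEL_4": "Very Positive",
}

def describe(prediction):
    """Turn one prediction dict ({"label": ..., "score": ...}) into readable text."""
    # Fall back to the raw id if an unexpected label appears.
    name = LABEL_NAMES.get(prediction["label"], prediction["label"])
    return f"{name} ({prediction['score']:.2f})"

# Hand-written example in the shape a text-classification pipeline returns.
print(describe({"label": "LABEL_3", "score": 0.91}))  # prints: Positive (0.91)
```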

How to Use

Input text must go through the syllable-breaking step before it is passed to the model:

import re
from transformers import pipeline

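def myanmar_sylbreak(line):
    # Insert a space before each syllable start: a consonant that is neither
    # stacked (preceded by the virama U+1039) nor a final (followed by the asat
    # U+103A or the virama), or any Latin letter, digit, independent vowel,
    # Myanmar symbol or digit, punctuation mark, or whitespace character.
    pat = re.compile(r"((?<!္)[က-အ](?![်္])|[a-zA-Z0-9ဣဤဥဦဧဩဪဿ၌၍၏၀-၉၊။!-/:-@[-`{-~\s])")
    return pat.sub(r" \1", line).strip()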

classifier = pipeline("text-classification", model="thutasann/mm_sentiment_model_v1")

raw_text = "ဒီနေ့ ရာသီဥတု အရမ်းသာယာတယ်"  # "The weather is very pleasant today."
segmented_text = myanmar_sylbreak(raw_text)
result = classifier(segmented_text)
print(result)  # a list with one dict holding the predicted "label" and its "score"
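
To see what the segmentation step does on its own, without downloading the model, the sketch below runs only the syllable breaker. It assumes the pattern is equivalent to the one in the snippet above, and it is written with `\u` escapes so that it survives copy-paste through tools that mangle Myanmar characters. The constant name `SYLLABLE_START` is illustrative.

```python
import re

# Assumed equivalent to the pattern in the usage snippet, spelled with \u escapes.
SYLLABLE_START = re.compile(
    r"((?<!\u1039)[\u1000-\u1021](?![\u103a\u1039])"  # consonant, not stacked, not a final
    r"|[a-zA-Z0-9\u1023-\u1027\u1029\u102a\u103f"      # Latin letters, digits, independent vowels
    r"\u104c\u104d\u104f\u1040-\u1049\u104a\u104b"      # Myanmar symbols, digits, punctuation
    r"!-/:-@[-`{-~\s])"                                 # ASCII punctuation and whitespace
)

def myanmar_sylbreak(line):
    """Insert a space before every syllable start and trim the ends."""
    return SYLLABLE_START.sub(r" \1", line).strip()

# The sample sentence from the usage snippet ("The weather is very pleasant today.").
sample = "ဒီနေ့ ရာသီဥတု အရမ်းသာယာတယ်"
syllables = myanmar_sylbreak(sample).split()
print(len(syllables), syllables)
```

Because whitespace also matches the pattern, every space in the input gains an extra space in front of it; `split()` is used here only to list the syllables, while the usage snippet passes the segmented string to the classifier unchanged.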

Training Data

This model was trained on the Myanmar Sentiment Intensity Dataset v1, which is based on research from the Myanmar NLP community.

Acknowledgments
