MM Sentiment Intensity Model v1

This model is XLM-RoBERTa fine-tuned for Myanmar (Burmese) sentiment analysis. It classifies text into five intensity levels, from Very Negative to Very Positive.

Model Description

The model was trained on a custom Myanmar dataset curated for sentiment detection. Myanmar Unicode text is split into syllables by a custom preprocessing step, and the same step must be applied to input text at inference time.

Developed by: Thuta Sann
Model Type: Text Classification
Language: Myanmar (Burmese)
Base Model: FacebookAI/xlm-roberta-base

Classification Labels

Label	Sentiment	Emoji
LABEL_0	Very Negative	🔴
LABEL_1	Negative	🟠
LABEL_2	Neutral	🟡
LABEL_3	Positive	🟢
LABEL_4	Very Positive	🔵
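
The pipeline reports the raw `LABEL_n` identifiers, so it can help to translate them back into the names in the table above. The following is a minimal sketch: `LABEL_NAMES` and `readable` are hypothetical helper names, not part of the model or of `transformers`, and the input is assumed to be one prediction dict with `label` and `score` keys, as returned by a text-classification pipeline.

```python
# Hypothetical helper: maps raw label ids to the sentiment names from the table.
LABEL_NAMES = {
    "LABEL_0": "Very Negative",
    "LABEL_1": "Negative",
    "LABEL_2": "Neutral",
    "LABEL_3": "Positive",
    "LABEL_4": "Very Positive",
}


def readable(prediction):
    """Return a copy of one prediction dict with a human-readable sentiment name."""
    label = prediction["label"]
    return {
        # Fall back to the raw label if it is not one of the five known ids.
        "sentiment": LABEL_NAMES.get(label, label),
        "score": prediction["score"],
    }
```

For example, `readable({"label": "LABEL_3", "score": 0.91})` yields a dict whose `sentiment` is `"Positive"`.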

How to Use

Apply the syllable-breaking function below to the input text before passing it to the model, so that inference uses the same preprocessing as training.

import re
from transformers import pipeline

def myanmar_sylbreak(line):
    """Insert a space before each syllable-initial character so the text is split into syllables."""
    pat = re.compile(r"((?<!္)[က-အ](?![်္])|[a-zA-Z0-9ဣဤဥဦဧဩဪဿ၌၍၏၀-၉၊။!-/:-@[-`{-~\s])")
    return pat.sub(r" \1", line).strip()

classifier = pipeline("text-classification", model="thutasann/mm_sentiment_model_v1")

raw_text = "ဒီနေ့ ရာသီဥတု အရမ်းသာယာတယ်"
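segmented_text = myanmar_sylbreak(raw_text)  # same preprocessing as in training
result = classifier(segmented_text)
print(result)  # a list with one dict holding "label" (LABEL_0 to LABEL_4) and "score"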
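
When several texts need to be classified, the two steps above (segment, then classify) can be wrapped in one helper. This is only a sketch under stated assumptions: `classify_texts` is a hypothetical name, the segmenter and classifier are passed in as callables (for example `myanmar_sylbreak` and the `classifier` pipeline from the snippet above), and the classifier is assumed to accept a list of strings and return one prediction dict per input.

```python
def classify_texts(texts, segment, classifier):
    """Segment each text, classify the whole batch, and pair inputs with predictions.

    texts      -- list of raw input strings
    segment    -- callable applied to each text (e.g. myanmar_sylbreak)
    classifier -- callable taking a list of strings and returning a list of
                  dicts with "label" and "score" keys
    """
    segmented = [segment(text) for text in texts]
    predictions = classifier(segmented)
    # Keep the raw and segmented text next to each prediction for inspection.
    return [
        {"text": text, "segmented": seg, **pred}
        for text, seg, pred in zip(texts, segmented, predictions)
    ]
```

With the objects from the snippet above, a call would look like `classify_texts([raw_text], myanmar_sylbreak, classifier)`. Passing the segmenter explicitly makes it harder to skip the preprocessing step by accident.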

Training Data

This model was trained on the Myanmar Sentiment Intensity Dataset v1, which is based on research from the Myanmar NLP community.

Acknowledgments

Base model by FacebookAI (XLM-RoBERTa).
Syllable breaking logic based on sylbreak.
Dataset inspiration from chuuhtetnaing/myanmar-text-segmentation-dataset.

Safetensors

Model size: 0.3B params
Tensor type: F32
