---
license: cc-by-4.0
tags:
- sentiment-classification
- telugu
- xlm-r
- multilingual
- baseline
language: te
datasets:
- DSL-13-SRMAP/TeSent_Benchmark-Dataset
model_name: XLM-R_WR
---

# XLM-R_WR: XLM-RoBERTa Telugu Sentiment Classification Model (With Rationale)

## Model Overview

**XLM-R_WR** is a Telugu sentiment classification model based on **XLM-RoBERTa (XLM-R)**, a general-purpose multilingual transformer developed by Facebook AI.
The "WR" in the model name stands for "**With Rationale**", indicating that this model is trained using both sentiment labels and **human-annotated rationales** from the TeSent_Benchmark-Dataset.

---

## Model Details

- **Architecture:** XLM-RoBERTa (transformer-based, multilingual)
- **Pretraining Data:** 2.5 TB of filtered CommonCrawl data covering 100 languages, including Telugu
- **Pretraining Objective:** Masked Language Modeling (MLM) only, with no Next Sentence Prediction (NSP)
- **Fine-tuning Data:** [TeSent_Benchmark-Dataset](https://huggingface.co/datasets/dsl-13-srmap/tesent_benchmark-dataset), using both sentence-level sentiment labels and rationale annotations
- **Task:** Sentence-level sentiment classification (3-way)
- **Rationale Usage:** Human-annotated rationales are **used** during training ("WR" = With Rationale)

---

## Intended Use

- **Primary Use:** Benchmarking Telugu sentiment classification on the TeSent_Benchmark-Dataset, especially as a **baseline** for comparing models trained with and without rationales
- **Research Setting:** Suitable for cross-lingual and multilingual NLP research, as well as explainable AI in low-resource settings

---
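A minimal inference sketch with the `transformers` library is shown below. The repository id `DSL-13-SRMAP/XLM-R_WR` and the label mapping are assumptions for illustration, not confirmed by this card; substitute the actual model repo when it is published.

```python
# Hypothetical usage sketch: repo id and label names are assumptions.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_id = "DSL-13-SRMAP/XLM-R_WR"  # assumed repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

text = "..."  # a Telugu sentence
inputs = tokenizer(text, return_tensors="pt", truncation=True)
with torch.no_grad():
    logits = model(**inputs).logits  # shape: (1, 3) for 3-way sentiment

# Pick the highest-scoring class and map it to its label name
pred = logits.argmax(dim=-1).item()
print(model.config.id2label[pred])
```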
## Why XLM-R?

XLM-R is designed for cross-lingual understanding and contextual modeling, providing strong transfer learning capabilities and improved downstream performance compared to mBERT. When fine-tuned on local Telugu data, XLM-R delivers solid results for sentiment analysis.
However, Telugu-specific models such as MuRIL or L3Cube-Telugu-BERT may offer better cultural and linguistic alignment for purely Telugu tasks.

---

## Performance and Limitations

**Strengths:**
- Strong transfer learning and contextual modeling for multilingual NLP
- Good performance on Telugu sentiment analysis when fine-tuned with local data
- Provides **explicit rationales** for predictions, aiding explainability
- Useful as a cross-lingual and multilingual baseline

**Limitations:**
- May be outperformed by Telugu-specific models on culturally nuanced tasks
- Requires sufficient labeled Telugu data and rationale annotations for best performance

---

## Training Data

- **Dataset:** [TeSent_Benchmark-Dataset](https://huggingface.co/datasets/dsl-13-srmap/tesent_benchmark-dataset)
- **Columns Used:** **Content** (the Telugu sentence), **Label** (the sentiment label), and **Rationale** (the human-annotated rationale) are all used for XLM-R_WR training

---
## Language Coverage

- **Language:** Telugu (`te`)
- **Model Scope:** This implementation and evaluation focus strictly on Telugu sentiment classification

---

## Citation and More Details

For the detailed experimental setup, evaluation metrics, and comparisons with rationale-based models, **please refer to our paper**.

---

## License

Released under [CC BY 4.0](LICENSE).