---
license: cc-by-4.0
datasets:
- DSL-13-SRMAP/Telugu-Dataset
language:
- te
tags:
- sentiment-analysis
- text-classification
- telugu
- indic-languages
- muril
- rationale-supervision
- explainable-ai
base_model: google/muril-base-cased
pipeline_tag: text-classification
metrics:
- accuracy
- f1
- auroc
---

# MuRIL_WR

## Model Description

**MuRIL_WR** is a Telugu sentiment classification model built on **MuRIL (Multilingual Representations for Indian Languages)**, a BERT-based Transformer designed specifically for **Indian languages**, including Telugu and English. MuRIL is pretrained on a **large and diverse corpus of Indian-language text**, including web data, religious scriptures, and news content. In contrast to general multilingual models such as mBERT and XLM-R, MuRIL is better suited to capturing **Telugu morphology, syntax, and linguistic structure**.

The suffix **WR** denotes **With Rationale** supervision. This model is fine-tuned on both **sentiment labels and human-annotated rationales**, improving the alignment between model predictions and human-identified evidence.

---

## Pretraining Details

- **Pretraining corpus:** Indian-language text from web sources, religious texts, and news data
- **Training objectives:**
  - Masked Language Modeling (MLM)
  - Translation Language Modeling (TLM)
- **Language coverage:** 17+ Indian languages, including Telugu and English

---

## Training Data

- **Fine-tuning dataset:** Telugu-Dataset
- **Task:** Sentiment classification
- **Supervision type:** Label + rationale supervision
- **Rationales:** Token-level human-annotated evidence spans

---

## Rationale Supervision

During fine-tuning, **human-provided rationales** guide model learning. Alongside the standard classification loss, an **auxiliary rationale loss** encourages the model's attention or explanation scores to align with the annotated rationale tokens.
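The combined objective can be sketched as follows. This is a minimal illustration rather than the exact training code: it assumes per-token explanation scores in [0, 1], a binary rationale mask, and a hypothetical weighting coefficient `lambda_r` combining cross-entropy classification loss with a binary cross-entropy rationale loss.

```python
import math

def classification_loss(probs, gold_label):
    """Standard cross-entropy over predicted class probabilities."""
    return -math.log(probs[gold_label])

def rationale_loss(token_scores, rationale_mask):
    """Binary cross-entropy pushing per-token explanation scores
    toward the human-annotated rationale mask (1 = evidence token)."""
    eps = 1e-9
    losses = [
        -(m * math.log(s + eps) + (1 - m) * math.log(1 - s + eps))
        for s, m in zip(token_scores, rationale_mask)
    ]
    return sum(losses) / len(losses)

def combined_loss(probs, gold_label, token_scores, rationale_mask, lambda_r=0.5):
    """Label loss plus the auxiliary rationale-alignment loss."""
    return (classification_loss(probs, gold_label)
            + lambda_r * rationale_loss(token_scores, rationale_mask))

# Toy example: 3-class sentiment, 4 tokens, two annotated rationale tokens.
loss = combined_loss(
    probs=[0.1, 0.8, 0.1], gold_label=1,
    token_scores=[0.9, 0.2, 0.7, 0.1],
    rationale_mask=[1, 0, 1, 0],
)
```

Because the rationale term is an auxiliary loss, `lambda_r` trades off explanation alignment against raw classification accuracy.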
This approach improves:

- Interpretability of sentiment predictions
- Alignment between model explanations and human judgment
- Plausibility of generated explanations

---

## Intended Use

This model is intended for:

- Explainable Telugu sentiment classification
- Rationale-supervised learning experiments
- Indian-language explainability research
- Comparative evaluation against label-only (WOR, Without Rationale) baselines

MuRIL_WR is particularly effective on **informal, conversational, and social-media Telugu text**, where rationale supervision further improves explanation quality.

---

## Performance Characteristics

Compared with label-only training, rationale supervision typically improves **explanation plausibility** while maintaining competitive sentiment classification performance.

### Strengths

- Strong Telugu-specific linguistic modeling
- Human-aligned explanations via rationale supervision
- Suitable for explainable-AI benchmarking in Indian languages

### Limitations

- Requires human-annotated rationales, increasing annotation effort
- Pretraining-data bias toward informal text may affect formal Telugu tasks
- Classification gains over WOR may be modest

---

## Use in Explainability Evaluation

**MuRIL_WR** is well suited to evaluation with explanation frameworks such as FERRET, enabling:

- **Faithfulness evaluation:** how well explanations support model predictions
- **Plausibility evaluation:** how closely explanations align with human rationales

---

## References

- Khanuja et al., 2021
- Joshi, 2022
- Das et al., 2022
- Rajalakshmi et al., 2023
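The plausibility evaluation described above is commonly scored with token-level overlap metrics such as token F1 between a binarized explanation and the human rationale mask. A minimal sketch, where the `0.5` threshold and the example inputs are illustrative assumptions rather than values used in this model's evaluation:

```python
def token_f1(predicted_mask, gold_mask):
    """Token-level F1 between a binarized explanation and the
    human-annotated rationale mask (1 = rationale token)."""
    tp = sum(1 for p, g in zip(predicted_mask, gold_mask) if p and g)
    fp = sum(1 for p, g in zip(predicted_mask, gold_mask) if p and not g)
    fn = sum(1 for p, g in zip(predicted_mask, gold_mask) if not p and g)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Binarize continuous explanation scores at an illustrative threshold.
scores = [0.9, 0.2, 0.7, 0.1]
predicted = [int(s >= 0.5) for s in scores]
gold = [1, 0, 1, 1]
f1 = token_f1(predicted, gold)
```

A higher token F1 against the annotated rationales indicates more plausible explanations; faithfulness, by contrast, is typically measured by perturbing the highlighted tokens and observing the effect on the prediction.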