---
language: as
tags:
  - sentiment-analysis
  - assamese
  - transformers
  - text-classification
license: apache-2.0
datasets:
  - None
model-index:
  - name: assamese-sentiment-analysis
    results: []
---


# 🌟 Assamese Sentiment Analysis with LSTM  
**Tags:** `#text-classification` `#sentiment-analysis` `#Assamese` `#LSTM`

> A deep learning-powered tool to classify Assamese text as **Positive**, **Negative**, or **Neutral** using an LSTM model tailored for the Assamese language.  

---

## 🚀 Key Features

- 🔍 **Sentiment Analysis for Assamese** – Classifies Assamese text as Positive, Neutral, or Negative  
- 🧠 **Deep Learning Backbone** – Powered by TensorFlow/Keras with a Long Short-Term Memory (LSTM) network  
- ✨ **Advanced Preprocessing** – Includes tokenization, text cleaning, optional stemming, and stopword removal  
- 🧰 **Custom Tokenization** – Leverages [AssameseTokenizer](https://github.com/KashyapKishore/AssameseTokenizer.git) for accurate language handling  
- 📈 **Robust Evaluation Metrics** – F1-score, precision, recall, and accuracy  
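
The evaluation metrics listed above can be computed per class and then macro-averaged over the three sentiment classes. The following is a minimal sketch in plain Python, not the evaluation code used for this model; the function name `classification_metrics` and the sample labels are illustrative assumptions.

```python
from collections import Counter

LABELS = ["Positive", "Neutral", "Negative"]  # class names from this card


def classification_metrics(y_true, y_pred, labels=LABELS):
    """Return accuracy and macro-averaged precision, recall, and F1."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and the same length")

    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # the predicted class was claimed wrongly
            fn[truth] += 1  # the true class was missed

    def safe_div(num, den):
        # Treat an undefined ratio (no predictions or no examples) as 0.0.
        return num / den if den else 0.0

    precision = recall = f1 = 0.0
    for label in labels:
        p = safe_div(tp[label], tp[label] + fp[label])
        r = safe_div(tp[label], tp[label] + fn[label])
        precision += p
        recall += r
        f1 += safe_div(2 * p * r, p + r)

    n = len(labels)
    return {
        "accuracy": sum(tp.values()) / len(y_true),
        "precision": precision / n,
        "recall": recall / n,
        "f1": f1 / n,
    }


# Illustrative labels only; these are not results from this model.
y_true = ["Positive", "Neutral", "Negative", "Positive"]
y_pred = ["Positive", "Neutral", "Positive", "Positive"]
print(classification_metrics(y_true, y_pred))
```

Macro-averaging weights every class equally, which matters when one sentiment class is much rarer than the others. If scikit-learn is available, `precision_recall_fscore_support` with `average="macro"` reports the same kind of averages.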

---

## 🧠 Model Overview

| Property            | Details                                         |
|---------------------|--------------------------------------------------|
| **Model Name**      | `pratyushee/assamese-sentiment-analysis`         |
| **Architecture**    | Pretrained LSTM-based neural network             |
| **Language**        | Assamese (অসমীয়া)                               |
| **Classes**         | 3 – Positive, Neutral, Negative                  |
| **Use Cases**       | Customer feedback, social media monitoring, opinion mining |

---

## 🧪 Installation & Requirements

After cloning this repository, install the requirements from its root directory:

```bash
pip install -r requirements.txt
```

Install the custom Assamese tokenizer:

```bash
git clone https://github.com/KashyapKishore/AssameseTokenizer.git
cd AssameseTokenizer
pip install .
```

---

## ⚙️ Model Description
This model was trained on Assamese text using a custom tokenizer designed for the Assamese script. Its LSTM architecture processes tokens in order, which makes it well suited to capturing the sequence and context that sentiment classification depends on.

### 📚 Training Data

The dataset was curated from public sources such as news articles, social media comments, and feedback forms, and was manually labeled into three sentiment classes: Positive, Neutral, and Negative.

### 🏋️ Training Procedure

- ✂️ **Preprocessing** – Text cleaning, tokenization with AssameseTokenizer, optional stemming and stopword removal
- 🔢 **Input Handling** – Sequences padded or truncated to a fixed length of 512 tokens
- 🧠 **Architecture** – Embedding layer → LSTM → Dense (Softmax)
- 💧 **Regularization** – Dropout layers to prevent overfitting
- ⚙️ **Optimizer** – Adam
- 🔁 **Epochs** – X (number of epochs not yet reported)
- 📊 **Evaluation** – Final validation accuracy and F1-score not yet reported
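
The fixed-length input handling described above can be sketched in a few lines. This is a minimal illustration, not the project's actual preprocessing code: the helper name `pad_or_truncate`, the padding id `0`, and the choice to pad and cut at the end of the sequence are all assumptions, since the card does not say which side is used.

```python
MAX_LEN = 512  # fixed sequence length stated in this card
PAD_ID = 0     # assumption: index 0 is reserved for padding


def pad_or_truncate(token_ids, max_len=MAX_LEN, pad_id=PAD_ID):
    """Return a list of exactly max_len ids.

    Longer inputs lose their tail; shorter inputs are padded at the end.
    """
    clipped = list(token_ids)[:max_len]
    return clipped + [pad_id] * (max_len - len(clipped))


# Hypothetical token ids, standing in for tokenizer output.
short = pad_or_truncate([5, 6, 7])
long = pad_or_truncate(range(1, 601))
print(len(short), len(long))
```

Keras users would typically reach for `pad_sequences` instead; note that its defaults pad and truncate at the start of the sequence (`padding="pre"`, `truncating="pre"`), so the side must match whatever was used during training.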

---

## 📦 Intended Usage
Ideal for:

- 🗨️ Social media sentiment tracking in Assamese

- 📢 Public opinion & brand monitoring

- 📚 Research on low-resource NLP in Indic languages

⚠️ Limitations / not recommended for:

- Code-mixed Assamese-English input

- Domain-specific texts (e.g., legal, medical) without additional fine-tuning

---

## 🧪 Quickstart: Using the Model

Load and run the model with the Hugging Face `transformers` pipeline:

```python
from transformers import pipeline

model_name = "pratyushee/assamese-sentiment-analysis"
pipe = pipeline("text-classification", model=model_name, tokenizer=model_name)

result = pipe("এই খাবাৰটা একদম ভালো আছিল!")  # Sample Assamese sentence
print(result)
```
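
The `text-classification` pipeline returns a list of dictionaries with `label` and `score` keys. As a small sketch of how that output might be consumed, the helper below picks the highest-scoring entry. The name `top_prediction` is hypothetical, and the example scores and label strings are made up for illustration; the actual label strings depend on the model's configuration.

```python
def top_prediction(scores):
    """Return (label, score) for the highest-scoring entry.

    `scores` is a list of {"label": str, "score": float} dicts,
    the shape produced by a text-classification pipeline.
    """
    if not scores:
        raise ValueError("scores must not be empty")
    best = max(scores, key=lambda item: item["score"])
    return best["label"], best["score"]


# Hypothetical scores shaped like pipeline output; not produced by this model.
example = [
    {"label": "Positive", "score": 0.71},
    {"label": "Neutral", "score": 0.21},
    {"label": "Negative", "score": 0.08},
]
label, score = top_prediction(example)
print(f"{label} ({score:.2f})")
```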

---

## 📚 Reference Citations

- E. Grave, P. Bojanowski, P. Gupta, A. Joulin, T. Mikolov. [Learning Word Vectors for 157 Languages](https://arxiv.org/abs/1802.06893)
- [Assamese Tokenizer](https://github.com/KashyapKishore/AssameseTokenizer.git)

---
## 🤝 In Collaboration with

- [Angshita Kashyap](https://huggingface.co/angshita)
- [Dhiraj Ballav Saikia](https://huggingface.co/dhiraj04)
- [Niharika Nath](https://huggingface.co/niharikanath)