---
title: Tweets Sentiment Analyzer
emoji: 🚀
colorFrom: indigo
colorTo: gray
sdk: gradio
sdk_version: 5.34.2
app_file: app.py
pinned: false
license: mit
short_description: Real-Time Tweet Sentiment Analyzer
---
# 🧠 Sentiment Analysis from Scratch (BiLSTM + Attention)
Welcome to this live interactive demo of a sentiment analysis model trained entirely from scratch using a **deep Bidirectional LSTM** architecture enhanced with a **custom attention mechanism**. The model classifies short texts or tweets as **Positive** or **Negative** and reports a confidence score.
---
## 📌 Project Highlights
- ✅ **Trained from scratch**: The embedding layer is trained on the dataset itself (not using pretrained embeddings).
- 🧠 **Model Architecture**:
  - Bidirectional LSTM layers
  - Custom attention layer (`BetterAttention`)
  - Final dense ANN for binary classification
- 📊 **Output**: Label (Positive/Negative) and confidence score (0–1)
- 🔠 **Tokenizer**: Also trained from scratch and saved as `tokenizer.joblib`
- 📁 **Model Format**: Saved as `.keras` and loaded efficiently during inference
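The exact `BetterAttention` implementation isn't shown in this Space, but the core idea of attention pooling over BiLSTM timesteps can be sketched in plain NumPy. The shapes and the single score-weight vector below are illustrative assumptions, not the model's actual weights:

```python
import numpy as np

def attention_pool(hidden, w):
    """Collapse a (timesteps, features) BiLSTM output into one (features,)
    vector via a softmax-weighted sum over timesteps (additive-attention sketch)."""
    scores = hidden @ w                       # one score per timestep
    weights = np.exp(scores - scores.max())   # numerically stable softmax
    weights /= weights.sum()
    return weights @ hidden                   # convex combination of timesteps
```

Because the softmax weights sum to 1, the pooled vector is a convex combination of the per-timestep outputs; in the real model the score weights are learned jointly with the rest of the network.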
---
## 🚀 Try It Out
Enter a tweet or a short sentence below to see a real-time prediction:
👉 *Example*:
`"I absolutely loved the performance!"`
**Output**: Positive (0.91)
---
## 🛠 Model Files
You can also explore/download the trained artifacts here:
- [`sentiment_model.keras`](https://huggingface.co/MasterShomya/Sentiment_Analysis-Tweets/blob/main/sentiment_model.keras)
- [`tokenizer.joblib`](https://huggingface.co/MasterShomya/Sentiment_Analysis-Tweets/blob/main/tokenizer.joblib)
---
## 🧪 How It Works
1. The input text is tokenized with the trained tokenizer (loaded from `tokenizer.joblib`).
2. The padded sequence is passed through:
- `Embedding → BiLSTM → BiLSTM → Attention → Dense Layers`
3. The final sigmoid-activated output represents the **probability of positivity**.
4. A confidence-aware label is returned using Gradio's `Label` component.
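Steps 1–4 can be sketched as a small inference helper. The tokenizer/model interfaces and the `maxlen` value below are assumptions based on the description above, not the Space's actual code:

```python
def pad_sequence(ids, maxlen, pad_id=0):
    """Left-pad or left-truncate token ids to a fixed length
    (mirrors Keras pad_sequences' defaults: padding='pre', truncating='pre')."""
    ids = ids[-maxlen:]
    return [pad_id] * (maxlen - len(ids)) + ids

def predict_sentiment(text, tokenizer, model, maxlen=60):
    """Tokenize, pad, run the model, and format the sigmoid output as the
    {label: confidence} dict that Gradio's Label component expects."""
    ids = tokenizer.texts_to_sequences([text])[0]   # Keras-style tokenizer API
    batch = [pad_sequence(ids, maxlen)]
    p_positive = float(model.predict(batch)[0][0])  # sigmoid output in [0, 1]
    return {"Positive": p_positive, "Negative": 1.0 - p_positive}
```

Returning a dict of class → confidence lets Gradio's `Label` component render both classes with their scores rather than a single hard prediction.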
---
## 📈 Model Performance
Despite being trained from scratch without pretrained embeddings (such as GloVe or fastText), the model performs competitively. Experiments with `glove.27B.200d` embeddings yielded **similar accuracy**, so they were omitted to keep the pipeline self-contained.
Training plots and confusion matrix are available in the original [Kaggle Notebook](https://www.kaggle.com/code/mastershomya/sentiment-analysis-deep-bilstm).
---
## 🧑‍💻 Author
**Shomya Soneji**
Machine Learning & Deep Learning Enthusiast
Connect on [Kaggle](https://www.kaggle.com/mastershomya)
---
## 🀝 Support
If you find this project helpful, please consider giving it a 🌟 and sharing it!
Your feedback and suggestions are always welcome πŸ’¬