---
title: Tweets Sentiment Analyzer
emoji: 🚀
colorFrom: indigo
colorTo: gray
sdk: gradio
sdk_version: 5.34.2
app_file: app.py
pinned: false
license: mit
short_description: Real-Time Tweet Sentiment Analyzer
---

# 🧠 Sentiment Analysis from Scratch (BiLSTM + Attention)

Welcome to this live interactive demo of a sentiment analysis model trained entirely from scratch, using a **deep bidirectional LSTM** architecture enhanced with a **custom attention mechanism**. The model classifies short texts or tweets as **Positive** or **Negative** and reports a confidence score.

---

## 📌 Project Highlights

- ✅ **Trained from scratch**: the embedding layer is learned on the dataset itself (no pretrained embeddings).
- 🧠 **Model architecture**:
  - Bidirectional LSTM layers
  - Custom attention layer (`BetterAttention`)
  - Final dense layers for binary classification
- 📊 **Output**: label (Positive/Negative) with a confidence score (0–1)
- 🔠 **Tokenizer**: also trained from scratch and saved as `tokenizer.joblib`
- 📁 **Model format**: saved as `.keras` and loaded efficiently at inference time

---

## 🚀 Try It Out

Enter a tweet or short sentence and see the real-time prediction:

👉 *Example*: `"I absolutely loved the performance!"`
**Output**: Positive (0.91)

---

## 🛠 Model Files

You can also explore or download the trained artifacts here:

- [`sentiment_model.keras`](https://huggingface.co/MasterShomya/Sentiment_Analysis-Tweets/blob/main/sentiment_model.keras)
- [`tokenizer.joblib`](https://huggingface.co/MasterShomya/Sentiment_Analysis-Tweets/blob/main/tokenizer.joblib)

---

## 🧪 How It Works

1. The input text is tokenized with the trained tokenizer (loaded via `joblib`).
2. The padded sequence is passed through:
   - `Embedding → BiLSTM → BiLSTM → Attention → Dense layers`
3. The final sigmoid-activated output is the **probability that the sentiment is positive**.
4. A confidence-aware label is returned via Gradio's `Label` component.
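
The exact implementation of the custom `BetterAttention` layer is not shown in this README; the sketch below illustrates the general additive-attention idea such a layer typically implements, in plain Python. All names, shapes, and values here are illustrative assumptions, not the project's actual code:

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of scores
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def additive_attention(H, W, b, v):
    """Illustrative additive attention over BiLSTM outputs.

    H: list of T hidden vectors (one per timestep), each of length D
    W: D x D weight matrix; b: bias of length D; v: scoring vector of length D
    Returns (context, weights): a length-D context vector and T weights summing to 1.
    """
    scores = []
    for h in H:
        # u = tanh(W @ h + b), then score = v . u
        u = [math.tanh(sum(W[i][j] * h[j] for j in range(len(h))) + b[i])
             for i in range(len(b))]
        scores.append(sum(vi * ui for vi, ui in zip(v, u)))
    weights = softmax(scores)
    D = len(H[0])
    # Context vector = attention-weighted sum of the timestep outputs
    context = [sum(w * h[d] for w, h in zip(weights, H)) for d in range(D)]
    return context, weights

# Tiny example: 3 timesteps, hidden size 2
H = [[0.1, 0.2], [0.4, -0.3], [0.0, 0.5]]
W = [[1.0, 0.0], [0.0, 1.0]]
b = [0.0, 0.0]
v = [0.5, -0.5]
context, weights = additive_attention(H, W, b, v)
```

In the actual model, these attention parameters are learned end-to-end along with the rest of the network.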
---

## 📈 Model Performance

Even without pretrained embeddings (such as GloVe or FastText), the from-scratch model performs on par with pretrained baselines: experiments with `glove.27B.200d` embeddings yielded **similar accuracy**, so the pretrained variant was dropped to keep the project simple. Training plots and the confusion matrix are available in the original [Kaggle Notebook](https://www.kaggle.com/code/mastershomya/sentiment-analysis-deep-bilstm).

---

## 🧑‍💻 Author

**Shomya Soneji**
Machine Learning & Deep Learning Enthusiast
Connect on [Kaggle](https://www.kaggle.com/mastershomya)

---

## 🤝 Support

If you find this project helpful, please consider giving it a 🌟 and sharing it! Your feedback and suggestions are always welcome 💬
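
---

## 📎 Appendix: Tokenize-and-Pad Sketch

For reference, the tokenize-and-pad step described in "How It Works" behaves roughly like this pure-Python sketch, which mimics the defaults of a Keras `Tokenizer` plus `pad_sequences`. The vocabulary and maximum length here are illustrative assumptions; the real tokenizer is loaded from `tokenizer.joblib`:

```python
def texts_to_padded(texts, word_index, maxlen, oov_id=1):
    # Map words to integer ids (unknown words -> oov_id), then left-pad
    # with zeros to a fixed length, as Keras pad_sequences does by default
    out = []
    for text in texts:
        ids = [word_index.get(w, oov_id) for w in text.lower().split()]
        ids = ids[:maxlen]
        out.append([0] * (maxlen - len(ids)) + ids)
    return out

# Illustrative vocabulary, not the trained tokenizer's actual word index
word_index = {"i": 2, "loved": 3, "the": 4, "performance": 5}
padded = texts_to_padded(["I absolutely loved the performance"], word_index, maxlen=8)
# "absolutely" is out of vocabulary here, so it maps to oov_id = 1
```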