---
title: Tweets Sentiment Analyzer
emoji: 🚀
colorFrom: indigo
colorTo: gray
sdk: gradio
sdk_version: 5.34.2
app_file: app.py
pinned: false
license: mit
short_description: Real-Time Tweet Sentiment Analyzer
---
# 🧠 Sentiment Analysis from Scratch (BiLSTM + Attention)
Welcome to this live interactive demo of a sentiment analysis model trained entirely from scratch using a **deep Bidirectional LSTM** architecture enhanced with a **custom attention mechanism**. The model classifies short texts or tweets as **Positive** or **Negative** and reports a confidence score.
---
## 📌 Project Highlights
- ✅ **Trained from scratch**: The embedding layer is trained on the dataset itself (not using pretrained embeddings).
- 🧠 **Model Architecture**:
  - Bidirectional LSTM layers
  - Custom attention layer (`BetterAttention`)
  - Final dense ANN for binary classification
- 📊 **Output**: Label (Positive/Negative) and confidence score (0–1)
- 🔠 **Tokenizer**: Also trained from scratch and saved as `tokenizer.joblib`
- 📁 **Model Format**: Saved as `.keras` and loaded efficiently during inference
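The exact `BetterAttention` implementation isn't shown in this Space, but the core idea of attention pooling over BiLSTM timesteps can be sketched in plain NumPy. The shapes and the single score-weight vector below are illustrative assumptions, not the model's actual weights:

```python
import numpy as np

def attention_pool(hidden, w):
    """Collapse a (timesteps, features) BiLSTM output into one (features,)
    vector via a softmax-weighted sum over timesteps (additive-attention sketch)."""
    scores = hidden @ w                       # one score per timestep
    weights = np.exp(scores - scores.max())   # numerically stable softmax
    weights /= weights.sum()
    return weights @ hidden                   # convex combination of timesteps
```

Because the softmax weights sum to 1, the pooled vector is a convex combination of the per-timestep outputs; in the real model the score weights are learned jointly with the rest of the network.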
---
## 🚀 Try It Out
Enter a tweet or a short sentence below to see a real-time prediction:
👉 *Example*:
`"I absolutely loved the performance!"`
**Output**: Positive (0.91)
---
## 🛠 Model Files
You can also explore/download the trained artifacts here:
- [`sentiment_model.keras`](https://huggingface.co/MasterShomya/Sentiment_Analysis-Tweets/blob/main/sentiment_model.keras)
- [`tokenizer.joblib`](https://huggingface.co/MasterShomya/Sentiment_Analysis-Tweets/blob/main/tokenizer.joblib)
---
## 🧪 How It Works
1. The input text is tokenized with the trained tokenizer (loaded from `tokenizer.joblib`).
2. The padded sequence is passed through:
- `Embedding → BiLSTM → BiLSTM → Attention → Dense Layers`
3. The final sigmoid-activated output represents the **probability of positivity**.
4. A confidence-aware label is returned using Gradio's `Label` component.
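Steps 1–4 can be sketched as a small inference helper. The tokenizer/model interfaces and the `maxlen` value below are assumptions based on the description above, not the Space's actual code:

```python
def pad_sequence(ids, maxlen, pad_id=0):
    """Left-pad or left-truncate token ids to a fixed length
    (mirrors Keras pad_sequences' defaults: padding='pre', truncating='pre')."""
    ids = ids[-maxlen:]
    return [pad_id] * (maxlen - len(ids)) + ids

def predict_sentiment(text, tokenizer, model, maxlen=60):
    """Tokenize, pad, run the model, and format the sigmoid output as the
    {label: confidence} dict that Gradio's Label component expects."""
    ids = tokenizer.texts_to_sequences([text])[0]   # Keras-style tokenizer API
    batch = [pad_sequence(ids, maxlen)]
    p_positive = float(model.predict(batch)[0][0])  # sigmoid output in [0, 1]
    return {"Positive": p_positive, "Negative": 1.0 - p_positive}
```

Returning a dict of class → confidence lets Gradio's `Label` component render both classes with their scores rather than a single hard prediction.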
---
## 📈 Model Performance
Despite being trained from scratch without pretrained embeddings (such as GloVe or fastText), the model performs competitively. Experiments with `glove.27B.200d` embeddings yielded **similar accuracy**, so they were omitted to keep the pipeline self-contained.
Training plots and confusion matrix are available in the original [Kaggle Notebook](https://www.kaggle.com/code/mastershomya/sentiment-analysis-deep-bilstm).
---
## 🧑‍💻 Author
**Shomya Soneji**
Machine Learning & Deep Learning Enthusiast
Connect on [Kaggle](https://www.kaggle.com/mastershomya)
---
## 🀝 Support
If you find this project helpful, please consider giving it a 🌟 and sharing it!
Your feedback and suggestions are always welcome πŸ’¬