Update README.md
README.md CHANGED
@@ -7,7 +7,36 @@ sdk: gradio
 sdk_version: 4.37.2
 app_file: app.py
 pinned: false
-license:
+license: mit
 ---
+# Twitter Sentiment Analysis
 
-
+This project implements a sentiment analysis model that predicts whether a tweet's sentiment is positive or negative. The LSTM-based model was trained on 1.6 million tweets.
+
+## Project Structure
+
+- __01. Data Preparation:__
+  * `Data Collection`: The dataset of 1.6 million tweets was collected from [Sentiment140 on Kaggle](https://www.kaggle.com/datasets/kazanova/sentiment140).
+  * `Data Cleaning & Preprocessing`:
+    - Removed stopwords
+    - Applied lemmatization
+    - Vectorized the lemmatized data using `TextVectorization` from Keras
+    - Saved the vectorizer for later use in the app
+
+- __02. Model Training:__
+  * A Bidirectional LSTM model with an embedding layer was trained on the preprocessed data.
+
+- __03. App Deployment:__
+  * Developed a web app with a Gradio interface
+  * Deployed the [app](https://huggingface.co/spaces/mazed/twitter_sentiment_analysis) on Hugging Face Spaces
+
+- `requirements.txt`: Lists the project dependencies:
+  - `pandas`
+  - `tensorflow==2.15.0`
+  - `nltk`
+  - `gradio`
+
+
+
+
+Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
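
The data cleaning steps listed in the README (stopword removal, lemmatization, vectorization) can be illustrated with a minimal standard-library sketch. This is not the project's code: the stopword set, the regular expressions, and the helper names `preprocess`, `build_vocab`, and `vectorize` are all assumptions made for illustration. The integer encoding stands in for Keras `TextVectorization` and follows its convention of reserving id 0 for padding and id 1 for out-of-vocabulary tokens.

```python
from __future__ import annotations

import re
from collections import Counter
from typing import Callable, Iterable

# Tiny illustrative stopword set (assumption); a real pipeline would likely
# use a fuller list such as NLTK's English stopwords.
STOPWORDS = frozenset({"a", "an", "the", "is", "are", "to", "of", "and", "i", "it"})

_NOISE_RE = re.compile(r"https?://\S+|@\w+")  # URLs and @mentions
_TOKEN_RE = re.compile(r"[a-z']+")            # lowercase word tokens


def preprocess(text: str, lemmatize: Callable[[str], str] = lambda w: w) -> list[str]:
    """Lowercase, strip URLs and @mentions, drop stopwords, lemmatize each token."""
    cleaned = _NOISE_RE.sub(" ", text.lower())
    tokens = _TOKEN_RE.findall(cleaned)
    return [lemmatize(tok) for tok in tokens if tok not in STOPWORDS]


def build_vocab(docs: Iterable[list[str]], max_tokens: int = 10_000) -> dict[str, int]:
    """Map the most frequent tokens to ids starting at 2 (0 = padding, 1 = OOV)."""
    counts = Counter(tok for doc in docs for tok in doc)
    most_common = counts.most_common(max_tokens - 2)
    return {tok: idx for idx, (tok, _) in enumerate(most_common, start=2)}


def vectorize(tokens: list[str], vocab: dict[str, int], seq_len: int = 8) -> list[int]:
    """Encode tokens as ids, then truncate or zero-pad to a fixed length."""
    ids = [vocab.get(tok, 1) for tok in tokens][:seq_len]
    return ids + [0] * (seq_len - len(ids))
```

The `lemmatize` argument defaults to the identity function so the sketch runs without extra dependencies. With NLTK and its WordNet data installed, `nltk.stem.WordNetLemmatizer().lemmatize` could be passed instead.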
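
The README describes the model only as a Bidirectional LSTM with an embedding layer. One possible shape for such a model is sketched below. The layer sizes, the `mask_zero` setting, the sigmoid output, the optimizer, and the loss are assumptions, not details taken from the project, and `build_model` is a hypothetical helper name.

```python
def build_model(vocab_size: int = 10_000, embed_dim: int = 64, lstm_units: int = 64):
    """Build a small binary sentiment classifier (hyperparameters are assumptions)."""
    # Imported inside the function so this file can be loaded without TensorFlow.
    from tensorflow import keras

    model = keras.Sequential([
        # Variable-length sequences of integer token ids.
        keras.Input(shape=(None,), dtype="int64"),
        # mask_zero=True makes downstream layers ignore padding (id 0).
        keras.layers.Embedding(vocab_size, embed_dim, mask_zero=True),
        # Reads each sequence in both directions and concatenates the final states.
        keras.layers.Bidirectional(keras.layers.LSTM(lstm_units)),
        # Single sigmoid unit: output near 1 is read as positive, near 0 as negative.
        keras.layers.Dense(1, activation="sigmoid"),
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
    return model
```

A single sigmoid output fits the two-class (positive or negative) task stated in the README; the mapping of 1 to positive is a convention chosen here.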
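
For the deployment step, a Gradio interface around the trained model might look roughly like the sketch below. The names `label_from_score` and `build_demo`, the 0.5 threshold, and the `predict_score` callable (assumed to return a probability between 0 and 1 for a tweet) are hypothetical; the README does not show the actual `app.py`.

```python
def label_from_score(score: float, threshold: float = 0.5) -> str:
    """Turn a model probability into a sentiment label (threshold is an assumption)."""
    return "Positive" if score >= threshold else "Negative"


def build_demo(predict_score):
    """Wrap a scoring callable (tweet text -> probability) in a Gradio interface."""
    # Imported inside the function so label_from_score works without Gradio installed.
    import gradio as gr

    def classify(tweet: str) -> str:
        return label_from_score(float(predict_score(tweet)))

    return gr.Interface(
        fn=classify,
        inputs=gr.Textbox(label="Tweet"),
        outputs=gr.Label(label="Sentiment"),
        title="Twitter Sentiment Analysis",
    )
```

In a Space configured with `app_file: app.py`, calling `build_demo(scorer).launch()` from `app.py` would start the interface, where `scorer` applies the saved vectorizer and the trained model to the input text.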