Disaster Tweet Prediction
Twitter has become an important communication channel in times of emergency. The ubiquity of smartphones enables people to announce an emergency they’re observing in real time. Because of this, more agencies (e.g. disaster relief organizations and news agencies) are interested in programmatically monitoring Twitter. Therefore, in this task I am predicting whether a given tweet is about a real disaster or not. If so, predict a 1. If not, predict a 0.
Installation
Downloading the Data
- Clone this repository to your computer
- Navigate to the project directory `cd twitter-sentiment-analysis` from your terminal
- Run `mkdir inputs`
- Use `cd inputs` to go into the directory where data should be stored
- Download the data files from Kaggle
  - Data can be found here
  - If you don't have a Kaggle account you'd have to create one
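Once the files are in `inputs`, a quick look at the label balance can help confirm the download worked. The following is a minimal sketch using only the standard library. The file name `train.csv` and the column names `text` and `target` are assumptions about how the Kaggle files are laid out, so adjust them if your download differs.

```python
import csv
from collections import Counter


def load_tweets(path, text_col="text", label_col="target"):
    """Read (text, label) pairs from a CSV file.

    The column names are assumptions; change them to match your download.
    """
    rows = []
    with open(path, newline="", encoding="utf-8") as f:
        for record in csv.DictReader(f):
            rows.append((record[text_col], int(record[label_col])))
    return rows


def label_counts(rows):
    """Count how many tweets carry each label (1 = disaster, 0 = not)."""
    return Counter(label for _, label in rows)
```

For example, `label_counts(load_tweets("inputs/train.csv"))` would report how many tweets fall into each class, assuming that path exists on your machine.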
Installing the requirements
- Install the requirements using `pip install -r requirements`
- The Python version is Python 3.8
  - You're better off using a virtual environment
Usage
- Navigate to the `src` directory using `cd src` in the project folder
- Then run `python train.py`
- This will train an LSTM and create a directory called `PRETRAIN_WORD2VEC_LSTM` within the `models` directory, with the serialized LSTM and tokenizer inside it.
- Once you've trained the model, you can run your own examples by running the `user_interface.py` script in the top-level directory. This will provide you with a private link. Once you have opened the link, input some text to determine whether it describes a disaster or not.
View all explorations in the `notebook` directory
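The tokenizer is saved alongside the LSTM because the network consumes integer sequences rather than raw text, so the same word-to-index mapping has to be reused at inference time. The sketch below illustrates that idea in plain Python. `SimpleTokenizer` and its methods are hypothetical names made up for illustration, not the tokenizer that `train.py` actually serializes.

```python
import re


class SimpleTokenizer:
    """Hypothetical word-level tokenizer; id 0 is padding, id 1 is unknown."""

    def __init__(self):
        self.word_index = {}

    @staticmethod
    def _split(text):
        # Lowercase the text and keep alphanumeric runs only.
        return re.findall(r"[a-z0-9]+", text.lower())

    def fit(self, texts):
        for text in texts:
            for word in self._split(text):
                if word not in self.word_index:
                    # Ids start at 2 because 0 and 1 are reserved.
                    self.word_index[word] = len(self.word_index) + 2
        return self

    def encode(self, text, max_len=8):
        ids = [self.word_index.get(w, 1) for w in self._split(text)]
        ids = ids[:max_len]                      # truncate long tweets
        return ids + [0] * (max_len - len(ids))  # right-pad short ones
```

Fitting on the training tweets and then calling `encode` on new text produces fixed-length id lists, with words unseen during fitting mapped to the unknown id.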
Extending This Work
Some ideas to extend this work:
- Explore methods to reduce inference time
- Use different word embeddings
- Try LSTM with attention (See Attention in Long Short-Term Memory Recurrent Neural Networks)
- Use a transformer model
- Correct misspelled words
- Deal with overfitting
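As one illustration of the "correct misspelled words" idea, the sketch below snaps each token to its closest match in a known vocabulary using the standard library's `difflib`. The helper names, the vocabulary, and the `cutoff` value are all assumptions for illustration; this is not part of the repository, and a dedicated spell checker would likely do better on real tweets.

```python
import difflib


def correct_word(word, vocabulary, cutoff=0.8):
    """Return the closest vocabulary word, or the word itself if none is close."""
    if word in vocabulary:
        return word
    matches = difflib.get_close_matches(word, vocabulary, n=1, cutoff=cutoff)
    return matches[0] if matches else word


def correct_text(text, vocabulary, cutoff=0.8):
    """Lowercase the text and correct each whitespace-separated token."""
    return " ".join(
        correct_word(w, vocabulary, cutoff) for w in text.lower().split()
    )
```

A step like this could run before tokenization so that misspelled words map onto embeddings the model has already seen, at the cost of occasionally "correcting" rare but valid words.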