Disaster Tweet Classification Model

Description

This project develops a machine learning model that classifies tweets as referring to a disaster or not. It fine-tunes a BERT model (bert-base-uncased) from the Hugging Face transformers library on the disaster tweet dataset from Kaggle. The goal is to predict whether a given tweet refers to a disaster event based on its content.

The BERT model classifies each tweet from its content and context, relying on contextual representations of the words rather than isolated keywords. It is fine-tuned on labeled disaster and non-disaster tweets so that it can tell the two classes apart across various contexts.

Emergency responders, news organizations, and social media analysts can use this classifier to quickly identify and respond to disaster-related events, or to monitor trends in disaster-related communications.

Technologies Used

Dataset

Source: Kaggle Disaster Tweets Dataset
Purpose: Contains tweets labeled to indicate whether they refer to a disaster.
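
As a rough illustration of what loading this data might look like, the sketch below reads a CSV with `text` and `target` columns and applies light cleaning. The column names are assumed from Kaggle's train.csv for this dataset, since this card does not list them. The helper names `clean_tweet` and `load_tweets`, and the choice to strip URLs and @mentions, are hypothetical and not taken from the project's notebook.

```python
import csv
import io
import re

# Patterns for tweet noise. Which noise to strip is an assumption,
# not the author's documented recipe.
URL_RE = re.compile(r"https?://\S+")
MENTION_RE = re.compile(r"@\w+")
WHITESPACE_RE = re.compile(r"\s+")


def clean_tweet(text):
    """Remove URLs and @mentions, then collapse runs of whitespace."""
    text = URL_RE.sub(" ", text)
    text = MENTION_RE.sub(" ", text)
    return WHITESPACE_RE.sub(" ", text).strip()


def load_tweets(csv_file):
    """Read rows from an open CSV file into (cleaned_text, label) pairs.

    Assumes a `text` column and a `target` column (1 = disaster, 0 = not).
    """
    reader = csv.DictReader(csv_file)
    return [(clean_tweet(row["text"]), int(row["target"])) for row in reader]


# Tiny in-memory stand-in for the real file, for illustration only.
sample_csv = io.StringIO(
    "id,text,target\n"
    "1,Forest fire near La Ronge Sask. Canada http://t.co/abc,1\n"
    "2,@friend I love fruits,0\n"
)
tweets = load_tweets(sample_csv)
print(tweets)
```

With the real data, `load_tweets(open("train.csv", newline="", encoding="utf-8"))` would be the equivalent call, assuming the file has been downloaded from Kaggle under that name.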

Model

Base Model: BERT (bert-base-uncased)
Library: Hugging Face transformers
Task: Binary text classification
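
To make the binary classification setup concrete, here is a minimal sketch of how a two-label BERT classifier could be loaded and queried with transformers. The label names in `ID2LABEL`, the `classify` helper, and the 128-token limit are assumptions for illustration. Loading `bert-base-uncased` directly yields an untrained classification head, so in practice `model_name` would point at the fine-tuned weights.

```python
import math

# Label names are an assumption: the card only says the task is binary.
ID2LABEL = {0: "not_disaster", 1: "disaster"}


def softmax(logits):
    """Convert raw scores to probabilities (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def logits_to_prediction(logits):
    """Map a list of class logits to (label, probability of that label)."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return ID2LABEL[best], probs[best]


def classify(text, model_name="bert-base-uncased"):
    """Run one tweet through a BERT sequence classifier.

    Imports are local so the helpers above work without transformers/torch.
    With the base checkpoint the classification head is randomly
    initialised; pass fine-tuned weights as model_name for real use.
    """
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSequenceClassification.from_pretrained(
        model_name, num_labels=2
    )
    model.eval()
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=128)
    with torch.no_grad():
        logits = model(**inputs).logits[0].tolist()
    return logits_to_prediction(logits)
```

Keeping the logit-to-label step separate from the model call makes it easy to check the post-processing without downloading any weights.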

Approach

Preprocessing:
- Load and preprocess the disaster tweet dataset.
- Tokenize the tweet texts.
Fine-Tuning:
- Fine-tune the BERT model on the preprocessed disaster tweet dataset.
Training:
- Train the model to distinguish between disaster and non-disaster tweets.
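
The steps above could be sketched roughly as follows, using the transformers `Trainer` API and the `datasets` library. This is not the notebook's actual code: the 90/10 split, the hyperparameters (3 epochs, learning rate 2e-5, batch size 16, 128-token limit), the output directory name, and the helpers `train_val_split` and `fine_tune` are all assumptions chosen as common starting points for BERT fine-tuning.

```python
import random


def train_val_split(examples, val_fraction=0.1, seed=42):
    """Shuffle (text, label) pairs and split off a validation set."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    n_val = max(1, int(len(shuffled) * val_fraction))
    return shuffled[n_val:], shuffled[:n_val]


def fine_tune(examples, output_dir="disaster-bert"):
    """Tokenize the tweets and fine-tune bert-base-uncased for two classes.

    Hyperparameters are assumed for illustration, not taken from the project.
    Imports are local so train_val_split works without transformers installed.
    """
    from datasets import Dataset
    from transformers import (
        AutoModelForSequenceClassification,
        AutoTokenizer,
        DataCollatorWithPadding,
        Trainer,
        TrainingArguments,
    )

    tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

    def to_dataset(pairs):
        ds = Dataset.from_dict(
            {"text": [t for t, _ in pairs], "labels": [y for _, y in pairs]}
        )
        # Truncate long tweets; padding is applied per batch by the collator.
        return ds.map(
            lambda batch: tokenizer(batch["text"], truncation=True, max_length=128),
            batched=True,
        )

    train_pairs, val_pairs = train_val_split(examples)
    model = AutoModelForSequenceClassification.from_pretrained(
        "bert-base-uncased", num_labels=2
    )
    args = TrainingArguments(
        output_dir=output_dir,
        num_train_epochs=3,
        learning_rate=2e-5,
        per_device_train_batch_size=16,
    )
    trainer = Trainer(
        model=model,
        args=args,
        train_dataset=to_dataset(train_pairs),
        eval_dataset=to_dataset(val_pairs),
        data_collator=DataCollatorWithPadding(tokenizer),
    )
    trainer.train()
    trainer.save_model(output_dir)
    return trainer
```

Here `examples` is expected to be a list of `(text, label)` pairs with labels 0 or 1. Fixing the shuffle seed keeps the validation set the same across runs, which makes results easier to compare.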

Key Technologies

Deep Learning (BERT): For advanced text classification and contextual understanding.
Natural Language Processing (NLP): For text preprocessing and analysis.
Machine Learning Algorithms: For model training and prediction tasks.

Google Colab Notebook

You can view and run the Google Colab notebook for this project here.

Acknowledgements

Hugging Face for transformer models.
NLTK for natural language processing.
Streamlit for creating the interactive web interface.

Author

@AdilHayat

Feedback

If you have any feedback, please reach out to us at hayatadil300@gmail.com.
