---
title: 🐦 TwittBERTO
emoji: 🚗
colorFrom: blue
colorTo: indigo
sdk: streamlit
sdk_version: 1.42.2
app_file: app.py
pinned: false
---
This project demonstrates a sentiment analysis pipeline built on **DistilBERT**, a lightweight transformer model developed by Hugging Face. The model was fine-tuned on 16,000 tweets to classify each one as **Positive**, **Negative**, or **Neutral**, and reached **90% accuracy** on the validation set.

---
## 🚀 Features
* Uses **DistilBERT**, a distilled version of BERT, for strong NLP performance at lower resource cost.
* Cleaned and preprocessed Twitter data (16,000 rows).
* Fine-tuned with PyTorch and Hugging Face Transformers.
* Achieved **90% accuracy** on the validation set for sentiment classification.
* Includes training, validation, and evaluation pipelines.
---
## 📁 Dataset
* 16,000 manually labeled tweets with three sentiment classes:
  * `Positive`
  * `Negative`
  * `Neutral`
* Dataset was preprocessed to remove mentions, hashtags, links, and special characters.
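
The cleaning step described above could look roughly like the sketch below. The function name `clean_tweet` and the exact regular expressions are assumptions for illustration, not the project's actual code. In particular, this version drops whole hashtags and mentions rather than just the `#` or `@` symbol, and keeps basic punctuation. URLs are removed first because they can themselves contain `#` or `@`.

```python
import re

# Hypothetical patterns; the project's real preprocessing rules may differ.
URL_RE = re.compile(r"https?://\S+|www\.\S+")
MENTION_RE = re.compile(r"@\w+")
HASHTAG_RE = re.compile(r"#\w+")
# Anything that is not a letter, digit, whitespace, or basic punctuation.
SPECIAL_RE = re.compile(r"[^A-Za-z0-9\s!?.,']")
WHITESPACE_RE = re.compile(r"\s+")


def clean_tweet(text: str) -> str:
    """Remove links, mentions, hashtags, and special characters from a tweet."""
    text = URL_RE.sub(" ", text)       # links first: they may contain '#' or '@'
    text = MENTION_RE.sub(" ", text)   # @username
    text = HASHTAG_RE.sub(" ", text)   # #topic (the whole tag is dropped)
    text = SPECIAL_RE.sub(" ", text)   # emoji, symbols, and other leftovers
    # Collapse the gaps left behind by the substitutions above.
    return WHITESPACE_RE.sub(" ", text).strip()
```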
---
## 🧠 Model
* **Base Model**: `distilbert-base-uncased`
* **Fine-tuning**: Trained for several epochs using a cross-entropy loss function and AdamW optimizer.
* **Tokenizer**: Hugging Face `DistilBertTokenizerFast`
* **Training Framework**: PyTorch + Hugging Face `Trainer` API
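
As a rough sketch of the setup listed above, the base checkpoint can be loaded with a three-way classification head. The label order, the helper names, and `max_length=128` are assumptions, not values taken from the project. With integer labels and more than one class, `DistilBertForSequenceClassification` computes a cross-entropy loss internally, and `Trainer` uses an AdamW optimizer by default, so neither needs to be wired up by hand. The `transformers` import sits inside the function so the label maps can be reused without loading the library.

```python
BASE_MODEL = "distilbert-base-uncased"

# Assumed label order; the original training script may use a different one.
LABELS = ["Negative", "Neutral", "Positive"]
ID2LABEL = dict(enumerate(LABELS))
LABEL2ID = {label: idx for idx, label in ID2LABEL.items()}


def load_model_and_tokenizer():
    """Load the base checkpoint with a fresh 3-class classification head."""
    # Imported lazily so the label maps above work without transformers installed.
    from transformers import (
        DistilBertForSequenceClassification,
        DistilBertTokenizerFast,
    )

    tokenizer = DistilBertTokenizerFast.from_pretrained(BASE_MODEL)
    model = DistilBertForSequenceClassification.from_pretrained(
        BASE_MODEL,
        num_labels=len(LABELS),
        id2label=ID2LABEL,
        label2id=LABEL2ID,
    )
    return model, tokenizer


def tokenize_batch(tokenizer, texts, max_length=128):
    """Tokenize a list of tweets into padded PyTorch tensors."""
    return tokenizer(
        texts,
        truncation=True,
        padding=True,
        max_length=max_length,  # assumed cap; tweets are short
        return_tensors="pt",
    )
```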
---
## 📊 Performance
| Metric | Score |
| --------- | ----- |
| Accuracy | 90% |
| Precision | High |
| Recall | High |
| F1-score | High |
> Note: Actual precision, recall, and F1-score values can be added if available.
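
To fill in the placeholder rows above with real numbers, per-class precision, recall, and F1 can be computed from the validation predictions. The sketch below is dependency-free and uses hypothetical names (`per_class_metrics`, `accuracy`); since `scikit-learn` is already a dependency, `sklearn.metrics.classification_report` would be a reasonable alternative.

```python
from collections import Counter

LABELS = ["Negative", "Neutral", "Positive"]


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def per_class_metrics(y_true, y_pred, labels=LABELS):
    """Return precision, recall, and F1 for each label."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted this class, but it was wrong
            fn[truth] += 1  # missed an example of the true class

    report = {}
    for label in labels:
        predicted = tp[label] + fp[label]
        actual = tp[label] + fn[label]
        precision = tp[label] / predicted if predicted else 0.0
        recall = tp[label] / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        report[label] = {"precision": precision, "recall": recall, "f1": f1}
    return report
```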
---
## 📦 Dependencies
```text
transformers==4.*
torch==1.*
scikit-learn
pandas
numpy
matplotlib
```
Install with:
```bash
pip install -r requirements.txt
```
---
## 🛠️ How to Run
1. Clone the repository:
```bash
git clone https://github.com/yourusername/twitter-sentiment-distilbert.git
cd twitter-sentiment-distilbert
```
2. Install dependencies:
```bash
pip install -r requirements.txt
```
3. Train the model:
```bash
python train.py
```
4. Evaluate the model:
```bash
python evaluate.py
```
5. Run prediction on new tweets:
```bash
python predict.py --text "I love this app!"
```
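
For orientation, a `predict.py` that accepts the `--text` flag shown above might be structured like the sketch below. This is not the repository's actual script: the `--model-dir` option, its `./model` default, and the helper names are hypothetical, and the returned label is only human-readable if the saved model config carries an `id2label` mapping. A real script would call `main()` under the usual `if __name__ == "__main__":` guard.

```python
import argparse


def build_parser():
    """Command-line interface: the tweet text plus an assumed model location."""
    parser = argparse.ArgumentParser(description="Classify the sentiment of a tweet.")
    parser.add_argument("--text", required=True, help="Tweet text to classify")
    parser.add_argument(
        "--model-dir",
        default="./model",  # hypothetical path to the fine-tuned checkpoint
        help="Directory containing the fine-tuned model and tokenizer",
    )
    return parser


def predict(text, model_dir):
    """Return the predicted label for a single tweet."""
    # Imported lazily so argument parsing works without loading transformers.
    from transformers import pipeline

    classifier = pipeline("text-classification", model=model_dir)
    return classifier(text)[0]["label"]


def format_prediction(text, label):
    """Render the result as two lines: the input and the predicted sentiment."""
    return f'Input: "{text}"\nPredicted Sentiment: {label}'


def main(argv=None):
    args = build_parser().parse_args(argv)
    print(format_prediction(args.text, predict(args.text, args.model_dir)))
```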
---
## 📈 Example Output
```text
Input: "I love this app!"
Predicted Sentiment: Positive
```
---
## 📚 Future Improvements
* Integrate with a live Twitter API for real-time sentiment tracking.
* Add a web dashboard using Streamlit or Flask.
* Extend to multilingual support using `xlm-roberta`.
---
## 📄 License
This project is open source and available under the [MIT License](LICENSE).

---