---
title: TwittBERTO
emoji: π
colorFrom: blue
colorTo: indigo
sdk: streamlit
sdk_version: 1.42.2
app_file: app.py
pinned: false
---
This project demonstrates a sentiment analysis pipeline built with **DistilBERT**, a lightweight transformer model developed by Hugging Face. The model was fine-tuned on 16,000 tweets to classify each one as **Positive**, **Negative**, or **Neutral**, and reached about **90% accuracy** on the validation set.
---
## Features
* Utilizes **DistilBERT** for high-performance NLP with lower resource consumption.
* Cleaned and preprocessed Twitter data (16K rows).
* Fine-tuned with PyTorch and Hugging Face Transformers.
* Achieved about **90% accuracy** on the validation set.
* Includes training, validation, and evaluation pipelines.
---
## Dataset
* 16,000 manually labeled tweets with three sentiment classes:
  * `Positive`
  * `Negative`
  * `Neutral`
* Dataset was preprocessed to remove mentions, hashtags, links, and special characters.
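
The cleaning step above could be sketched roughly as follows. The regular expressions and the `clean_tweet` name are assumptions made for illustration, not the project's actual rules: hashtags are dropped whole (tag word included), and only letters, digits, whitespace, and basic punctuation are kept. Case is left alone because `distilbert-base-uncased` lowercases during tokenization. For example, `clean_tweet("@bob I LOVE this app!! #happy https://t.co/xyz")` returns `"I LOVE this app!!"`.

```python
import re

# Assumed patterns for "mentions, hashtags, links, and special characters";
# the project's real preprocessing may differ.
URL_RE = re.compile(r"https?://\S+|www\.\S+")
MENTION_RE = re.compile(r"@\w+")
HASHTAG_RE = re.compile(r"#\w+")
SPECIAL_RE = re.compile(r"[^A-Za-z0-9\s.,!?']")
WHITESPACE_RE = re.compile(r"\s+")


def clean_tweet(text: str) -> str:
    """Strip links, mentions, hashtags, and special characters from a tweet."""
    text = URL_RE.sub(" ", text)       # remove links first so "/" and ":" go with them
    text = MENTION_RE.sub(" ", text)   # remove @user mentions
    text = HASHTAG_RE.sub(" ", text)   # remove #hashtags, including the tag word
    text = SPECIAL_RE.sub(" ", text)   # remove any remaining symbols
    return WHITESPACE_RE.sub(" ", text).strip()  # collapse leftover whitespace
```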
---
## Model
* **Base Model**: `distilbert-base-uncased`
* **Fine-tuning**: Trained for several epochs using a cross-entropy loss function and AdamW optimizer.
* **Tokenizer**: Hugging Face `DistilBertTokenizerFast`
* **Training Framework**: PyTorch + Hugging Face `Trainer` API
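
A minimal sketch of how this setup might be wired together with the `Trainer` API is shown below. The label order, the `build_trainer` helper, the output directory, and every hyperparameter value are assumptions for illustration; the README does not state them. The datasets passed in are expected to be already tokenized, padded to a fixed length, and to carry a `labels` column.

```python
# Assumed label order; it must match how the dataset labels were encoded.
LABELS = ["Negative", "Neutral", "Positive"]
LABEL2ID = {name: i for i, name in enumerate(LABELS)}
ID2LABEL = {i: name for name, i in LABEL2ID.items()}


def build_trainer(train_dataset, eval_dataset, output_dir="distilbert-tweets"):
    """Return a Trainer that fine-tunes distilbert-base-uncased on 3 classes.

    Hyperparameter values here are illustrative, not the project's settings.
    """
    # Imported lazily so the label maps above work without transformers installed.
    from transformers import (
        AutoModelForSequenceClassification,
        Trainer,
        TrainingArguments,
    )

    model = AutoModelForSequenceClassification.from_pretrained(
        "distilbert-base-uncased",
        num_labels=len(LABELS),
        id2label=ID2LABEL,
        label2id=LABEL2ID,
    )
    args = TrainingArguments(
        output_dir=output_dir,
        num_train_epochs=3,
        per_device_train_batch_size=16,
        learning_rate=2e-5,
        weight_decay=0.01,
    )
    # With integer class labels the model computes cross-entropy loss itself,
    # and Trainer uses an AdamW optimizer when none is passed explicitly.
    return Trainer(
        model=model,
        args=args,
        train_dataset=train_dataset,
        eval_dataset=eval_dataset,
    )
```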
---
## Performance
| Metric | Score |
| --------- | ----- |
| Accuracy | 90% |
| Precision | High |
| Recall | High |
| F1-score | High |
> Note: Actual precision, recall, and F1-score values can be added if available.
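
To fill in the qualitative entries in the table, the missing scores could be computed from validation predictions along these lines. This is a dependency-free sketch with a hypothetical `classification_metrics` helper and macro averaging chosen as an assumption; scikit-learn, already listed in the dependencies, provides equivalent functions. The function takes the true labels, the predicted labels, and the list of class names.

```python
from collections import Counter


def classification_metrics(y_true, y_pred, labels):
    """Return accuracy plus macro-averaged precision, recall, and F1."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative

    def ratio(num, den):
        # Treat an undefined ratio (zero denominator) as 0.0.
        return num / den if den else 0.0

    precisions, recalls, f1s = [], [], []
    for label in labels:
        p = ratio(tp[label], tp[label] + fp[label])
        r = ratio(tp[label], tp[label] + fn[label])
        precisions.append(p)
        recalls.append(r)
        f1s.append(ratio(2 * p * r, p + r))

    n = len(labels)
    return {
        "accuracy": ratio(sum(tp.values()), len(y_true)),
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }
```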
---
## Dependencies
```text
transformers==4.x.x
torch==1.x
scikit-learn
pandas
numpy
matplotlib
```
Install with:
```bash
pip install -r requirements.txt
```
---
## How to Run
1. Clone the repository:
   ```bash
   git clone https://github.com/yourusername/twitter-sentiment-distilbert.git
   cd twitter-sentiment-distilbert
   ```
2. Install dependencies:
   ```bash
   pip install -r requirements.txt
   ```
3. Train the model:
   ```bash
   python train.py
   ```
4. Evaluate the model:
   ```bash
   python evaluate.py
   ```
5. Run prediction on new tweets:
   ```bash
   python predict.py --text "I love this app!"
   ```
---
## Example Output
```text
Input: "I love this app!"
Predicted Sentiment: Positive
```
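
For readers curious how a `predict.py`-style entry point might produce this output, here is one possible sketch built on the Transformers `pipeline` helper. It is not the repository's actual script: the `--model-dir` flag, its `./model` default, and the function names are hypothetical, and the printed label is only `Positive`, `Negative`, or `Neutral` if the checkpoint was saved with matching `id2label` names. Calling `main(["--text", "I love this app!"])` would load the checkpoint and print the two lines in the format shown above.

```python
import argparse


def parse_args(argv=None):
    """Parse command-line arguments for a single-tweet prediction."""
    parser = argparse.ArgumentParser(
        description="Classify the sentiment of one tweet."
    )
    parser.add_argument("--text", required=True, help="Tweet text to classify.")
    # Hypothetical flag: directory holding the fine-tuned checkpoint.
    parser.add_argument("--model-dir", default="./model",
                        help="Path to the fine-tuned model (assumed location).")
    return parser.parse_args(argv)


def format_prediction(text, label):
    """Render a prediction in the same two-line format as the example output."""
    return f'Input: "{text}"\nPredicted Sentiment: {label}'


def main(argv=None):
    args = parse_args(argv)
    # Imported lazily so argument parsing works without transformers installed.
    from transformers import pipeline

    classifier = pipeline("text-classification", model=args.model_dir)
    result = classifier(args.text)[0]  # a dict with "label" and "score" keys
    print(format_prediction(args.text, result["label"]))
```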
---
## Future Improvements
* Integrate with a live Twitter API for real-time sentiment tracking.
* Add a web dashboard using Streamlit or Flask.
* Extend to multilingual support using `xlm-roberta`.
---
## License
This project is open-source and available under the [MIT License](LICENSE).
---