Spaces:

junaid17
/

new_thing

Sleeping

File size: 2,658 Bytes

---
title: 🐦 TwittBERTO
emoji: 🚗
colorFrom: blue
colorTo: indigo
sdk: streamlit
sdk_version: 1.42.2
app_file: app.py
pinned: false
---


This project demonstrates a sentiment analysis pipeline built with **DistilBERT**, a lightweight transformer model developed by Hugging Face. The model was fine-tuned on a dataset of 16,000 tweets to classify sentiment into categories such as **Positive**, **Negative**, and **Neutral**. The final model achieved an impressive **90% accuracy** on the validation set.

---

## 🚀 Features

* Utilizes **DistilBERT** for high-performance NLP with lower resource consumption.
* Cleaned and preprocessed Twitter data (16K rows).
* Fine-tuned with PyTorch and Hugging Face Transformers.
* Achieved **90%+ accuracy** on sentiment classification.
* Includes training, validation, and evaluation pipelines.

---

## 📁 Dataset

* 16,000 manually labeled tweets with three sentiment classes:

  * `Positive`
  * `Negative`
  * `Neutral`
* Dataset was preprocessed to remove mentions, hashtags, links, and special characters.

---

## 🧠 Model

* **Base Model**: `distilbert-base-uncased`
* **Fine-tuning**: Trained for several epochs using a cross-entropy loss function and AdamW optimizer.
* **Tokenizer**: Hugging Face `DistilBertTokenizerFast`
* **Training Framework**: PyTorch + Hugging Face `Trainer` API

---

## 📊 Performance

| Metric    | Score |
| --------- | ----- |
| Accuracy  | 90%   |
| Precision | High  |
| Recall    | High  |
| F1-score  | High  |

> Note: Actual precision, recall, and F1-score values can be added if available.

---

## 📦 Dependencies

```bash
transformers==4.x.x
torch==1.x
scikit-learn
pandas
numpy
matplotlib
```

Install with:

```bash
pip install -r requirements.txt
```

---

## 🛠️ How to Run

1. Clone the repository:

   ```bash
   git clone https://github.com/yourusername/twitter-sentiment-distilbert.git
   cd twitter-sentiment-distilbert
   ```

2. Install dependencies:

   ```bash
   pip install -r requirements.txt
   ```

3. Train the model:

   ```bash
   python train.py
   ```

4. Evaluate the model:

   ```bash
   python evaluate.py
   ```

5. Run prediction on new tweets:

   ```bash
   python predict.py --text "I love this app!"
   ```

---

## 📈 Example Output

```bash
Input: "I love this app!"
Predicted Sentiment: Positive
```

---

## 📚 Future Improvements

* Integrate with a live Twitter API for real-time sentiment tracking.
* Add a web dashboard using Streamlit or Flask.
* Extend to multilingual support using `xlm-roberta`.

---

## 📄 License

This project is open-source and available under the [MIT License](LICENSE).

---