---
title: Real-Time Sentiment Analysis Dashboard
emoji: 📊
colorFrom: blue
colorTo: green
sdk: docker
app_port: 7860
pinned: false
---
# 📊 Sentiment Analysis Dashboard
A real-time sentiment analysis dashboard that processes tweets and displays sentiment trends using Apache Kafka, Spark, and Docker.
**Live Demo Dashboard**: [https://huggingface.co/spaces/xtinkarpiu/sentiment-analysis](https://huggingface.co/spaces/xtinkarpiu/sentiment-analysis). *The demo runs in mock mode with simulated tweets.*
Author: Kristine Karp (karpkristine@gmail.com)
## 📸 Preview
![Dashboard Overview](assets/dashboard_screenshot1.jpg)
![Real-Time Tweets and Charts](assets/dashboard_screenshot2.jpg)
## 🌟 Features
- Real-time tweet processing - Live streaming with Apache Kafka
- Intelligent sentiment analysis - Keyword-based classification with Spark
- Live dashboard updates - WebSocket-powered real-time interface
- Comprehensive visualization - Sentiment trends and recent tweet streams
- Flexible data sources - Support for both live Twitter API and mock data
- Containerized deployment - Full Docker orchestration for easy setup
## 🔧 Tech Stack
- Backend: 🐍 Python | 🌶️ Flask | 🔌 Flask-SocketIO
- Message Streaming: 📨 Apache Kafka
- Stream Processing: ✨ Apache Spark
- Frontend: 🖼️ HTML5 | 🎨 CSS3 | ⚡ JavaScript | 📊 Chart.js
- Real-time Communication: 🔄 WebSocket
- Containerization: 🐳 Docker | 📦 Docker Compose
- API Integration: 🐦 Twitter API v2
## ⚙️ Workflow Overview
This project implements a real-time sentiment analysis ETL pipeline using **Python scripts**, all orchestrated with **Docker**:
1. **Extract** - `producer.py` connects to the Twitter API for live streaming, or `mock_tweet_producer.py` generates realistic demo data for testing and the Hugging Face demo.
2. **Transform** - Apache Kafka ingests tweets on the `sentiment-topic` topic, and `consumer.py` applies sentiment analysis using Spark streaming.
3. **Load** - `consumer.py` also publishes the processed results to the `sentiment-results` topic for real-time display.
4. **Visualize** - `dashboard.py` provides a web interface with live sentiment metrics and trend charts.
5. **Orchestrate** - `docker-compose.yml` and Docker manage all services for consistent deployment.
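
To make the Transform step more concrete, here is a minimal sketch of keyword-based classification in plain Python. The keyword sets and the `classify_sentiment` name are hypothetical and are not taken from `consumer.py`; the project's actual vocabulary and scoring may differ. In a Spark job, a function like this could plausibly be applied per tweet, for example by wrapping it as a UDF.

```python
import re

# Hypothetical keyword lists for illustration only; the project's real
# vocabulary is not shown in this README.
POSITIVE_WORDS = {"love", "great", "good", "happy", "excellent", "amazing"}
NEGATIVE_WORDS = {"hate", "bad", "terrible", "awful", "sad", "worst"}


def classify_sentiment(text: str) -> str:
    """Label text as positive, negative, or neutral by counting keyword hits."""
    # Lowercase and split into simple word tokens.
    tokens = re.findall(r"[a-z']+", text.lower())
    positive_hits = sum(token in POSITIVE_WORDS for token in tokens)
    negative_hits = sum(token in NEGATIVE_WORDS for token in tokens)
    if positive_hits > negative_hits:
        return "positive"
    if negative_hits > positive_hits:
        return "negative"
    # Ties and tweets with no keyword hits are treated as neutral.
    return "neutral"


if __name__ == "__main__":
    samples = ["I love this great phone", "Worst service ever", "Just landed in Paris"]
    for tweet in samples:
        print(tweet, "->", classify_sentiment(tweet))
```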
---
## 🧪 How to Reproduce Locally
**🛠️ Project Structure**
| File/Folder | Purpose |
|------------------------|---------------------------------------------------|
| `dashboard.py` | Flask app + Kafka consumer for the real-time dashboard; works with live Kafka data or Hugging Face demo data |
| `templates/dashboard.html` | HTML UI template with real-time charts and tweet display |
| `mock_tweet_producer.py` | Generates realistic mock tweets for demo/testing |
| `producer.py` | Connects to Twitter API to stream live tweets |
| `consumer.py` | Runs Spark-based sentiment analysis on Kafka stream |
| `docker-compose.yml` | Docker setup orchestrating Kafka, Spark, producer, dashboard |
| `requirements.txt` | Python dependencies |
| `.env` (optional) | Contains Twitter API credentials |
**Option 1: Mock Mode (No API Required)**
```bash
git clone https://huggingface.co/spaces/xtinkarpiu/sentiment-analysis
cd sentiment-analysis
docker-compose up --build
```
This launches:
- Kafka: Message broker for tweet streaming
- Spark: Real-time sentiment analysis processing
- Producer: Tweet ingestion (default: mock tweets)
- Dashboard: Web interface at http://localhost:5000
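
For a rough idea of what the default mock producer might emit, the sketch below generates random tweet-like messages and serializes them as JSON bytes, the kind of payload a Kafka producer would send. The field names, sample texts, and function names are assumptions made for illustration; the actual `mock_tweet_producer.py` may use a different schema.

```python
import json
import random
import time

# Hypothetical sample data; the real mock producer may use other text and fields.
SAMPLE_TEXTS = [
    "I love the new update, great work!",
    "This is the worst release so far.",
    "Trying out the dashboard today.",
]
SAMPLE_USERS = ["alice", "bob", "carol"]


def make_mock_tweet(rng: random.Random) -> dict:
    """Build one tweet-like record with an assumed, illustrative schema."""
    return {
        "id": rng.randrange(10**9),
        "user": rng.choice(SAMPLE_USERS),
        "text": rng.choice(SAMPLE_TEXTS),
        "timestamp": time.time(),
    }


def encode_tweet(tweet: dict) -> bytes:
    """Serialize a record to UTF-8 JSON bytes, suitable as a message value."""
    return json.dumps(tweet).encode("utf-8")


if __name__ == "__main__":
    rng = random.Random(42)
    for _ in range(3):
        payload = encode_tweet(make_mock_tweet(rng))
        # A real producer would send this payload to the 'sentiment-topic'
        # Kafka topic here instead of printing it.
        print(payload.decode("utf-8"))
```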
**Option 2: Live Twitter Integration**
1. Create a Twitter/X Developer App at [developer.twitter.com](https://developer.twitter.com). Ensure your app has read and write permissions enabled, and your API access level is at least Basic to access streaming endpoints.
2. Add your **Bearer Token** to a `.env` file:
```env
BEARER_TOKEN=your_token_here
```
3. Set mock mode to false in `app.py`:
```python
os.environ["USE_MOCK"] = "false"
```
4. In `docker-compose.yml`, change the command under `sentiment-producer` to run `producer.py` instead of `mock_tweet_producer.py`:
```yaml
command: ["python", "producer.py"]
```
5. Restart the Docker Compose stack to begin processing live tweets:
```bash
docker-compose up --build
```
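
The mock/live switch from steps 2 and 3 could be read in one place at startup. The sketch below is one possible way to do that, not the project's actual code: `load_settings` and `TRUE_VALUES` are hypothetical names, and defaulting `USE_MOCK` to true is an assumption based on mock tweets being the default producer. It fails early when live mode is requested without a `BEARER_TOKEN`.

```python
import os

# Strings treated as "true" for the USE_MOCK flag (an assumed convention).
TRUE_VALUES = {"1", "true", "yes", "on"}


def load_settings(env=None) -> dict:
    """Read USE_MOCK and BEARER_TOKEN from a mapping (os.environ by default)."""
    env = os.environ if env is None else env
    # Assumption: mock mode is the default when USE_MOCK is unset.
    use_mock = env.get("USE_MOCK", "true").strip().lower() in TRUE_VALUES
    bearer_token = env.get("BEARER_TOKEN", "")
    if not use_mock and not bearer_token:
        # Live mode cannot reach the Twitter API without credentials.
        raise RuntimeError("BEARER_TOKEN must be set when USE_MOCK is false")
    return {"use_mock": use_mock, "bearer_token": bearer_token}


if __name__ == "__main__":
    settings = load_settings()
    print("mock mode" if settings["use_mock"] else "live mode")
```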