Added icons for tech stack. Reordered sections
README.md
CHANGED
@@ -29,19 +29,41 @@ Author: Kristine Karp (karpkristine@gmail.com)
 - Flexible data sources - Support for both live Twitter API and mock data
 - Containerized deployment - Full Docker orchestration for easy setup

 ## ⚙️ Workflow Overview
 This project implements a real-time sentiment analysis ETL pipeline using **Python scripts**, all orchestrated with **Docker**:

 1. **Extract** - `producer.py` connects to the Twitter API for live streaming, or `mock_tweet_producer.py` generates realistic demo data for testing or Hugging Face demonstration purposes.
 2. **Transform** - Apache Kafka ingests tweets under 'sentiment-topic', while `consumer.py` applies sentiment analysis using Spark streaming.
 3. **Load** - Processed results are published to the 'sentiment-results' topic and displayed in real time, also in `consumer.py`.
-4. **
 5. **Orchestrate** - `docker-compose.yml` and `Docker` manage all services for consistent deployment.
-
 ---
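
The Extract, Transform, and Load steps above can be sketched in plain Python. This is a minimal illustration, not the project's code: the sample texts, the word lists, and the names `mock_tweets`, `score_sentiment`, and `run_pipeline` are hypothetical. The keyword count stands in for whatever model `consumer.py` runs under Spark, and a `publish` callback replaces the Kafka producer so the sketch runs without a broker.

```python
import json
import random
from typing import Callable, Dict, Iterator

# Hypothetical sample texts; the real mock_tweet_producer.py may differ.
SAMPLE_TEXTS = [
    "I love this new phone, the camera is great",
    "Worst customer service I have ever had",
    "The package arrived on Tuesday",
]

# Tiny illustrative word lists, a stand-in for the project's actual model.
POSITIVE = {"love", "great", "good", "happy"}
NEGATIVE = {"worst", "bad", "hate", "awful"}


def mock_tweets(count: int, seed: int = 0) -> Iterator[Dict[str, object]]:
    """Extract: yield `count` fake tweets as dictionaries."""
    rng = random.Random(seed)
    for tweet_id in range(count):
        yield {"id": tweet_id, "text": rng.choice(SAMPLE_TEXTS)}


def score_sentiment(text: str) -> str:
    """Transform: label text by counting positive and negative words."""
    words = [w.strip(".,!?").lower() for w in text.split()]
    balance = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if balance > 0:
        return "positive"
    if balance < 0:
        return "negative"
    return "neutral"


def run_pipeline(count: int, publish: Callable[[str, str], None]) -> None:
    """Load: publish each scored tweet as JSON to the results topic."""
    for tweet in mock_tweets(count):
        result = dict(tweet, sentiment=score_sentiment(str(tweet["text"])))
        publish("sentiment-results", json.dumps(result))


if __name__ == "__main__":
    # Print instead of sending to Kafka so the sketch runs without a broker.
    run_pipeline(3, lambda topic, payload: print(topic, payload))
```

In a real deployment the `publish` callback would wrap a Kafka client; which client the project uses is not stated in this excerpt.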

 ## 🧪 How to Reproduce Locally

 **Option 1: Mock Mode (No API Required)**

 ```bash
@@ -76,25 +98,3 @@ command: ["python", "producer.py"]
 ```bash
 docker-compose up --build
 ```
-
-## 🛠️ Project Structure
-
-| File/Folder | Purpose |
-|------------------------|---------------------------------------------------|
-| `dashboard.py` | Flask app + Kafka consumer for real-time dashboard, flexible for real Kafka data or Hugging Face demo data |
-| `templates/dashboard.html` | HTML UI template with real-time charts and tweet display |
-| `mock_tweet_producer.py` | Generates realistic mock tweets for demo/testing |
-| `producer.py` | Connects to Twitter API to stream live tweets |
-| `consumer.py` | Runs Spark-based sentiment analysis on Kafka stream |
-| `docker-compose.yml` | Docker setup orchestrating Kafka, Spark, producer, dashboard |
-| `requirements.txt` | Python dependencies |
-| `.env` (optional) | Contains Twitter API credentials |
-
-## 🧠 Tech Stack
-- Backend: Python, Flask, Flask-SocketIO
-- Message Streaming: Apache Kafka
-- Stream Processing: Apache Spark
-- Frontend: HTML5, CSS3, JavaScript, Chart.js
-- Real-time Communication: WebSocket
-- Containerization: Docker, Docker Compose
-- API Integration: Twitter API v2
 - Flexible data sources - Support for both live Twitter API and mock data
 - Containerized deployment - Full Docker orchestration for easy setup

+## 🧠 Tech Stack
+- Backend: 🐍 Python | 🌶️ Flask | 🔌 Flask-SocketIO
+- Message Streaming: 📨 Apache Kafka
+- Stream Processing: ✨ Apache Spark
+- Frontend: 🖼️ HTML5 | 🎨 CSS3 | ⚡ JavaScript | 📊 Chart.js
+- Real-time Communication: WebSocket
+- Containerization: 🐳 Docker | 📦 Docker Compose
+- API Integration: 🐦 Twitter API v2
+
 ## ⚙️ Workflow Overview
 This project implements a real-time sentiment analysis ETL pipeline using **Python scripts**, all orchestrated with **Docker**:

 1. **Extract** - `producer.py` connects to the Twitter API for live streaming, or `mock_tweet_producer.py` generates realistic demo data for testing or Hugging Face demonstration purposes.
 2. **Transform** - Apache Kafka ingests tweets under 'sentiment-topic', while `consumer.py` applies sentiment analysis using Spark streaming.
 3. **Load** - Processed results are published to the 'sentiment-results' topic and displayed in real time, also in `consumer.py`.
+4. **Visualize** - `dashboard.py` provides a web interface with live sentiment metrics and trend charts.
 5. **Orchestrate** - `docker-compose.yml` and `Docker` manage all services for consistent deployment.
+
 ---

 ## 🧪 How to Reproduce Locally

+**🛠️ Project Structure**
+
+| File/Folder | Purpose |
+|------------------------|---------------------------------------------------|
+| `dashboard.py` | Flask app + Kafka consumer for real-time dashboard, flexible for real Kafka data or Hugging Face demo data |
+| `templates/dashboard.html` | HTML UI template with real-time charts and tweet display |
+| `mock_tweet_producer.py` | Generates realistic mock tweets for demo/testing |
+| `producer.py` | Connects to Twitter API to stream live tweets |
+| `consumer.py` | Runs Spark-based sentiment analysis on Kafka stream |
+| `docker-compose.yml` | Docker setup orchestrating Kafka, Spark, producer, dashboard |
+| `requirements.txt` | Python dependencies |
+| `.env` (optional) | Contains Twitter API credentials |
+
 **Option 1: Mock Mode (No API Required)**

 ```bash
@@ -76,25 +98,3 @@ command: ["python", "producer.py"]
 ```bash
 docker-compose up --build
 ```
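
After the containers start, a quick reachability check can confirm that the services are listening. The sketch below assumes Kafka is published on its default port 9092 and the dashboard on Flask's default port 5000. Neither port is stated in this README excerpt, so check `docker-compose.yml` for the actual mappings. `port_is_open` is a hypothetical helper, not part of the project.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # Assumed ports: 9092 is Kafka's default, 5000 is Flask's default.
    # The project's docker-compose.yml may publish different ones.
    for name, port in [("kafka", 9092), ("dashboard", 5000)]:
        state = "reachable" if port_is_open("localhost", port) else "not reachable"
        print(f"{name} on localhost:{port}: {state}")
```

An open port only shows that something accepted the connection; it does not prove the broker or dashboard is fully initialized.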