metadata
title: Real-Time Sentiment Analysis Dashboard
emoji: 📊
colorFrom: blue
colorTo: green
sdk: docker
app_port: 7860
pinned: false

📊 Sentiment Analysis Dashboard

A real-time sentiment analysis dashboard that processes tweets and displays sentiment trends using Apache Kafka, Spark, and Docker.

Live Demo Dashboard: https://huggingface.co/spaces/xtinkarpiu/sentiment-analysis (the demo runs in mock mode with simulated tweets)

Author: Kristine Karp (karpkristine@gmail.com)

📸 Preview

Screenshots: Dashboard Overview; Real-Time Tweets and Charts

🌟 Features

  • Real-time tweet processing - Live streaming with Apache Kafka
  • Intelligent sentiment analysis - Keyword-based classification with Spark
  • Live dashboard updates - WebSocket-powered real-time interface
  • Comprehensive visualization - Sentiment trends and recent tweet streams
  • Flexible data sources - Support for both live Twitter API and mock data
  • Containerized deployment - Full Docker orchestration for easy setup
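To make "keyword-based classification" concrete, here is a minimal sketch in plain Python. The keyword sets and the function name `classify_sentiment` are hypothetical; the project's actual vocabulary and scoring rules are not shown in this README.

```python
import re

# Hypothetical keyword lists; the project's real vocabulary may differ.
POSITIVE_WORDS = {"love", "great", "good", "happy", "awesome"}
NEGATIVE_WORDS = {"hate", "bad", "terrible", "sad", "awful"}


def classify_sentiment(text: str) -> str:
    """Label text as 'positive', 'negative', or 'neutral' by counting keyword hits."""
    # Lowercase and extract runs of letters (and apostrophes) so that
    # punctuation attached to a word does not hide a keyword.
    words = re.findall(r"[a-z']+", text.lower())
    positive_hits = sum(word in POSITIVE_WORDS for word in words)
    negative_hits = sum(word in NEGATIVE_WORDS for word in words)
    if positive_hits > negative_hits:
        return "positive"
    if negative_hits > positive_hits:
        return "negative"
    return "neutral"
```

In a Spark job, a function like this could be wrapped in a user-defined function and applied to the tweet text column; whether consumer.py does it that way is an assumption.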

🔧 Tech Stack

  • Backend: 🐍 Python | 🌶️ Flask | 🔌 Flask-SocketIO
  • Message Streaming: 📨 Apache Kafka
  • Stream Processing: ✨ Apache Spark
  • Frontend: 🖼️ HTML5 | 🎨 CSS3 | ⚡ JavaScript | 📊 Chart.js
  • Real-time Communication: 🔄 WebSocket
  • Containerization: 🐳 Docker | 📦 Docker Compose
  • API Integration: 🐦 Twitter API v2

βš™οΈ Workflow Overview

This project implements a real-time sentiment analysis ETL pipeline using Python scripts, all orchestrated with Docker:

  1. Extract - producer.py connects to the Twitter API for live streaming, or mock_tweet_producer.py generates realistic demo data for testing and for the Hugging Face demo.
  2. Transform - Apache Kafka ingests tweets on the 'sentiment-topic' topic, and consumer.py applies sentiment analysis using Spark streaming.
  3. Load - Processed results are published to the 'sentiment-results' topic and displayed in real time; this step also runs in consumer.py.
  4. Visualize - dashboard.py provides a web interface with live sentiment metrics and trend charts.
  5. Orchestrate - docker-compose.yml and Docker manage all services for consistent deployment.
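The data flow in the steps above can be sketched without a running broker. In this minimal example, two in-memory queues stand in for the 'sentiment-topic' and 'sentiment-results' Kafka topics. The JSON message shape (`text`, `sentiment`) and all function names are assumptions made for illustration, not the project's actual schema or code.

```python
import json
import queue
from typing import Callable

# In-memory queues standing in for the two Kafka topics (illustration only).
sentiment_topic: "queue.Queue[str]" = queue.Queue()
sentiment_results: "queue.Queue[str]" = queue.Queue()


def extract(tweets: list[str]) -> None:
    """Extract: publish each raw tweet as a JSON message."""
    for text in tweets:
        sentiment_topic.put(json.dumps({"text": text}))


def transform_and_load(classify: Callable[[str], str]) -> None:
    """Transform each queued tweet with `classify`, then load it into the results queue."""
    while not sentiment_topic.empty():
        message = json.loads(sentiment_topic.get())
        message["sentiment"] = classify(message["text"])
        sentiment_results.put(json.dumps(message))


def summarize() -> dict[str, int]:
    """Visualize: count results per label, as a dashboard chart might."""
    counts: dict[str, int] = {}
    while not sentiment_results.empty():
        label = json.loads(sentiment_results.get())["sentiment"]
        counts[label] = counts.get(label, 0) + 1
    return counts
```

Calling `extract(...)` with a few strings, then `transform_and_load(...)` with any text-to-label function, then `summarize()` returns a dictionary of per-label counts. In the real pipeline, Kafka provides the durable topics and Spark performs the transform step.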

🧪 How to Reproduce Locally

🛠️ Project Structure

| File/Folder | Purpose |
| --- | --- |
| dashboard.py | Flask app + Kafka consumer for real-time dashboard, flexible for real Kafka data or Hugging Face demo data |
| templates/dashboard.html | HTML UI template with real-time charts and tweet display |
| mock_tweet_producer.py | Generates realistic mock tweets for demo/testing |
| producer.py | Connects to Twitter API to stream live tweets |
| consumer.py | Runs Spark-based sentiment analysis on Kafka stream |
| docker-compose.yml | Docker setup orchestrating Kafka, Spark, producer, dashboard |
| requirements.txt | Python dependencies |
| .env (optional) | Contains Twitter API credentials |
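As a rough idea of what a mock tweet generator such as mock_tweet_producer.py might do, the sketch below builds JSON-encoded tweets from templates. The templates, topics, field names, and the function `make_mock_tweet` are all invented for illustration; the actual script may work quite differently.

```python
import json
import random
from datetime import datetime, timezone

# Hypothetical templates and topics, invented for illustration only.
TEMPLATES = [
    "I love the new {topic}!",
    "The {topic} update is terrible.",
    "Just read an article about {topic}.",
]
TOPICS = ["phone", "movie", "game"]


def make_mock_tweet(rng: random.Random) -> str:
    """Return one JSON-encoded mock tweet with text and a UTC timestamp."""
    text = rng.choice(TEMPLATES).format(topic=rng.choice(TOPICS))
    return json.dumps({
        "text": text,
        "created_at": datetime.now(timezone.utc).isoformat(),
    })
```

Passing a seeded `random.Random` instance makes the generated text repeatable, which can help when testing downstream steps.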

Option 1: Mock Mode (No API Required)

git clone https://huggingface.co/spaces/xtinkarpiu/sentiment-analysis
cd sentiment-analysis
docker-compose up --build

This launches:

  • Kafka: Message broker for tweet streaming
  • Spark: Real-time sentiment analysis processing
  • Producer: Tweet ingestion (default: mock tweets)
  • Dashboard: Web interface at http://localhost:5000

Option 2: Live Twitter Integration

  1. Create a Twitter/X Developer App at developer.twitter.com. Ensure your app has read and write permissions enabled, and that your API access level is at least Basic so you can access the streaming endpoints.
  2. Add your Bearer Token to a .env file:
BEARER_TOKEN=your_token_here
  3. Set mock mode to false in app.py:
os.environ["USE_MOCK"] = "false"
  4. In docker-compose.yml, under sentiment-producer, change the command to run producer.py instead of mock_tweet_producer.py:
command: ["python", "producer.py"]
  5. Restart the Docker Compose stack to begin processing live tweets:
docker-compose up --build
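Before starting the stack in live mode, it can help to confirm that the two settings agree. The following pre-flight check is a hypothetical helper, not part of the project: it assumes mock mode is the default when USE_MOCK is unset, and that live mode needs a non-empty BEARER_TOKEN.

```python
import os
from typing import Mapping


def choose_producer(env: Mapping[str, str] = os.environ) -> str:
    """Return the producer script to run based on USE_MOCK and BEARER_TOKEN."""
    # Assumption: mock mode is the default when USE_MOCK is unset.
    use_mock = env.get("USE_MOCK", "true").strip().lower() == "true"
    if use_mock:
        return "mock_tweet_producer.py"
    # Live mode is only usable when a token has been provided.
    if not env.get("BEARER_TOKEN"):
        raise RuntimeError("USE_MOCK is false but BEARER_TOKEN is not set")
    return "producer.py"
```

For example, `choose_producer({"USE_MOCK": "false", "BEARER_TOKEN": "abc"})` returns `"producer.py"`, while an empty mapping falls back to `"mock_tweet_producer.py"`.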