| --- |
| title: PulseAI |
| emoji: π¦ |
| colorFrom: green |
| colorTo: indigo |
| sdk: docker |
| pinned: false |
| --- |
| |
|
|
| # Social Intelligence Platform |
|
|
| **AI-powered brand monitoring, sentiment analysis, and competitive intelligence** |
|
|
| A production-grade NLP platform that helps product teams discover customer insights, detect brand crises, and track competitive signals, all in real time. |
|
|
| --- |
|
|
| ## Problem Solved |
|
|
| **Before:** Product teams were drowning in thousands of reviews and social posts, manually trying to identify recurring themes, sentiment trends, and competitive threats. By the time they spotted a brand crisis, it had already gone viral. |
|
|
| **After:** An automated NLP pipeline processes all customer conversations in real time, surfacing actionable insights: |
| - **Sentiment Analysis**: Transformer-based (RoBERTa) classification with aspect-level granularity |
| - **Topic Discovery**: NMF clustering finds recurring themes automatically |
| - **Crisis Detection**: Multi-signal scoring catches PR disasters before they escalate |
| - **Trend Forecasting**: Statistical forecasting predicts sentiment trajectory |
| - **Competitor Intelligence**: Tracks competitor mentions and switch signals |
|
|
| --- |
|
|
| ## Key Features |
|
|
| ### NLP Pipeline |
| - **RoBERTa Sentiment Analysis** (`cardiffnlp/twitter-roberta-base-sentiment-latest`) |
|   - Document-level sentiment (positive/negative/neutral) |
|   - Aspect-based sentiment extraction (Performance, Pricing, Support, UI, etc.) |
|   - Confidence scoring with fallback to VADER/keyword analysis |
| |
| - **Topic Modeling** (NMF + TF-IDF) |
|   - Automated topic discovery from a short-text corpus |
|   - Named clusters with keyword extraction |
|   - Sentiment distribution per topic |
| |
| - **Trend Analysis & Forecasting** |
|   - Rolling statistical analysis with anomaly detection |
|   - Exponential smoothing for a 14-day sentiment forecast |
|   - Volume trend analysis and spike detection |
| |
| - **Crisis Detection Engine** |
|   - Multi-signal crisis scoring (legal, data breach, outrage, viral threats) |
|   - Severity classification (low/medium/high/critical) |
|   - Engagement amplification (viral posts get higher weight) |
| |
| - **Competitor Intelligence** |
|   - Competitor mention extraction and sentiment comparison |
|   - Switch signal detection (users leaving competitors) |
|   - Opportunity gap identification |
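
The "rolling statistical analysis with anomaly detection" item above can be illustrated with a trailing-window z-score. This is a minimal sketch, not the code in `backend/nlp/trend_analysis.py`: the function name, the 7-day window, and the threshold of 2 standard deviations are assumptions chosen for illustration.

```python
from statistics import mean, pstdev


def rolling_anomalies(series, window=7, threshold=2.0):
    """Return indices whose value deviates from the mean of the previous
    `window` points by more than `threshold` standard deviations."""
    anomalies = []
    for i in range(window, len(series)):
        history = series[i - window:i]
        mu = mean(history)
        sigma = pstdev(history)
        if sigma == 0:
            # Flat history: any change at all counts as a deviation.
            if series[i] != mu:
                anomalies.append(i)
            continue
        if abs(series[i] - mu) / sigma > threshold:
            anomalies.append(i)
    return anomalies


# Hypothetical daily sentiment scores with one sharp dip at index 8.
daily_sentiment = [0.42, 0.40, 0.44, 0.41, 0.43, 0.42, 0.40, 0.41, -0.35, 0.39]
print(rolling_anomalies(daily_sentiment))
```

Only the dip is flagged here. The day after the dip is not, because the dip itself has entered the trailing window and inflated its standard deviation; a production detector might exclude flagged points from later windows.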
|
|
| ### Dashboard Features |
| - **Real-time KPIs**: Sentiment score, NPS estimate, volume trends, crisis alerts |
| - **Interactive Visualizations**: Time series, donut charts, topic bubbles, competitor comparison |
| - **Topic Explorer**: Click-to-explore topic clusters with keyword clouds |
| - **Crisis Radar**: Prioritized list of high-severity posts requiring action |
| - **Live Analyzer**: Real-time sentiment + aspect + crisis analysis for any text |
| - **Post Feed**: Filterable feed with sentiment labels and source badges |
|
|
| --- |
|
|
| ## Architecture |
|
|
| ``` |
| ┌─────────────────────────────────────────────────────────────┐ |
| │                    Frontend (Vanilla JS)                    │ |
| │  • Dark SaaS UI with Syne/Instrument Sans typography        │ |
| │  • Chart.js for time series, D3.js for topic bubbles        │ |
| │  • Real-time API polling, demo fallback when offline        │ |
| └──────────────────────┬──────────────────────────────────────┘ |
|                        │ REST API |
| ┌──────────────────────▼──────────────────────────────────────┐ |
| │                  FastAPI Backend (Python)                   │ |
| │  • /api/dashboard: Full analytics payload                   │ |
| │  • /api/analyze: Single text sentiment + crisis scoring     │ |
| │  • /api/topics: Topic clusters with examples                │ |
| │  • /api/trends: Time series + forecast                      │ |
| │  • /api/competitors: Competitive intelligence               │ |
| └──────────────────────┬──────────────────────────────────────┘ |
|                        │ |
|              ┌─────────┴─────────────────────────┐ |
|              ▼                                   ▼ |
| ┌─────────────────────────┐         ┌─────────────────────────┐ |
| │  NLP Pipeline           │         │  Sample Data Gen        │ |
| │  • sentiment.py         │         │  • 500 synthetic        │ |
| │  • topic_model.py       │         │    reviews/tweets       │ |
| │  • trend_analysis.py    │         │  • Realistic crisis     │ |
| │  • crisis_detector.py   │         │    scenarios            │ |
| │  • competitor_intel.py  │         │  • Time series data     │ |
| └─────────────────────────┘         └─────────────────────────┘ |
| ``` |
|
|
| --- |
|
|
| ## Tech Stack |
|
|
| **Backend:** |
| - FastAPI: Modern async Python web framework |
| - Transformers (Hugging Face): RoBERTa sentiment model |
| - scikit-learn: NMF topic modeling, TF-IDF vectorization |
| - NumPy/Pandas: Statistical analysis and data manipulation |
| - NLTK: Fallback sentiment analysis (VADER) |
|
|
| **Frontend:** |
| - Vanilla JavaScript (no framework dependencies) |
| - Chart.js: Time series and bar/donut charts |
| - D3.js: Topic bubble visualization |
| - Custom CSS: Dark enterprise SaaS design system |
| - Fonts: Syne (display), Instrument Sans (body), DM Mono (code) |
|
|
| **Models:** |
| - Primary: `cardiffnlp/twitter-roberta-base-sentiment-latest` (RoBERTa-base pretrained on ~124M tweets, then fine-tuned for sentiment on the TweetEval benchmark) |
| - Fallback: VADER lexicon-based sentiment (works offline) |
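
The primary/fallback arrangement above can be sketched as an ordered chain of backends, each tried in turn until one succeeds. This is an illustration only, not the code in `backend/nlp/sentiment.py`: the function names, the tiny keyword lexicon, and the VADER cutoffs (the conventional ±0.05 compound thresholds) are assumptions.

```python
import re

MODEL_ID = "cardiffnlp/twitter-roberta-base-sentiment-latest"

# Hypothetical last-resort lexicon, used only if both libraries are missing.
POSITIVE_WORDS = {"love", "great", "fast", "excellent", "helpful"}
NEGATIVE_WORDS = {"hate", "slow", "broken", "terrible", "refund"}

_pipeline = None  # cached so the model is loaded at most once


def transformer_sentiment(text):
    """Primary backend: needs `transformers` plus a downloaded or cached model."""
    global _pipeline
    if _pipeline is None:
        from transformers import pipeline  # lazy import keeps fallbacks usable
        _pipeline = pipeline("sentiment-analysis", model=MODEL_ID)
    result = _pipeline(text)[0]
    return result["label"].lower(), float(result["score"])


def vader_sentiment(text):
    """Fallback backend: needs `nltk` and the `vader_lexicon` resource."""
    from nltk.sentiment.vader import SentimentIntensityAnalyzer
    compound = SentimentIntensityAnalyzer().polarity_scores(text)["compound"]
    if compound >= 0.05:
        return "positive", compound
    if compound <= -0.05:
        return "negative", -compound
    return "neutral", 1.0 - abs(compound)


def keyword_sentiment(text):
    """Last resort: count lexicon hits; pure standard library."""
    words = set(re.findall(r"[a-z']+", text.lower()))
    pos = len(words & POSITIVE_WORDS)
    neg = len(words & NEGATIVE_WORDS)
    if pos == neg:
        return "neutral", 0.5
    label = "positive" if pos > neg else "negative"
    return label, max(pos, neg) / (pos + neg)


def analyze(text, backends=(transformer_sentiment, vader_sentiment, keyword_sentiment)):
    """Try each backend in order and report which one produced the answer."""
    for backend in backends:
        try:
            label, confidence = backend(text)
        except Exception:
            continue  # missing library, missing model, no network, etc.
        return {"label": label, "confidence": round(confidence, 3),
                "backend": backend.__name__}
    raise RuntimeError("no sentiment backend available")
```

Calling `analyze("Support was great")` with the default chain attempts the transformer first, which may trigger a model download; passing `backends=(keyword_sentiment,)` keeps the call fully offline.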
|
|
| --- |
|
|
| ## Quick Start |
|
|
| ### Prerequisites |
| - Python 3.8+ |
| - pip (Python package manager) |
| - Modern web browser (Chrome, Firefox, Safari, Edge) |
|
|
| ### Installation |
|
|
| 1. **Extract the project** |
| ```bash |
| unzip social-intelligence-platform.zip |
| cd social-intelligence-platform |
| ``` |
|
|
| 2. **Install Python dependencies** |
| ```bash |
| cd backend |
| pip install -r requirements.txt |
| ``` |
|
|
| 3. **Download NLTK data (for fallback sentiment)** |
| ```bash |
| python -c "import nltk; nltk.download('vader_lexicon')" |
| ``` |
|
|
| ### Running the Application |
|
|
| #### Option 1: Run Backend + Frontend (Recommended) |
|
|
| **Terminal 1 - Start Backend:** |
| ```bash |
| cd backend |
| python main.py |
| ``` |
|
|
| The backend will: |
| - Start on `http://localhost:8000` |
| - Generate 500 sample posts on startup |
| - Run transformer sentiment analysis (or fall back to VADER if the model is unavailable) |
| - Fit topic model (NMF) |
| - Build trend forecasts |
| - Scan for crisis signals |
| - Assemble competitor intelligence |
|
|
| Bootstrap takes **15-30 seconds** on first run, plus the time needed to download the model (~440MB), which depends on your connection. |
|
|
| **Terminal 2 - Serve Frontend:** |
| ```bash |
| cd frontend |
| python -m http.server 3000 |
| ``` |
|
|
| Open browser to: **http://localhost:3000** |
|
|
| #### Option 2: Frontend Only (Demo Mode) |
|
|
| If the backend is unavailable, the frontend falls back to **demo data** automatically. |
|
|
| ```bash |
| cd frontend |
| python -m http.server 3000 |
| ``` |
|
|
| Open browser to: **http://localhost:3000** |
|
|
| You'll see "Backend offline - showing demo data" during load. The dashboard will render with pre-generated synthetic data. |
|
|
| --- |
|
|
| ## Usage Guide |
|
|
| ### Dashboard Views |
|
|
| **1. Dashboard (Home)** |
| - Overview KPIs: Sentiment score, volume, NPS estimate, crisis alert level |
| - 90-day sentiment trend with forecast |
| - Sentiment mix (donut chart) |
| - Volume by source (Twitter, Reddit, G2, etc.) |
| - Top crisis posts requiring immediate action |
| - Recent post feed with filters |
|
|
| **2. Trends** |
| - 7-day vs 30-day sentiment comparison |
| - Trend direction (improving/declining/stable) |
| - 14-day forecast with confidence bands |
| - Anomaly detection (spikes and dips) |
| - Daily volume trend |
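
The 14-day forecast with confidence bands listed above can be illustrated with simple exponential smoothing. This is a sketch under stated assumptions, not the project's `trend_analysis.py`: the function name, `alpha=0.3`, and the 95% band (`z=1.96`) are illustrative choices. The band uses the standard simple-exponential-smoothing variance, which grows with the horizon `h` as `sigma^2 * (1 + (h - 1) * alpha^2)`.

```python
import math


def exponential_smoothing_forecast(series, alpha=0.3, horizon=14, z=1.96):
    """Simple exponential smoothing: flat point forecast, widening band."""
    if not series:
        raise ValueError("series must not be empty")
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")

    level = series[0]
    squared_errors = []
    for value in series[1:]:
        # One-step-ahead error, measured before the level is updated.
        squared_errors.append((value - level) ** 2)
        level = alpha * value + (1 - alpha) * level

    sigma = math.sqrt(sum(squared_errors) / len(squared_errors)) if squared_errors else 0.0

    points = []
    for h in range(1, horizon + 1):
        half_width = z * sigma * math.sqrt(1 + (h - 1) * alpha ** 2)
        points.append({
            "day": h,
            "forecast": level,
            "lower": level - half_width,
            "upper": level + half_width,
        })
    return points


# Hypothetical daily sentiment scores.
history = [0.40, 0.42, 0.38, 0.45, 0.41, 0.39, 0.44, 0.43]
for point in exponential_smoothing_forecast(history)[:3]:
    print(point)
```

Simple exponential smoothing has no trend or seasonality term, so the point forecast is the same for every future day; only the band changes. A series with a clear trend would need a trend-aware variant such as Holt's method.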
|
|
| **3. Topic Clusters** |
| - 8 auto-discovered topics with keyword weights |
| - Interactive bubble chart (size = post volume) |
| - Click to explore: top keywords, sample posts, sentiment distribution |
|
|
| **4. Crisis Radar** |
| - Overall alert level (🟢 Low to 🔴 Critical) |
| - Active high-severity posts |
| - Signal frequency breakdown (legal, data breach, outrage, etc.) |
| - Recommended actions |
|
|
| **5. Competitors** |
| - Sentiment comparison across brands |
| - Share of voice (% of corpus mentions) |
| - Opportunity intelligence (AI-identified competitive gaps) |
| - Switch signal detection |
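
Competitor mentions, share of voice, and switch signals as described above can be approximated with pattern matching. The sketch below is hypothetical and is not taken from `competitor_intel.py`: the competitor names come from the sample data section of this README, while the switch phrases and function names are assumptions.

```python
import re
from collections import Counter

COMPETITORS = ("RivalOne", "CompeteX", "AltStream")

# Hypothetical phrasings that suggest a user is leaving a competitor.
SWITCH_PATTERNS = (
    r"\bswitch(?:ed|ing)?\s+(?:away\s+)?from\s+{name}\b",
    r"\b(?:left|leaving|ditched)\s+{name}\b",
    r"\bmigrat(?:ed|ing)\s+from\s+{name}\b",
)


def find_mentions(text):
    """Competitors named anywhere in the text (case-insensitive)."""
    return [name for name in COMPETITORS
            if re.search(rf"\b{re.escape(name)}\b", text, re.IGNORECASE)]


def find_switch_signals(text):
    """Competitors the author says they are leaving or have left."""
    hits = []
    for name in COMPETITORS:
        for pattern in SWITCH_PATTERNS:
            regex = pattern.format(name=re.escape(name))
            if re.search(regex, text, re.IGNORECASE):
                hits.append(name)
                break
    return hits


def share_of_voice(posts):
    """Fraction of posts in the corpus that mention each competitor."""
    if not posts:
        return {}
    counts = Counter(name for post in posts for name in find_mentions(post))
    return {name: counts[name] / len(posts) for name in COMPETITORS}


posts = [
    "We switched from RivalOne last month and never looked back.",
    "CompeteX has a nicer UI, honestly.",
    "Support was great today.",
    "Thinking about leaving AltStream for this.",
]
print(share_of_voice(posts))
print([find_switch_signals(p) for p in posts])
```

Pattern matching of this kind misses paraphrases ("we dropped them for you") and cannot tell direction in sentences that name two products, so it is best treated as a recall-limited first pass.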
|
|
| **6. Live Analyzer** |
| - Paste any text for real-time analysis |
| - Returns: sentiment label, confidence, crisis score, aspect breakdown |
| - Quick example templates |
|
|
| **7. Post Feed** |
| - Full scrollable feed with sentiment labels |
| - Filter by positive/negative/neutral/crisis |
| - Topic tags and source badges |
|
|
| ### API Endpoints |
|
|
| ```bash |
| # Health check |
| curl http://localhost:8000/api/health |
| |
| # Full dashboard data |
| curl http://localhost:8000/api/dashboard |
| |
| # Summary metrics only |
| curl http://localhost:8000/api/summary |
| |
| # Topic clusters |
| curl http://localhost:8000/api/topics |
| |
| # Trend analysis + forecast |
| curl http://localhost:8000/api/trends |
| |
| # Crisis scan results |
| curl http://localhost:8000/api/crisis |
| |
| # Competitor intelligence |
| curl http://localhost:8000/api/competitors |
| |
| # Post feed (with filters); quote the URL so the shell does not interpret "&" |
| curl "http://localhost:8000/api/posts?limit=50&sentiment=negative&source=Twitter" |
| |
| # Analyze single text |
| curl -X POST http://localhost:8000/api/analyze -H "Content-Type: application/json" -d '{"text": "Your review text here", "include_aspects": true, "include_crisis": true}' |
| |
| # Batch analysis |
| curl -X POST http://localhost:8000/api/batch-analyze -H "Content-Type: application/json" -d '{"texts": ["Review 1", "Review 2", "Review 3"]}' |
| ``` |
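
For scripted access, the `/api/analyze` call above can be made from Python with the standard library alone. This is a minimal sketch: the helper names are hypothetical, and since the response schema is not documented in this README, the parsed JSON is returned as-is rather than unpacked into specific fields.

```python
import json
from urllib import request

BASE_URL = "http://localhost:8000"  # backend address from the Quick Start section


def build_analyze_request(text, include_aspects=True, include_crisis=True):
    """Build (but do not send) the POST request for /api/analyze."""
    payload = {
        "text": text,
        "include_aspects": include_aspects,
        "include_crisis": include_crisis,
    }
    return request.Request(
        f"{BASE_URL}/api/analyze",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def analyze(text, timeout=10):
    """Send the request and return the decoded JSON response."""
    with request.urlopen(build_analyze_request(text), timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the backend running, `analyze("The new dashboard is fantastic")` returns whatever JSON the endpoint produces; if the backend is offline, `urlopen` raises `urllib.error.URLError`.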
|
|
| --- |
|
|
| ## Sample Data |
|
|
| The platform generates **500 realistic posts** on startup: |
| - **60% Positive**: Praise for features, support, UI |
| - **25% Negative**: Complaints about performance, pricing, bugs |
| - **10% Neutral**: Migration stories, feature requests |
| - **5% Crisis**: Data breaches, outages, legal threats, scams |
|
|
| **Sources:** Twitter, Reddit, G2, Trustpilot, ProductHunt, AppStore, LinkedIn |
|
|
| **Time Range:** Last 90 days with recency bias (recent days contain more posts than older ones) |
|
|
| **Topics Covered:** |
| - Performance & Speed |
| - Customer Support |
| - Pricing & Billing |
| - UI & Design |
| - Features & Integrations |
| - Data Quality & Accuracy |
| - Onboarding & Documentation |
| - Security & Compliance |
|
|
| **Competitor Mentions:** RivalOne, CompeteX, AltStream appear in ~15% of posts |
|
|
| **Crisis Cluster:** Injected 7 days ago to simulate a real brand crisis event |
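
The label mix, recency bias, and crisis cluster described in this section can be reproduced with a few lines of seeded random sampling. This is a hypothetical sketch, not the generator in `backend/data/sample_data.py`: the function name, the squared-uniform recency skew, and the one-day spread around the crisis date are assumptions, and post text is omitted.

```python
import random
from datetime import datetime, timedelta

LABEL_MIX = {"positive": 0.60, "negative": 0.25, "neutral": 0.10, "crisis": 0.05}
SOURCES = ["Twitter", "Reddit", "G2", "Trustpilot", "ProductHunt", "AppStore", "LinkedIn"]


def generate_sample_posts(n=500, days=90, seed=42, now=None):
    """Return n post stubs with the documented label mix and time distribution."""
    rng = random.Random(seed)
    now = now or datetime.now()

    labels = []
    for label, share in LABEL_MIX.items():
        labels += [label] * round(n * share)
    # Rounding may leave the list slightly short or long; pad or trim to n.
    labels = (labels + ["positive"] * n)[:n]
    rng.shuffle(labels)

    posts = []
    for label in labels:
        if label == "crisis":
            # Cluster crisis posts within a day of "7 days ago".
            age_days = 7 + rng.uniform(-1, 1)
        else:
            # Squaring a uniform draw skews ages toward 0, i.e. recent posts.
            age_days = days * rng.random() ** 2
        posts.append({
            "label": label,
            "source": rng.choice(SOURCES),
            "timestamp": now - timedelta(days=age_days),
        })
    return posts


sample = generate_sample_posts()
print(len(sample), sample[0]["label"], sample[0]["source"])
```

Using a dedicated `random.Random(seed)` instance keeps the output reproducible without touching the global random state.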
|
|
| --- |
|
|
| ## Design System |
|
|
| The UI uses a **dark enterprise SaaS aesthetic** inspired by Linear, Vercel, and Notion: |
|
|
| **Colors:** |
| - `--bg-void: #080b12`: Deep background |
| - `--bg-surface: #111827`: Card backgrounds |
| - `--blue-500: #5b9cf6`: Primary accent |
| - `--green-500: #10b981`: Positive sentiment |
| - `--red-500: #ef4444`: Negative sentiment / crisis |
| - `--amber-500: #f59e0b`: Warnings / neutral |
|
|
| **Typography:** |
| - **Display (Headings):** Syne. Bold, modern, slightly geometric |
| - **Body (UI Text):** Instrument Sans. Clean, readable, professional |
| - **Monospace (Data):** DM Mono. Metrics, badges, code |
|
|
| **Layout:** |
| - Sidebar navigation (240px fixed) |
| - Header with search and status indicators |
| - Card-based grid system |
| - Consistent 16px/20px/24px spacing rhythm |
|
|
| **Animations:** |
| - Staggered fade-in on page load |
| - Smooth chart transitions (800ms easing) |
| - Hover states with subtle elevation |
| - Loading states with branded skeleton screens |
|
|
| --- |
|
|
| ## Configuration |
|
|
| ### Backend Settings |
|
|
| **Model Selection** (in `backend/nlp/sentiment.py`): |
| ```python |
| MODEL_ID = "cardiffnlp/twitter-roberta-base-sentiment-latest" # Primary model |
| FALLBACK_MODE = False # Set True to skip transformer download |
| ``` |
|
|
| **Topic Count** (in `backend/main.py`): |
| ```python |
| modeler = get_modeler(n_topics=8) # Adjust number of topics |
| ``` |
|
|
| **Sample Data Size** (in `backend/main.py`): |
| ```python |
| _corpus = generate_posts(n=500) # Generate 500 posts (adjust as needed) |
| ``` |
|
|
| ### Crisis Detection Thresholds |
|
|
| Edit `backend/nlp/crisis_detector.py`: |
| ```python |
| ALERT_LEVELS = { |
| (0, 4): ("low", "🟢", "No action required."), |
| (4, 8): ("medium", "🟡", "Monitor closely."), |
| (8, 15): ("high", "🟠", "Escalate to communications team."), |
| (15, 99): ("critical", "🔴", "Activate crisis response immediately."), |
| } |
| ``` |
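
The thresholds above map a numeric crisis score to an alert level. The sketch below shows one way such a lookup and an engagement-weighted score could work. It is an assumption-laden illustration, not the contents of `crisis_detector.py`: the signal keywords and weights, the logarithmic engagement multiplier, and the reading of each range as lower-inclusive and upper-exclusive are all hypothetical.

```python
import math

# Same ranges as the configuration above; icons omitted for brevity.
ALERT_LEVELS = {
    (0, 4): ("low", "No action required."),
    (4, 8): ("medium", "Monitor closely."),
    (8, 15): ("high", "Escalate to communications team."),
    (15, 99): ("critical", "Activate crisis response immediately."),
}

# Hypothetical signal phrases and weights.
SIGNAL_WEIGHTS = {
    "lawsuit": 5.0,
    "data breach": 6.0,
    "scam": 4.0,
    "outage": 3.0,
    "boycott": 4.0,
}


def score_post(text, engagement=0):
    """Sum the weights of matched signals, amplified by engagement."""
    lowered = text.lower()
    base = sum(weight for phrase, weight in SIGNAL_WEIGHTS.items() if phrase in lowered)
    # log10 keeps viral posts from dominating linearly: 0 -> x1, 999 -> about x4.
    return base * (1.0 + math.log10(1 + max(engagement, 0)))


def alert_level(score):
    """Look up the (name, action) pair whose range contains the score."""
    for (low, high), level in ALERT_LEVELS.items():
        if low <= score < high:
            return level
    # Scores outside every range: clamp to the nearest end.
    return ALERT_LEVELS[(15, 99)] if score >= 99 else ALERT_LEVELS[(0, 4)]


post = "Considering a lawsuit after the data breach"
print(alert_level(score_post(post)))
print(alert_level(score_post(post, engagement=999)))
```

With no engagement the example post lands in the "high" band; the same text with about a thousand interactions crosses into "critical", which is the effect the engagement amplification feature describes.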
|
|
| --- |
|
|
| ## Performance Notes |
|
|
| **First Run:** |
| - Model download: ~440MB (RoBERTa weights) |
| - Bootstrap time: 15-30 seconds (sentiment + topic modeling + trends) |
|
|
| **Subsequent Runs:** |
| - Model loads from cache: ~3-5 seconds |
| - Bootstrap time: 5-10 seconds |
|
|
| **Runtime Performance:** |
| - Sentiment analysis: ~50ms per post (transformer mode) |
| - Topic modeling fit: ~2 seconds (500 posts, 8 topics) |
| - Trend forecast: <1 second (90-day series) |
| - Dashboard payload: ~1 second (full analysis) |
|
|
| **Offline Mode:** |
| - If `transformers` is unavailable: falls back to VADER (roughly 100x faster) |
| - If backend offline: Frontend uses demo data (instant load) |
|
|
| --- |
|
|
| ## Production Deployment |
|
|
| This is a **demo/portfolio project**. For production use: |
|
|
| 1. **Replace sample data** with real data sources: |
| - Twitter API / Reddit API / Review aggregators |
| - Implement proper data ingestion pipeline |
| - Add database (PostgreSQL / MongoDB) for persistence |
|
|
| 2. **Fine-tune the sentiment model** on your domain: |
| - Collect labeled training data from your industry |
| - Fine-tune with the Hugging Face `Trainer` API |
| - Deploy custom model endpoint |
|
|
| 3. **Add authentication**: |
| - OAuth 2.0 / JWT tokens |
| - User accounts and multi-tenancy |
| - API rate limiting |
|
|
| 4. **Scale the backend**: |
| - Containerize with Docker |
| - Deploy to AWS/GCP/Azure |
| - Add Redis cache for analytics |
| - Use Celery for async NLP jobs |
|
|
| 5. **Enhance frontend**: |
| - Add React/Vue for state management |
| - Implement WebSocket for real-time updates |
| - Add export to PDF/CSV functionality |
|
|
| --- |
|
|
| ## Project Structure |
|
|
| ``` |
| social-intelligence-platform/ |
| ├── backend/ |
| │   ├── main.py                    # FastAPI application |
| │   ├── requirements.txt           # Python dependencies |
| │   ├── data/ |
| │   │   ├── __init__.py |
| │   │   └── sample_data.py         # Synthetic data generator |
| │   └── nlp/ |
| │       ├── __init__.py |
| │       ├── sentiment.py           # BERT sentiment pipeline |
| │       ├── topic_model.py         # NMF topic modeling |
| │       ├── trend_analysis.py      # Time series forecasting |
| │       ├── crisis_detector.py     # Crisis scoring engine |
| │       └── competitor_intel.py    # Competitor mention analysis |
| ├── frontend/ |
| │   └── index.html                 # Dashboard UI (self-contained) |
| ├── docs/ |
| │   └── CASE_STUDY.md              # Detailed project writeup |
| └── README.md                      # This file |
| ``` |
|
|
| --- |
|
|
| ## Skills Demonstrated |
|
|
| ### NLP & Machine Learning |
| - ✅ BERT/Transformer fine-tuning and inference |
| - ✅ Topic modeling (NMF, LDA alternatives) |
| - ✅ Time series forecasting (exponential smoothing) |
| - ✅ Aspect-based sentiment analysis |
| - ✅ Anomaly detection (statistical outliers) |
| - ✅ Multi-signal classification (crisis scoring) |
|
|
| ### Backend Engineering |
| - ✅ FastAPI REST API design |
| - ✅ Async Python patterns |
| - ✅ Model serving and caching |
| - ✅ Batch processing pipelines |
| - ✅ Error handling and fallbacks |
|
|
| ### Frontend Development |
| - ✅ Modern vanilla JS (no framework bloat) |
| - ✅ Chart.js and D3.js visualizations |
| - ✅ Responsive CSS Grid layouts |
| - ✅ Design system implementation |
| - ✅ Performance optimization (lazy loading, debouncing) |
|
|
| ### Product Thinking |
| - ✅ Problem-first approach (not technology-first) |
| - ✅ User-centered design (product teams, not ML researchers) |
| - ✅ Actionable insights over raw metrics |
| - ✅ Crisis prioritization and triage |
|
|
| --- |
|
|
| ## Questions? |
|
|
| This project demonstrates production-ready NLP engineering, API design, and data visualization skills. Built to solve real product team pain points with modern ML techniques. |
|
|
| **Author:** [Your Name] |
| **Portfolio:** [Your Portfolio URL] |
| **GitHub:** [Your GitHub] |
| **LinkedIn:** [Your LinkedIn] |
|
|
| --- |
|
|
| ## License |
|
|
| MIT License. Free to use for educational and portfolio purposes. |
|
|
| --- |
|
|
| **Built with:** Python • FastAPI • Transformers • Chart.js • Custom CSS |
|
|
|
|
|
|
|
|