title: PulseAI
emoji: 🦀
colorFrom: green
colorTo: indigo
sdk: docker
pinned: false

🚀 Social Intelligence Platform

AI-powered brand monitoring, sentiment analysis, and competitive intelligence

A production-grade NLP platform that helps product teams discover customer insights, detect brand crises, and track competitive signals, all in real time.


🎯 Problem Solved

Before: Product teams were drowning in thousands of reviews and social posts, manually trying to identify recurring themes, sentiment trends, and competitive threats. By the time they spotted a brand crisis, it had already gone viral.

After: An automated NLP pipeline processes all customer conversations in real time and surfaces actionable insights:

  • Sentiment Analysis: RoBERTa-powered classification with aspect-level granularity
  • Topic Discovery: NMF clustering finds recurring themes automatically
  • Crisis Detection: Multi-signal scoring catches PR disasters before they escalate
  • Trend Forecasting: Statistical forecasting predicts sentiment trajectory
  • Competitor Intelligence: Tracks competitor mentions and switch signals

✨ Key Features

🧠 NLP Pipeline

  • Transformer Sentiment Analysis (RoBERTa, cardiffnlp/twitter-roberta-base-sentiment-latest)

    • Document-level sentiment (positive/negative/neutral)
    • Aspect-based sentiment extraction (Performance, Pricing, Support, UI, etc.)
    • Confidence scoring with fallback to VADER/keyword analysis
  • Topic Modeling (NMF + TF-IDF)

    • Automated topic discovery from short-text corpus
    • Named clusters with keyword extraction
    • Sentiment distribution per topic
  • Trend Analysis & Forecasting

    • Rolling statistical analysis with anomaly detection
    • Exponential smoothing for 14-day sentiment forecast
    • Volume trend analysis and spike detection
  • Crisis Detection Engine

    • Multi-signal crisis scoring (legal, data breach, outrage, viral threats)
    • Severity classification (low/medium/high/critical)
    • Engagement amplification (viral posts get higher weight)
  • Competitor Intelligence

    • Competitor mention extraction and sentiment comparison
    • Switch signal detection (users leaving competitors)
    • Opportunity gap identification
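
The "rolling statistical analysis with anomaly detection" item above could be approximated with a trailing-window z-score. This is a minimal sketch, not the project's actual code: the function name `rolling_anomalies`, the 7-day window, and the 2.5 threshold are all assumptions chosen for illustration.

```python
from statistics import mean, pstdev


def rolling_anomalies(series, window=7, z_threshold=2.5):
    """Flag points that sit far from the mean of the preceding `window` values.

    Returns a list of (index, z_score) pairs. Window size and threshold are
    illustrative defaults, not values taken from the project.
    """
    anomalies = []
    for i in range(window, len(series)):
        history = series[i - window:i]
        mu = mean(history)
        sigma = pstdev(history)
        if sigma == 0:
            continue  # flat window: the z-score is undefined, so skip it
        z = (series[i] - mu) / sigma
        if abs(z) >= z_threshold:
            anomalies.append((i, round(z, 2)))
    return anomalies


# Hypothetical daily sentiment scores with one sharp dip at index 8.
daily_sentiment = [0.42, 0.40, 0.44, 0.41, 0.43, 0.39, 0.42, 0.41, -0.35, 0.40]
flagged = rolling_anomalies(daily_sentiment)
print(flagged)
```

Because the window only looks backward, a dip is flagged on the day it happens, and the days right after it are judged against a window that already contains the dip.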

🎨 Dashboard Features

  • Real-time KPIs: Sentiment score, NPS estimate, volume trends, crisis alerts
  • Interactive Visualizations: Time series, donut charts, topic bubbles, competitor comparison
  • Topic Explorer: Click-to-explore topic clusters with keyword clouds
  • Crisis Radar: Prioritized list of high-severity posts requiring action
  • Live Analyzer: Real-time sentiment + aspect + crisis analysis for any text
  • Post Feed: Filterable feed with sentiment labels and source badges

🏗️ Architecture

┌──────────────────────────────────────────────────────────────┐
│                     Frontend (Vanilla JS)                    │
│  • Dark SaaS UI with Syne/Instrument Sans typography         │
│  • Chart.js for time series, D3.js for topic bubbles         │
│  • Real-time API polling, demo fallback when offline         │
└──────────────────────┬───────────────────────────────────────┘
                       │ REST API
┌──────────────────────▼───────────────────────────────────────┐
│                  FastAPI Backend (Python)                    │
│  • /api/dashboard: Full analytics payload                    │
│  • /api/analyze: Single text sentiment + crisis scoring      │
│  • /api/topics: Topic clusters with examples                 │
│  • /api/trends: Time series + forecast                       │
│  • /api/competitors: Competitive intelligence                │
└──────────────────────┬───────────────────────────────────────┘
                       │
        ┌──────────────┴──────────────┐
        ▼                             ▼
┌────────────────────────┐   ┌───────────────────────┐
│  NLP Pipeline          │   │  Sample Data Gen      │
│  • sentiment.py        │   │  • 500 synthetic      │
│  • topic_model.py      │   │    reviews/tweets     │
│  • trend_analysis.py   │   │  • Realistic crisis   │
│  • crisis_detector.py  │   │    scenarios          │
│  • competitor_intel.py │   │  • Time series data   │
└────────────────────────┘   └───────────────────────┘

📦 Tech Stack

Backend:

  • FastAPI: Modern async Python web framework
  • Transformers (Hugging Face): RoBERTa sentiment model
  • scikit-learn: NMF topic modeling, TF-IDF vectorization
  • NumPy/Pandas: Statistical analysis and data manipulation
  • NLTK: Fallback sentiment analysis (VADER)

Frontend:

  • Vanilla JavaScript (no framework dependencies)
  • Chart.js: Time series and bar/donut charts
  • D3.js: Topic bubble visualization
  • Custom CSS: Dark enterprise SaaS design system
  • Fonts: Syne (display), Instrument Sans (body), DM Mono (code)

Models:

  • Primary: cardiffnlp/twitter-roberta-base-sentiment-latest (RoBERTa-base pre-trained on ~124M tweets, then fine-tuned for sentiment analysis on TweetEval)
  • Fallback: VADER lexicon-based sentiment (works offline)

🚀 Quick Start

Prerequisites

  • Python 3.8+
  • pip (Python package manager)
  • Modern web browser (Chrome, Firefox, Safari, Edge)

Installation

  1. Extract the project

    unzip social-intelligence-platform.zip
    cd social-intelligence-platform
    
  2. Install Python dependencies

    cd backend
    pip install -r requirements.txt
    
  3. Download NLTK data (for fallback sentiment)

    python -c "import nltk; nltk.download('vader_lexicon')"
    

Running the Application

Option 1: Run Backend + Frontend (Recommended)

Terminal 1 (start backend):

cd backend
python main.py

The backend will:

  • Start on http://localhost:8000
  • Generate 500 sample posts on startup
  • Run transformer sentiment analysis (or fall back to VADER if the model is unavailable)
  • Fit topic model (NMF)
  • Build trend forecasts
  • Scan for crisis signals
  • Assemble competitor intelligence

Bootstrap takes 15-30 seconds on first run, plus a one-time model download (~440MB).

Terminal 2 (serve frontend):

cd frontend
python -m http.server 3000

Open browser to: http://localhost:3000

Option 2: Frontend Only (Demo Mode)

If the backend is unavailable, the frontend falls back to demo data automatically.

cd frontend
python -m http.server 3000

Open browser to: http://localhost:3000

While loading, the page shows a notice that the backend is offline and demo data is being used. The dashboard then renders with pre-generated synthetic data.


📊 Usage Guide

Dashboard Views

1. Dashboard (Home)

  • Overview KPIs: Sentiment score, volume, NPS estimate, crisis alert level
  • 90-day sentiment trend with forecast
  • Sentiment mix (donut chart)
  • Volume by source (Twitter, Reddit, G2, etc.)
  • Top crisis posts requiring immediate action
  • Recent post feed with filters

2. Trends

  • 7-day vs 30-day sentiment comparison
  • Trend direction (improving/declining/stable)
  • 14-day forecast with confidence bands
  • Anomaly detection (spikes and dips)
  • Daily volume trend
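
A 14-day forecast with confidence bands, as listed above, could be produced with simple exponential smoothing. The sketch below is one plausible reading rather than the project's implementation: the function name `forecast_sentiment`, the smoothing factor `alpha=0.3`, and the band width (1.96 times the spread of one-step-ahead errors) are assumptions.

```python
from statistics import pstdev


def forecast_sentiment(series, alpha=0.3, horizon=14):
    """Simple exponential smoothing with a rough, constant-width band.

    Returns `horizon` tuples of (lower, forecast, upper). Simple exponential
    smoothing has no trend term, so the point forecast is flat at the last
    smoothed level.
    """
    if len(series) < 2:
        raise ValueError("need at least two observations")
    level = series[0]
    residuals = []
    for value in series[1:]:
        residuals.append(value - level)  # one-step-ahead forecast error
        level = alpha * value + (1 - alpha) * level
    half_width = 1.96 * pstdev(residuals)  # crude band, assumes stable errors
    return [(level - half_width, level, level + half_width) for _ in range(horizon)]


# Hypothetical daily sentiment scores.
history = [0.40, 0.42, 0.38, 0.45, 0.41, 0.39, 0.43, 0.44, 0.40, 0.42]
for day, (low, mid, high) in enumerate(forecast_sentiment(history)[:3], start=1):
    print(f"day +{day}: {mid:.3f} ({low:.3f} to {high:.3f})")
```

A higher `alpha` makes the forecast follow recent days more closely; a lower one smooths out short-lived spikes.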

3. Topic Clusters

  • 8 auto-discovered topics with keyword weights
  • Interactive bubble chart (size = post volume)
  • Click to explore: top keywords, sample posts, sentiment distribution

4. Crisis Radar

  • Overall alert level (🟢 Low → 🔴 Critical)
  • Active high-severity posts
  • Signal frequency breakdown (legal, data breach, outrage, etc.)
  • Recommended actions
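
To make the signal breakdown concrete, here is a minimal sketch of multi-signal scoring with engagement amplification. The keyword sets, weights, and the logarithmic amplifier are hypothetical stand-ins; the actual rules in crisis_detector.py are not shown in this README.

```python
import math
import re

# Hypothetical signal definitions: (keywords, weight). Illustration only.
SIGNAL_WEIGHTS = {
    "legal": ({"lawsuit", "lawyer", "sue"}, 4.0),
    "data_breach": ({"breach", "leaked", "hacked"}, 5.0),
    "outrage": ({"scam", "unacceptable", "boycott"}, 3.0),
}


def crisis_score(text, engagement=0):
    """Return (score, matched_signals) for a single post.

    Each matched signal adds its weight once. The total is then scaled by
    1 + log10(1 + engagement), so widely shared posts count for more.
    """
    words = set(re.findall(r"[a-z]+", text.lower()))
    signals = [name for name, (keywords, _) in SIGNAL_WEIGHTS.items() if words & keywords]
    base = sum(SIGNAL_WEIGHTS[name][1] for name in signals)
    amplifier = 1.0 + math.log10(1 + engagement)
    return base * amplifier, signals


quiet = crisis_score("Our data was leaked and we will sue", engagement=0)
viral = crisis_score("Our data was leaked and we will sue", engagement=999)
print(quiet, viral)
```

The logarithm keeps a post with a million shares from drowning out everything else while still ranking it above an identical post nobody saw.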

5. Competitors

  • Sentiment comparison across brands
  • Share of voice (% of corpus mentions)
  • Opportunity intelligence (AI-identified competitive gaps)
  • Switch signal detection
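
Share of voice, read here as the percentage of posts in the corpus that mention each brand, can be computed in a few lines. This is a sketch under that interpretation; the function name `share_of_voice` and the substring matching are assumptions, while the competitor names are the ones used in the sample data.

```python
from collections import Counter

COMPETITORS = ("RivalOne", "CompeteX", "AltStream")


def share_of_voice(posts, brands=COMPETITORS):
    """Percentage of posts mentioning each brand (case-insensitive substring match)."""
    counts = Counter()
    for post in posts:
        lowered = post.lower()
        for brand in brands:
            if brand.lower() in lowered:
                counts[brand] += 1
    total = len(posts)
    return {brand: (100.0 * counts[brand] / total if total else 0.0) for brand in brands}


# Hypothetical posts for illustration.
posts = [
    "Switched from RivalOne last month, no regrets",
    "CompeteX is cheaper but RivalOne has better support",
    "The new dashboard is great",
    "Onboarding docs need work",
]
print(share_of_voice(posts))
```

A post that mentions two brands counts toward both, so the percentages do not need to sum to 100.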

6. Live Analyzer

  • Paste any text for real-time analysis
  • Returns: sentiment label, confidence, crisis score, aspect breakdown
  • Quick example templates

7. Post Feed

  • Full scrollable feed with sentiment labels
  • Filter by positive/negative/neutral/crisis
  • Topic tags and source badges

API Endpoints

# Health check
GET http://localhost:8000/api/health

# Full dashboard data
GET http://localhost:8000/api/dashboard

# Summary metrics only
GET http://localhost:8000/api/summary

# Topic clusters
GET http://localhost:8000/api/topics

# Trend analysis + forecast
GET http://localhost:8000/api/trends

# Crisis scan results
GET http://localhost:8000/api/crisis

# Competitor intelligence
GET http://localhost:8000/api/competitors

# Post feed (with filters)
GET http://localhost:8000/api/posts?limit=50&sentiment=negative&source=Twitter

# Analyze single text
POST http://localhost:8000/api/analyze
Body: {"text": "Your review text here", "include_aspects": true, "include_crisis": true}

# Batch analysis
POST http://localhost:8000/api/batch-analyze
Body: {"texts": ["Review 1", "Review 2", "Review 3"]}
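
For scripted access, the endpoints above can be called with nothing but the Python standard library. The helper names below (`build_analyze_request`, `analyze`, `posts_url`) are hypothetical, and the response is returned as parsed JSON without assuming any particular field names.

```python
import json
import urllib.request
from urllib.parse import urlencode

BASE_URL = "http://localhost:8000"


def build_analyze_request(text, include_aspects=True, include_crisis=True):
    """Build the POST request for /api/analyze using the body shown above."""
    payload = {
        "text": text,
        "include_aspects": include_aspects,
        "include_crisis": include_crisis,
    }
    return urllib.request.Request(
        f"{BASE_URL}/api/analyze",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def analyze(text, timeout=10.0):
    """Send the request and return the decoded JSON (backend must be running)."""
    with urllib.request.urlopen(build_analyze_request(text), timeout=timeout) as response:
        return json.load(response)


def posts_url(limit=50, sentiment=None, source=None):
    """Build the /api/posts URL with the optional filters shown above."""
    params = {"limit": limit}
    if sentiment:
        params["sentiment"] = sentiment
    if source:
        params["source"] = source
    return f"{BASE_URL}/api/posts?{urlencode(params)}"


print(posts_url(limit=50, sentiment="negative", source="Twitter"))
# With the backend running:
# result = analyze("Your review text here")
```

Separating request construction from sending makes it easy to inspect or log the exact payload before anything goes over the network.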

🧪 Sample Data

The platform generates 500 realistic posts on startup:

  • 60% Positive: Praise for features, support, UI
  • 25% Negative: Complaints about performance, pricing, bugs
  • 10% Neutral: Migration stories, feature requests
  • 5% Crisis: Data breaches, outages, legal threats, scams

Sources: Twitter, Reddit, G2, Trustpilot, ProductHunt, AppStore, LinkedIn

Time Range: Last 90 days with recency bias (recent days contain more posts)

Topics Covered:

  • Performance & Speed
  • Customer Support
  • Pricing & Billing
  • UI & Design
  • Features & Integrations
  • Data Quality & Accuracy
  • Onboarding & Documentation
  • Security & Compliance

Competitor Mentions: RivalOne, CompeteX, AltStream appear in ~15% of posts

Crisis Cluster: Injected 7 days ago to simulate a real brand crisis event
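
The 60/25/10/5 category mix described above can be reproduced with weighted sampling. This sketch only draws category labels; the name `sample_categories` and the fixed seed are assumptions, and the real generator in sample_data.py also produces text, sources, and timestamps.

```python
import random
from collections import Counter

# Category mix as stated in the sample data description.
CATEGORY_WEIGHTS = {"positive": 0.60, "negative": 0.25, "neutral": 0.10, "crisis": 0.05}


def sample_categories(n=500, seed=42):
    """Draw n category labels following CATEGORY_WEIGHTS (seeded for repeatability)."""
    rng = random.Random(seed)
    categories = list(CATEGORY_WEIGHTS)
    weights = list(CATEGORY_WEIGHTS.values())
    return rng.choices(categories, weights=weights, k=n)


labels = sample_categories()
print(Counter(labels))
```

Because the labels are sampled rather than assigned by quota, the observed counts will hover around 300/125/50/25 instead of matching them exactly.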


🎨 Design System

The UI uses a dark enterprise SaaS aesthetic inspired by Linear, Vercel, and Notion:

Colors:

  • --bg-void: #080b12 (deep background)
  • --bg-surface: #111827 (card backgrounds)
  • --blue-500: #5b9cf6 (primary accent)
  • --green-500: #10b981 (positive sentiment)
  • --red-500: #ef4444 (negative sentiment / crisis)
  • --amber-500: #f59e0b (warnings / neutral)

Typography:

  • Display (Headings): Syne (bold, modern, slightly geometric)
  • Body (UI Text): Instrument Sans (clean, readable, professional)
  • Monospace (Data): DM Mono (metrics, badges, code)

Layout:

  • Sidebar navigation (240px fixed)
  • Header with search and status indicators
  • Card-based grid system
  • Consistent 16px/20px/24px spacing rhythm

Animations:

  • Staggered fade-in on page load
  • Smooth chart transitions (800ms easing)
  • Hover states with subtle elevation
  • Loading states with branded skeleton screens

🔧 Configuration

Backend Settings

Model Selection (in backend/nlp/sentiment.py):

MODEL_ID = "cardiffnlp/twitter-roberta-base-sentiment-latest"  # Primary model
FALLBACK_MODE = False  # Set True to skip transformer download

Topic Count (in backend/main.py):

modeler = get_modeler(n_topics=8)  # Adjust number of topics

Sample Data Size (in backend/main.py):

_corpus = generate_posts(n=500)  # Generate 500 posts (adjust as needed)

Crisis Detection Thresholds

Edit backend/nlp/crisis_detector.py:

ALERT_LEVELS = {
    (0, 4): ("low", "🟢", "No action required."),
    (4, 8): ("medium", "🟡", "Monitor closely."),
    (8, 15): ("high", "🟠", "Escalate to communications team."),
    (15, 99): ("critical", "🔴", "Activate crisis response immediately."),
}
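
A table keyed by score ranges needs a small lookup to turn a score into a level. The sketch below assumes each range is half-open (lower bound included, upper bound excluded) and that out-of-range scores are clamped to the nearest level; the function name `alert_level` and both of those choices are assumptions, not confirmed behavior of crisis_detector.py.

```python
ALERT_LEVELS = {
    (0, 4): ("low", "🟢", "No action required."),
    (4, 8): ("medium", "🟡", "Monitor closely."),
    (8, 15): ("high", "🟠", "Escalate to communications team."),
    (15, 99): ("critical", "🔴", "Activate crisis response immediately."),
}


def alert_level(score):
    """Map a crisis score to (name, emoji, action).

    Assumes low <= score < high for each range. Scores below the first range
    or at/above the last upper bound are clamped to the nearest level.
    """
    ranges = sorted(ALERT_LEVELS)
    for low, high in ranges:
        if low <= score < high:
            return ALERT_LEVELS[(low, high)]
    if score < ranges[0][0]:
        return ALERT_LEVELS[ranges[0]]
    return ALERT_LEVELS[ranges[-1]]


for s in (0, 4, 7.9, 8, 15, 250):
    name, emoji, action = alert_level(s)
    print(f"{s}: {emoji} {name} ({action})")
```

When adjusting thresholds, keeping the ranges contiguous (each upper bound equal to the next lower bound) avoids scores that match no level.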

📈 Performance Notes

First Run:

  • Model download: ~440MB (RoBERTa weights)
  • Bootstrap time: 15-30 seconds (sentiment + topic modeling + trends)

Subsequent Runs:

  • Model loads from cache: ~3-5 seconds
  • Bootstrap time: 5-10 seconds

Runtime Performance:

  • Sentiment analysis: ~50ms per post (transformer mode)
  • Topic modeling fit: ~2 seconds (500 posts, 8 topics)
  • Trend forecast: <1 second (90-day series)
  • Dashboard payload: ~1 second (full analysis)

Offline Mode:

  • If transformers unavailable: Falls back to VADER (100x faster)
  • If backend offline: Frontend uses demo data (instant load)

🚀 Production Deployment

This is a demo/portfolio project. For production use:

  1. Replace sample data with real data sources:

    • Twitter API / Reddit API / Review aggregators
    • Implement proper data ingestion pipeline
    • Add database (PostgreSQL / MongoDB) for persistence
  2. Fine-tune the sentiment model on your domain:

    • Collect labeled training data from your industry
    • Fine-tune with the Hugging Face Trainer API
    • Deploy custom model endpoint
  3. Add authentication:

    • OAuth 2.0 / JWT tokens
    • User accounts and multi-tenancy
    • API rate limiting
  4. Scale the backend:

    • Containerize with Docker
    • Deploy to AWS/GCP/Azure
    • Add Redis cache for analytics
    • Use Celery for async NLP jobs
  5. Enhance frontend:

    • Add React/Vue for state management
    • Implement WebSocket for real-time updates
    • Add export to PDF/CSV functionality

📁 Project Structure

social-intelligence-platform/
├── backend/
│   ├── main.py                 # FastAPI application
│   ├── requirements.txt        # Python dependencies
│   ├── data/
│   │   ├── __init__.py
│   │   └── sample_data.py      # Synthetic data generator
│   └── nlp/
│       ├── __init__.py
│       ├── sentiment.py        # RoBERTa sentiment pipeline
│       ├── topic_model.py      # NMF topic modeling
│       ├── trend_analysis.py   # Time series forecasting
│       ├── crisis_detector.py  # Crisis scoring engine
│       └── competitor_intel.py # Competitor mention analysis
├── frontend/
│   └── index.html              # Dashboard UI (self-contained)
├── docs/
│   └── CASE_STUDY.md           # Detailed project writeup
└── README.md                   # This file

🎓 Skills Demonstrated

NLP & Machine Learning

  • ✅ BERT/Transformer fine-tuning and inference
  • ✅ Topic modeling (NMF, LDA alternatives)
  • ✅ Time series forecasting (exponential smoothing)
  • ✅ Aspect-based sentiment analysis
  • ✅ Anomaly detection (statistical outliers)
  • ✅ Multi-signal classification (crisis scoring)

Backend Engineering

  • ✅ FastAPI REST API design
  • ✅ Async Python patterns
  • ✅ Model serving and caching
  • ✅ Batch processing pipelines
  • ✅ Error handling and fallbacks

Frontend Development

  • ✅ Modern vanilla JS (no framework bloat)
  • ✅ Chart.js and D3.js visualizations
  • ✅ Responsive CSS Grid layouts
  • ✅ Design system implementation
  • ✅ Performance optimization (lazy loading, debouncing)

Product Thinking

  • ✅ Problem-first approach (not technology-first)
  • ✅ User-centered design (product teams, not ML researchers)
  • ✅ Actionable insights over raw metrics
  • ✅ Crisis prioritization and triage

📧 Questions?

This project demonstrates production-ready NLP engineering, API design, and data visualization skills. Built to solve real product team pain points with modern ML techniques.

Author: [Your Name]
Portfolio: [Your Portfolio URL]
GitHub: [Your GitHub]
LinkedIn: [Your LinkedIn]


📄 License

MIT License. Free to use for educational and portfolio purposes.


Built with: 🐍 Python • ⚡ FastAPI • 🤗 Transformers • 📊 Chart.js • 🎨 Custom CSS