title: Omnichannel Fact & Hallucination Intelligence System
emoji: 🔍
colorFrom: blue
colorTo: purple
sdk: docker
pinned: false
license: mit
app_port: 7860

Omnichannel Fact & Hallucination Intelligence System

Low-latency, real-time fact-checking and AI hallucination detection, delivered through a browser extension that works across X/Twitter, YouTube, Instagram, news sites, and AI chat interfaces.


Architecture

Browser Extension (WXT + React 19 + Framer Motion)
         │  WebSocket (wss://)
         ▼
FastAPI Backend ──► Redis Stack (cache, 6h/15min TTL)
    │
    ├──► Gatekeeper: Groq llama3-8b-8192 (<120ms p95)
    │         └── noise → drop | fact → continue
    │
    ├──► RAG Pipeline (concurrent)
    │         ├── FastEmbed BGE-M3 embeddings (CPU, multilingual)
    │         ├── Qdrant ANN search (HNSW ef=128, top-8, 72h window)
    │         └── Memgraph trust graph traversal (in-memory Cypher)
    │
    ├──► Grok Sensor (concurrent)
    │         └── X API v2 velocity + Community Notes
    │
    └──► Prefect Flow (multi-agent evaluation)
              ├── misinformation_task: Groq mixtral-8x7b-32768
              └── hallucination_task: Claude Haiku (AI platforms only)
                        │
                        ▼
              AnalysisResult → WebSocket → Extension → DOM highlight + hover card
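
The control flow in the diagram (gate first, then run retrieval and the sensor concurrently, then evaluate) can be sketched with asyncio. This is a minimal illustration and not the project's actual code: every function body is a hypothetical placeholder, the digit-based gatekeeper rule is a toy stand-in for the LLM classifier, and the fields of `AnalysisResult` are assumed.

```python
import asyncio
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class AnalysisResult:
    # Field names are assumptions made for this sketch.
    text: str
    evidence: List[str]
    velocity: float
    verdict: str


async def gatekeeper(text: str) -> bool:
    """Toy stand-in for the LLM classifier: True = fact, False = noise."""
    return any(ch.isdigit() for ch in text)  # NOT the real rule


async def rag_pipeline(text: str) -> List[str]:
    """Placeholder for embedding, vector search, and trust-graph traversal."""
    await asyncio.sleep(0.01)  # simulate I/O
    return ["source-a", "source-b"]


async def grok_sensor(text: str) -> float:
    """Placeholder for the velocity / Community Notes lookup."""
    await asyncio.sleep(0.01)  # simulate I/O
    return 0.2


async def evaluate(text: str, evidence: List[str], velocity: float) -> str:
    """Placeholder for the multi-agent evaluation step."""
    return "green" if len(evidence) >= 2 else "yellow"


async def analyze(text: str) -> Optional[AnalysisResult]:
    # Gatekeeper runs first so that noise is dropped before any costly work.
    if not await gatekeeper(text):
        return None
    # Retrieval and the sensor do not depend on each other, so run them together.
    evidence, velocity = await asyncio.gather(rag_pipeline(text), grok_sensor(text))
    verdict = await evaluate(text, evidence, velocity)
    return AnalysisResult(text, evidence, velocity, verdict)
```

Calling `asyncio.run(analyze("lol"))` returns `None` because the toy gatekeeper drops it, while a string containing a number passes through all stages. Running the two lookups with `asyncio.gather` means the stage takes roughly as long as the slower of the two rather than their sum.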

Stack

| Layer | Technology | Why |
|---|---|---|
| Extension framework | WXT v0.19 + React 19 | HMR, multi-browser, TypeScript-first, Vite |
| Extension state | Zustand + chrome.storage.sync | Persistent, reactive, cross-context |
| LLM gatekeeper | Groq llama3-8b-8192 | 800+ tok/s, <100ms, no GPU needed |
| LLM evaluation | LiteLLM → Groq mixtral-8x7b / llama3-70b | All free via Groq; swap providers without code changes |
| Embeddings | BGE-M3 via FastEmbed | 100+ languages, 1024-dim, CPU-native, free |
| Vector DB | Qdrant (self-hosted) | Sub-ms HNSW search, no vendor lock-in |
| Graph DB | Memgraph (in-memory) | 10–100x faster than Neo4j for trust scoring |
| Message queue | Redpanda | Kafka-compatible, no JVM, 10x lower latency |
| Orchestration | Prefect | Native async, DAG flows, built-in retry |
| Cache | Redis Stack (RedisJSON) | Structured claim cache, TTL per verdict color |
| Package manager | uv | 10–100x faster than pip, lockfiles |
| Hashing | xxhash (client + server) | Sub-microsecond content deduplication |
| Edge tunnel | Cloudflare Tunnel | Zero-config TLS, no exposed ports |
| Observability | structlog + rich | Structured JSON logs, colorized dev output |

Quick Start (HuggingFace Spaces)

This Space runs the backend + demo UI via Docker. The browser extension is a separate build.

Required Secrets (set in Space settings → Secrets)

| Secret | Required | Description |
|---|---|---|
| GROQ_API_KEY | Recommended | Groq API key. Powers all 3 LLM agents (gatekeeper, misinformation, hallucination). Free tier: 30 req/min |
| X_BEARER_TOKEN | Optional | X API v2 bearer token for tweet velocity + Community Notes |

Without any API keys: the system runs in DEMO_MODE=true and returns deterministic mock results, which is enough to explore the UI and architecture without credentials.

Get a free key:

Run Locally

git clone <repo>
cd omnichannel-fact-intelligence

# Copy env template
cp .env.example .env
# Edit .env with your API keys

# Start all services (Qdrant, Memgraph, Redpanda, Redis, FastAPI)
docker compose up

# Visit http://localhost:7860 for the demo UI

Run Backend Only (no Docker for infra)

cd backend

# Install uv (if not installed)
curl -LsSf https://astral.sh/uv/install.sh | sh

# Install dependencies
uv sync

# Set env vars
export GROQ_API_KEY=your_key
export DEMO_MODE=true  # Skip infrastructure deps for quick testing

# Start FastAPI
uv run uvicorn main:app --host 0.0.0.0 --port 7860 --reload

Browser Extension Setup

Prerequisites

cd extension
npm install  # or: bun install

Development (Chrome)

# Set your backend URL (or use cloudflared tunnel)
WS_URL=ws://localhost:7860/ws npx wxt dev --browser chrome

Production Build

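# Build for Chrome (WXT's default target)
WS_URL=wss://fact-engine.your-domain.com/ws npx wxt build
# Build for Firefox (separate invocation)
WS_URL=wss://fact-engine.your-domain.com/ws npx wxt build -b firefox

# Chrome: .output/chrome-mv3/
# Firefox: .output/firefox-mv2/ (WXT's default for Firefox; add --mv3 to get .output/firefox-mv3/)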

Load in Chrome

  1. Navigate to chrome://extensions
  2. Enable Developer mode (top right)
  3. Click Load unpacked → select .output/chrome-mv3/
  4. Visit X/Twitter, YouTube, or any news site; detected claims will begin highlighting

Highlight Color Semantics

| Color | Hex | Meaning |
|---|---|---|
| 🟢 Green | #22c55e | Fact-checked: corroborated by ≥2 sources, trust score ≥ 0.65 |
| 🟡 Yellow | #eab308 | Unverified: breaking news, weak corroboration, high velocity |
| 🔴 Red | #ef4444 | Debunked: refuted by ≥2 independent sources or Community Note active |
| 🟣 Purple | #a855f7 | AI hallucination: fabricated citation, impossibility, contradiction |
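
As a rough sketch, the thresholds in the color table could be turned into a single decision function like the one below. The `Evidence` dataclass, its field names, and the precedence order (purple, then red, then green, with yellow as the fallback) are assumptions made for illustration; the table does not say which rule wins when several apply.

```python
from dataclasses import dataclass

# Hex values taken from the color table.
HEX = {
    "green": "#22c55e",
    "yellow": "#eab308",
    "red": "#ef4444",
    "purple": "#a855f7",
}


@dataclass
class Evidence:
    # Hypothetical container; field names are not from the project.
    corroborating_sources: int = 0
    refuting_sources: int = 0
    trust_score: float = 0.5
    community_note_active: bool = False
    hallucination_detected: bool = False


def pick_color(ev: Evidence) -> str:
    """Map evidence to a highlight color (precedence order is an assumption)."""
    if ev.hallucination_detected:
        return "purple"
    if ev.refuting_sources >= 2 or ev.community_note_active:
        return "red"
    if ev.corroborating_sources >= 2 and ev.trust_score >= 0.65:
        return "green"
    # Anything not clearly confirmed or refuted stays unverified.
    return "yellow"
```

For example, `pick_color(Evidence(corroborating_sources=3, trust_score=0.7))` yields `"green"`, while the same evidence with `community_note_active=True` yields `"red"` under the assumed precedence.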

Trust Score Algorithm

score = 0.5 (baseline)
+ 0.30  if Author.verified AND account_type IN ['government', 'official_news']
+ 0.05  per corroborating Source node (capped at +0.25, i.e. 5 sources)
- 0.40  if any Source has an active Community Note
= clamp(score, 0.0, 1.0)
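
The scoring rule above translates almost line for line into Python. The following is a minimal sketch of that arithmetic only; the function name and parameter names are hypothetical, and in the real system the inputs would presumably come from the trust graph rather than from plain arguments.

```python
TRUSTED_ACCOUNT_TYPES = {"government", "official_news"}


def trust_score(
    author_verified: bool,
    account_type: str,
    corroborating_sources: int,
    has_active_community_note: bool,
) -> float:
    """Compute the trust score using the constants listed in the algorithm."""
    score = 0.5  # baseline
    if author_verified and account_type in TRUSTED_ACCOUNT_TYPES:
        score += 0.30
    # +0.05 per corroborating source, capped at +0.25 (five sources).
    score += min(0.05 * max(corroborating_sources, 0), 0.25)
    if has_active_community_note:
        score -= 0.40
    # Clamp to the [0.0, 1.0] range.
    return max(0.0, min(1.0, score))
```

As a worked case, a verified government account with two corroborating sources and an active Community Note gets 0.5 + 0.30 + 0.10 - 0.40, which is about 0.5 (up to floating-point rounding). A verified official account with five or more sources exceeds 1.0 before clamping and is therefore capped at 1.0.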

Data Pipeline

Three async Redpanda producers simulate the omnichannel firehose:

| Producer | Topic | Rate | Source |
|---|---|---|---|
| twitter_producer | raw.twitter | 50 events/s | Mock X posts |
| instagram_producer | raw.instagram | 20 events/s | Mock story text (OCR-extracted) |
| youtube_producer | raw.youtube | 10 events/s | Mock VTT transcript chunks |

A single async consumer aggregates all three, deduplicates by content_hash, and upserts into Qdrant + Memgraph.
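
The deduplication step can be illustrated with a small standard-library sketch. Note the substitutions: `hashlib.blake2b` stands in for xxhash so the example has no third-party dependencies, the whitespace and case normalization is an assumption, and the upsert into Qdrant and Memgraph is left out entirely. All names here are hypothetical.

```python
import hashlib
from typing import Iterable, Iterator, Set, Tuple


def content_hash(text: str) -> str:
    """Hash normalized text; blake2b is a stand-in for xxhash in this sketch."""
    # Assumed normalization: lowercase and collapse runs of whitespace.
    normalized = " ".join(text.lower().split())
    return hashlib.blake2b(normalized.encode("utf-8"), digest_size=8).hexdigest()


def deduplicate(events: Iterable[Tuple[str, str]]) -> Iterator[Tuple[str, str, str]]:
    """Yield (topic, content_hash, text) for the first occurrence of each text.

    `events` is an iterable of (topic, text) pairs merged from all topics.
    """
    seen: Set[str] = set()
    for topic, text in events:
        digest = content_hash(text)
        if digest in seen:
            continue  # same content already ingested, possibly from another topic
        seen.add(digest)
        yield topic, digest, text
```

Because the hash is computed from content alone, the same claim arriving on `raw.twitter` and `raw.youtube` is ingested once. The in-memory `seen` set grows without bound, so a long-running consumer would need an expiry policy or an external store instead.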


Extension Modes

| Mode | Shows |
|---|---|
| Minimal | Red + Purple only |
| Normal (default) | Red + Purple + Yellow |
| Advanced | All colors including Green |
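
The mode table amounts to a lookup from mode to the set of visible colors. The sketch below expresses that mapping in Python purely as an illustration; the extension itself is written in TypeScript, the names are hypothetical, and falling back to the default mode for an unknown value is an assumption.

```python
VISIBLE_COLORS = {
    "minimal": {"red", "purple"},
    "normal": {"red", "purple", "yellow"},
    "advanced": {"red", "purple", "yellow", "green"},
}
DEFAULT_MODE = "normal"


def should_highlight(color: str, mode: str = DEFAULT_MODE) -> bool:
    """Return True if a verdict of this color is shown in the given mode."""
    # Unknown modes fall back to the default (an assumption, not from the source).
    visible = VISIBLE_COLORS.get(mode, VISIBLE_COLORS[DEFAULT_MODE])
    return color in visible
```

With this mapping, a green verdict is only rendered in Advanced mode, and red and purple verdicts are rendered in every mode.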

File Structure

omnichannel-fact-intelligence/
├── docker-compose.yml              # All services in one command
├── .env.example                    # Environment template
│
├── backend/
│   ├── Dockerfile                  # uv + Python 3.12
│   ├── pyproject.toml              # All deps pinned (uv-compatible)
│   ├── main.py                     # FastAPI app, WebSocket, Redis cache
│   ├── gatekeeper.py               # Groq fact/noise classifier (<120ms p95)
│   ├── rag_pipeline.py             # BGE-M3 + Qdrant + Memgraph trust graph
│   ├── grok_sensor.py              # X API v2 + Community Notes
│   ├── agents.py                   # Prefect flow + LiteLLM multi-agent eval
│   ├── core/
│   │   ├── config.py               # Pydantic-settings centralized config
│   │   └── models.py               # All Pydantic v2 models
│   ├── producers/
│   │   └── producers.py            # Twitter + Instagram + YouTube + consumer
│   └── static/
│       └── index.html              # Demo UI (served at /)
│
├── extension/
│   ├── wxt.config.ts               # WXT framework config
│   ├── stores/
│   │   └── extensionStore.ts       # Zustand + chrome.storage.sync
│   └── entrypoints/
│       ├── background.ts           # Persistent WS connection + message routing
│       ├── content.tsx             # MutationObserver + highlight + hover card
│       └── popup.tsx               # Master toggle + mode selector + badge
│
└── infra/
    └── tunnel_setup.sh             # Cloudflare Tunnel setup script

License

MIT. See LICENSE for details.