title: Omnichannel Fact & Hallucination Intelligence System
emoji: 🔍
colorFrom: blue
colorTo: purple
sdk: docker
pinned: false
license: mit
app_port: 7860

Omnichannel Fact & Hallucination Intelligence System

Low-latency, real-time fact-checking and AI hallucination detection, delivered through a single browser extension that works across X/Twitter, YouTube, Instagram, news sites, and AI chat interfaces.

Architecture

Browser Extension (WXT + React 19 + Framer Motion)
         │  WebSocket (wss://)
         ▼
FastAPI Backend ──► Redis Stack (cache, 6h/15min TTL)
    │
    ├──► Gatekeeper: Groq llama3-8b-8192 (<120ms p95)
    │         └── noise → drop | fact → continue
    │
    ├──► RAG Pipeline (concurrent)
    │         ├── FastEmbed BGE-M3 embeddings (CPU, multilingual)
    │         ├── Qdrant ANN search (HNSW ef=128, top-8, 72h window)
    │         └── Memgraph trust graph traversal (in-memory Cypher)
    │
    ├──► Grok Sensor (concurrent)
    │         └── X API v2 velocity + Community Notes
    │
    └──► Prefect Flow (multi-agent evaluation)
              ├── misinformation_task: Groq mixtral-8x7b-32768
              └── hallucination_task: Groq-hosted model via LiteLLM (AI platforms only)
                        │
                        ▼
              AnalysisResult → WebSocket → Extension → DOM highlight + hover card
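
The control flow in the diagram (gatekeeper first, then the RAG pipeline and the Grok sensor running concurrently) can be sketched with `asyncio.gather`. This is a minimal illustration, not the project's code: the function names, the digit-based noise heuristic, and the returned fields are all hypothetical stand-ins for the real Groq, Qdrant, Memgraph, and X API calls.

```python
import asyncio
from typing import Optional


async def gatekeeper(text: str) -> str:
    """Hypothetical stand-in for the Groq classifier: 'fact' or 'noise'."""
    await asyncio.sleep(0)  # placeholder for the network round trip
    # Toy heuristic (assumption): text containing a digit is a checkable claim.
    return "fact" if any(ch.isdigit() for ch in text) else "noise"


async def rag_lookup(text: str) -> dict:
    """Hypothetical stand-in for embedding + vector search + graph traversal."""
    await asyncio.sleep(0)
    return {"corroborating_sources": 2, "trust_score": 0.7}


async def grok_sensor(text: str) -> dict:
    """Hypothetical stand-in for the X API velocity / Community Notes lookup."""
    await asyncio.sleep(0)
    return {"velocity": 12, "community_note_active": False}


async def analyze(text: str) -> Optional[dict]:
    # Step 1: drop noise before spending any retrieval or LLM budget.
    if await gatekeeper(text) == "noise":
        return None
    # Step 2: run both evidence branches concurrently, as in the diagram.
    evidence, signals = await asyncio.gather(rag_lookup(text), grok_sensor(text))
    # Step 3: the real system would hand this to the evaluation flow.
    return {"text": text, "evidence": evidence, "signals": signals}
```

Calling `asyncio.run(analyze("lol nice"))` returns `None` under the toy heuristic, while a claim containing a number proceeds to both lookups. Gathering the two branches means the slower of the two, not their sum, bounds the retrieval latency.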

Stack

Layer	Technology	Why
Extension framework	WXT v0.19 + React 19	HMR, multi-browser, TypeScript-first, Vite
Extension state	Zustand + chrome.storage.sync	Persistent, reactive, cross-context
LLM gatekeeper	Groq llama3-8b-8192	800+ tok/s, <120 ms p95, no GPU needed
LLM evaluation	LiteLLM → Groq mixtral-8x7b / llama3-70b	All free via Groq — swap providers without code changes
Embeddings	BGE-M3 via FastEmbed	100+ languages, 1024-dim, CPU-native, free
Vector DB	Qdrant (self-hosted)	Sub-ms HNSW search, no vendor lock-in
Graph DB	Memgraph (in-memory)	10–100x faster than Neo4j for trust scoring
Message queue	Redpanda	Kafka-compatible, no JVM, 10x lower latency
Orchestration	Prefect	Native async, DAG flows, built-in retry
Cache	Redis Stack (RedisJSON)	Structured claim cache, TTL per verdict color
Package manager	uv	10–100x faster than pip, lockfiles
Hashing	xxhash (client + server)	Sub-microsecond content deduplication
Edge tunnel	Cloudflare Tunnel	Zero-config TLS, no exposed ports
Observability	structlog + rich	Structured JSON logs, colorized dev output
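
The "TTL per verdict color" idea in the cache row can be sketched as a lookup table. The two durations (6 h and 15 min) come from the architecture diagram, but which color receives which TTL is not stated in this README. The mapping below is an assumption (settled verdicts cached longer, unverified ones shorter), and the key format is hypothetical.

```python
import hashlib

SIX_HOURS = 6 * 60 * 60
FIFTEEN_MINUTES = 15 * 60

# Assumption: settled verdicts live longer than unverified ones.
TTL_BY_COLOR = {
    "green": SIX_HOURS,
    "red": SIX_HOURS,
    "purple": SIX_HOURS,
    "yellow": FIFTEEN_MINUTES,
}


def cache_key(claim_text: str) -> str:
    """Hypothetical key format: 'claim:' plus a short content digest."""
    digest = hashlib.sha256(claim_text.encode("utf-8")).hexdigest()[:16]
    return f"claim:{digest}"


def cache_entry(claim_text: str, color: str) -> tuple:
    """Return the (key, ttl_seconds) pair to use when storing a verdict."""
    return cache_key(claim_text), TTL_BY_COLOR[color]
```

With a Redis client, the returned TTL would typically be applied to the stored key via an expiry call, so that a yellow verdict is re-evaluated soon after new evidence may have arrived.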

Quick Start (HuggingFace Spaces)

This Space runs the backend + demo UI via Docker. The browser extension is a separate build.

Required Secrets (set in Space settings → Secrets)

Secret	Required	Description
`GROQ_API_KEY`	Recommended	Groq API key — powers all 3 LLM agents (gatekeeper, misinformation, hallucination). Free tier: 30 req/min
`X_BEARER_TOKEN`	Optional	X API v2 bearer token for tweet velocity + Community Notes

Without any API keys, the system runs with DEMO_MODE=true and returns deterministic mock results, which is enough to explore the UI and architecture without credentials.
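
One way such a fallback could work is sketched below. It is an illustration only: the rule that a missing `GROQ_API_KEY` implies demo mode, and the hash-based mock verdict, are assumptions rather than the backend's actual logic.

```python
import hashlib
import os

COLORS = ("green", "yellow", "red", "purple")


def demo_mode_enabled(env=os.environ) -> bool:
    """Demo mode is on when requested explicitly or when no Groq key is set."""
    if env.get("DEMO_MODE", "").lower() in ("1", "true", "yes"):
        return True
    return not env.get("GROQ_API_KEY")


def mock_verdict(text: str) -> str:
    """Deterministic mock: the same text always maps to the same color."""
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    return COLORS[digest[0] % len(COLORS)]
```

Deriving the mock color from a content digest rather than a random number keeps the demo UI stable across reloads, which is what "deterministic" implies here.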

Get a free key:

Groq: https://console.groq.com (free tier: 30 req/min — covers all 3 LLM agents)

Run Locally

git clone <repo>
cd omnichannel-fact-intelligence

# Copy env template
cp .env.example .env
# Edit .env with your API keys

# Start all services (Qdrant, Memgraph, Redpanda, Redis, FastAPI)
docker compose up

# Visit http://localhost:7860 for the demo UI

Run Backend Only (no Docker for infra)

cd backend

# Install uv (if not installed)
curl -LsSf https://astral.sh/uv/install.sh | sh

# Install dependencies
uv sync

# Set env vars
export GROQ_API_KEY=your_key
export DEMO_MODE=true  # Skip infrastructure deps for quick testing

# Start FastAPI
uv run uvicorn main:app --host 0.0.0.0 --port 7860 --reload

Browser Extension Setup

Prerequisites

cd extension
npm install  # or: bun install

Development (Chrome)

# Set your backend URL (or use cloudflared tunnel)
WS_URL=ws://localhost:7860/ws npx wxt dev --browser chrome

Production Build

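# Build for Chrome (WXT's default target)
WS_URL=wss://fact-engine.your-domain.com/ws npx wxt build

# Build for Firefox (each browser is a separate build)
WS_URL=wss://fact-engine.your-domain.com/ws npx wxt build --browser firefox

# Chrome:  .output/chrome-mv3/
# Firefox: .output/firefox-mv2/ (WXT defaults Firefox to MV2; add --mv3 for .output/firefox-mv3/)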

Load in Chrome

1. Navigate to chrome://extensions.
2. Enable Developer mode (top right).
3. Click Load unpacked and select .output/chrome-mv3/.
4. Visit X/Twitter, YouTube, or any news site. Factual claims are highlighted as they are evaluated.

Highlight Color Semantics

Color	Hex	Meaning
🟢 Green	`#22c55e`	Fact-checked — corroborated by ≥2 sources, trust score ≥ 0.65
🟡 Yellow	`#eab308`	Unverified — breaking news, weak corroboration, high velocity
🔴 Red	`#ef4444`	Debunked — refuted by ≥2 independent sources or Community Note active
🟣 Purple	`#a855f7`	AI hallucination — fabricated citation, impossibility, contradiction
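
The table above can be read as a decision procedure. The sketch below encodes the stated thresholds. The precedence order (purple, then red, then green, with yellow as the fallback) and the `Evidence` field names are assumptions, since the README does not say how conflicting signals are resolved.

```python
from dataclasses import dataclass


@dataclass
class Evidence:
    # Hypothetical field names, chosen to mirror the table's wording.
    corroborating_sources: int = 0
    refuting_sources: int = 0
    trust_score: float = 0.5
    community_note_active: bool = False
    hallucination_flag: bool = False


def verdict_color(e: Evidence) -> str:
    """Map evidence to a highlight color using the table's thresholds."""
    if e.hallucination_flag:
        return "purple"
    if e.refuting_sources >= 2 or e.community_note_active:
        return "red"
    if e.corroborating_sources >= 2 and e.trust_score >= 0.65:
        return "green"
    # Everything else is treated as unverified.
    return "yellow"
```

Checking red before green means an active Community Note overrides corroboration, which matches the heavy penalty such a note carries in the trust score.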

Trust Score Algorithm

score = 0.5 (baseline)
+ 0.30  if Author.verified AND account_type IN ['government', 'official_news']
+ 0.05  per corroborating Source node (capped at +0.25, i.e. 5 sources)
- 0.40  if any Source has an active Community Note
= clamp(score, 0.0, 1.0)
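
The scoring rule above translates directly into a small function. This is a sketch with hypothetical parameter names. In the real system these inputs would presumably come from the Memgraph traversal.

```python
def trust_score(
    verified: bool,
    account_type: str,
    corroborating_sources: int,
    has_community_note: bool,
) -> float:
    """Compute the trust score from the four inputs named in the formula."""
    score = 0.5  # baseline
    if verified and account_type in ("government", "official_news"):
        score += 0.30
    # +0.05 per corroborating source, capped at +0.25 (five sources).
    score += min(0.05 * corroborating_sources, 0.25)
    if has_community_note:
        score -= 0.40
    # Clamp to [0.0, 1.0]; the maximum raw value is 1.05.
    return max(0.0, min(1.0, score))
```

As a worked case, a verified government author with three corroborating sources and no Community Note scores 0.5 + 0.30 + 0.15 = 0.95. The same claim with an active note drops to 0.55, below the 0.65 threshold used for green highlights.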

Data Pipeline

Three async Redpanda producers simulate the omnichannel firehose:

Producer	Topic	Rate	Source
twitter_producer	`raw.twitter`	50 eps	Mock X posts
instagram_producer	`raw.instagram`	20 eps	Mock story text (OCR-extracted)
youtube_producer	`raw.youtube`	10 eps	Mock VTT transcript chunks

A single async consumer aggregates all three, deduplicates by content_hash, and upserts into Qdrant + Memgraph.
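
The deduplication step can be illustrated without any broker. In the sketch below, `hashlib.blake2b` stands in for xxhash so the example needs only the standard library, and the whitespace and case normalization is an assumption about what counts as "the same content".

```python
import hashlib
from typing import Iterable, Iterator


def content_hash(text: str) -> str:
    """Hash normalized text. blake2b is a stdlib stand-in for xxhash here."""
    normalized = " ".join(text.lower().split())
    return hashlib.blake2b(normalized.encode("utf-8"), digest_size=8).hexdigest()


def dedupe(events: Iterable[dict]) -> Iterator[dict]:
    """Yield each distinct piece of content once, tagged with its hash."""
    seen = set()
    for event in events:
        h = content_hash(event["text"])
        if h in seen:
            continue  # same content already seen on another channel
        seen.add(h)
        yield {**event, "content_hash": h}
```

For example, feeding in one `raw.twitter` event and one `raw.instagram` event whose text differs only in spacing and case yields a single output record. A long-running consumer would need to bound the `seen` set, for instance with a TTL, which this sketch omits.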

Extension Modes

Mode	Shows
Minimal	Red + Purple only
Normal (default)	Red + Purple + Yellow
Advanced	All colors including Green
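
The mode table amounts to a set-membership check. The logic is shown here in Python for consistency with the backend examples, although the extension itself is written in TypeScript. Falling back to Normal for an unrecognized mode is an assumption based on Normal being the default.

```python
VISIBLE_COLORS = {
    "minimal": {"red", "purple"},
    "normal": {"red", "purple", "yellow"},
    "advanced": {"red", "purple", "yellow", "green"},
}


def should_highlight(mode: str, color: str) -> bool:
    """Return True if a verdict of this color is shown in the given mode."""
    # Assumption: unknown modes behave like the default, Normal.
    visible = VISIBLE_COLORS.get(mode, VISIBLE_COLORS["normal"])
    return color in visible
```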

File Structure

omnichannel-fact-intelligence/
├── docker-compose.yml              # All services in one command
├── .env.example                    # Environment template
│
├── backend/
│   ├── Dockerfile                  # uv + Python 3.12
│   ├── pyproject.toml              # All deps pinned (uv-compatible)
│   ├── main.py                     # FastAPI app, WebSocket, Redis cache
│   ├── gatekeeper.py               # Groq fact/noise classifier (<120ms p95)
│   ├── rag_pipeline.py             # BGE-M3 + Qdrant + Memgraph trust graph
│   ├── grok_sensor.py              # X API v2 + Community Notes
│   ├── agents.py                   # Prefect flow + LiteLLM multi-agent eval
│   ├── core/
│   │   ├── config.py               # Pydantic-settings centralized config
│   │   └── models.py               # All Pydantic v2 models
│   ├── producers/
│   │   └── producers.py            # Twitter + Instagram + YouTube + consumer
│   └── static/
│       └── index.html              # Demo UI (served at /)
│
├── extension/
│   ├── wxt.config.ts               # WXT framework config
│   ├── stores/
│   │   └── extensionStore.ts       # Zustand + chrome.storage.sync
│   └── entrypoints/
│       ├── background.ts           # Persistent WS connection + message routing
│       ├── content.tsx             # MutationObserver + highlight + hover card
│       └── popup.tsx               # Master toggle + mode selector + badge
│
└── infra/
    └── tunnel_setup.sh             # Cloudflare Tunnel setup script

License

MIT — see LICENSE for details.