title: RetailMind
emoji: 🛍️
colorFrom: blue
colorTo: indigo
sdk: gradio
sdk_version: 5.13.0
python_version: '3.10'
hf_transfer: true
app_file: app.py
pinned: false
allow_api: false
RetailMind
Self-Healing LLM for Store Intelligence
An autonomous e-commerce AI that detects semantic drift in user intent and self-heals its own prompt in real time, with no human in the loop.
What This Demonstrates
| Skill | Implementation |
|---|---|
| MLOps / Observability | Real-time EWMA drift detection with live telemetry chart |
| RAG / Retrieval | Hybrid: metadata pre-filter (price, category) + dense semantic re-ranking |
| Prompt Engineering | Anti-hallucination grounding; dynamic system prompt injection on drift |
| Self-Healing Systems | Autonomous prompt rewriting when intent distribution shifts, with zero human intervention |
| LLM Integration | HF Inference API (Qwen2.5-72B) for fast, grounded product recommendations |
| Software Engineering | Type hints, logging, pytest suite, CI/CD, modular architecture |
Architecture
graph LR
A["User Query"] --> B["Drift Detector<br/><i>EWMA Semantic Analysis</i>"]
A --> C["Hybrid Retriever<br/><i>Price Filter + Dense Search</i>"]
B --> D["Self-Healing Adapter<br/><i>Dynamic Prompt Mutation</i>"]
C --> E["LLM<br/><i>Qwen2.5-72B via HF API</i>"]
D --> E
E --> F["Grounded Response"]
B --> G["Telemetry Dashboard<br/><i>Live EWMA Charts</i>"]
RetailMind/
├── app.py                      # Gradio UI, 3-panel dashboard
├── modules/
│   ├── shared.py               # Shared SentenceTransformer singleton
│   ├── data_simulation.py      # Curated product catalog with rich metadata
│   ├── retrieval.py            # Hybrid retriever (price filter → semantic re-rank)
│   ├── drift.py                # EWMA-based semantic drift detector
│   ├── adaptation.py           # Self-healing prompt adapter
│   └── llm.py                  # HF Inference API client
├── tests/                      # pytest suite
├── .github/workflows/ci.yml    # CI pipeline (Python 3.10–3.12)
└── requirements.txt
How the Self-Healing Loop Works
The system monitors semantic similarity between incoming queries and concept anchors using an Exponentially Weighted Moving Average (EWMA). When a concept's EWMA score crosses a threshold, the system rewrites its own instructions instantly and autonomously.
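The EWMA update described above can be sketched in a few lines. This is a minimal illustration rather than the project's `drift.py`: the class name `EwmaDriftDetector`, the `threshold` value of 0.5, and the two-dimensional toy vectors are assumptions made for the example. Only the smoothing factor 0.35 comes from this README. A real detector would receive sentence embeddings instead of hand-written vectors.

```python
import math
from dataclasses import dataclass, field


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


@dataclass
class EwmaDriftDetector:
    """Hypothetical detector: one EWMA score per concept anchor."""

    anchors: dict[str, list[float]]          # concept name -> anchor embedding
    alpha: float = 0.35                      # smoothing factor quoted in this README
    threshold: float = 0.5                   # assumed trigger level, not from the source
    scores: dict[str, float] = field(default_factory=dict)

    def update(self, query_embedding: list[float]) -> list[str]:
        """Fold one query into every concept's EWMA and return concepts above threshold."""
        active = []
        for name, anchor in self.anchors.items():
            similarity = cosine(query_embedding, anchor)
            previous = self.scores.get(name, 0.0)
            # EWMA: new = alpha * observation + (1 - alpha) * previous
            self.scores[name] = self.alpha * similarity + (1 - self.alpha) * previous
            if self.scores[name] > self.threshold:
                active.append(name)
        return active


# Toy usage: two consecutive queries that point straight at the "eco" anchor.
detector = EwmaDriftDetector(anchors={"eco": [1.0, 0.0], "price": [0.0, 1.0]})
first = detector.update([1.0, 0.0])    # eco score 0.35, still below threshold
second = detector.update([1.0, 0.0])   # eco score 0.5775, now above threshold
```

With these toy settings a single on-topic query is not enough to trigger an adaptation, which is the smoothing behaviour the EWMA is meant to provide.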
| Concept | Example Triggers | What Changes |
|---|---|---|
| 💰 Price Sensitive | "cheapest", "under $30", "budget" | Prioritise lowest-price items, highlight savings |
| ☀️ Summer Shift | "beach", "UV", "hot weather" | Surface breathable/outdoor products |
| 🌿 Eco Trend | "sustainable", "recycled", "organic" | Lead with eco-credentials and certifications |
Key insight: Matching is semantic, not keyword-based. "I care about the planet" triggers the eco adaptation even though it contains no eco keywords, because it's semantically close to the concept anchor embedding.
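Once a concept is flagged, the "What Changes" column can be read as rule text appended to the system prompt. The sketch below shows one way to do that. The names `BASE_PROMPT`, `ADAPTATION_RULES`, and `build_system_prompt`, the concept keys, and the prompt wording are all assumptions for illustration; the project's `adaptation.py` may be organised differently.

```python
# Assumed base instructions; the real system prompt is not shown in this README.
BASE_PROMPT = (
    "You are a retail assistant. Recommend only products from the provided catalog context."
)

# Hypothetical rule text per concept, paraphrasing the "What Changes" column.
ADAPTATION_RULES = {
    "price_sensitive": "Prioritise the lowest-priced items and highlight savings.",
    "summer_shift": "Surface breathable and outdoor products.",
    "eco_trend": "Lead with eco-credentials and certifications.",
}


def build_system_prompt(active_concepts: list[str]) -> str:
    """Return the base prompt plus one rule line per active drift concept."""
    rules = [ADAPTATION_RULES[c] for c in active_concepts if c in ADAPTATION_RULES]
    if not rules:
        return BASE_PROMPT  # no drift detected: balanced mode
    rule_lines = "\n".join(f"- {rule}" for rule in rules)
    return f"{BASE_PROMPT}\n\nActive adaptations:\n{rule_lines}"


adapted = build_system_prompt(["eco_trend"])
restored = build_system_prompt([])   # dropping the concept restores the base prompt
```

Because the prompt is rebuilt on every request, an adaptation disappears as soon as its concept is no longer active.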
Hybrid Retrieval
Pure semantic search fails on structured queries like "bags under $25": a $200 bag and a $20 bag may be equally relevant semantically. RetailMind solves this with a two-stage pipeline (pre-filter, then re-rank):
- NLU extraction: regex parses price ceilings ("under $50", "budget of $30", "cheapest")
- Category detection: maps query terms to catalog categories
- Pre-filter: removes violating products before any embedding work
- Semantic re-rank: cosine similarity on all-MiniLM-L6-v2 embeddings ranks survivors
# "eco-friendly bag under $30"
# → price_cap=30, category="eco-friendly"
# → 68 products → 6 candidates → top 4 by semantic similarity
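The four steps above can be sketched end to end in plain Python. Everything here is illustrative and assumed: the four-item `CATALOG`, the `vec` fields standing in for all-MiniLM-L6-v2 embeddings, the `CATEGORY_TERMS` map, the regex, and the function names. The price cap is treated as inclusive, and the "cheapest" trigger mentioned above is not handled in this sketch.

```python
import math
import re
from typing import Optional

# Hypothetical mini-catalog; "vec" stands in for a real sentence embedding.
CATALOG = [
    {"name": "Recycled tote", "price": 22.0, "category": "bags", "vec": [0.9, 0.1]},
    {"name": "Leather satchel", "price": 200.0, "category": "bags", "vec": [0.8, 0.2]},
    {"name": "Canvas backpack", "price": 24.0, "category": "bags", "vec": [0.3, 0.7]},
    {"name": "Bamboo bottle", "price": 18.0, "category": "drinkware", "vec": [0.7, 0.3]},
]

# Assumed query-term -> catalog-category map.
CATEGORY_TERMS = {"bag": "bags", "bottle": "drinkware"}

# Matches phrases such as "under $50" or "budget of $30".
PRICE_RE = re.compile(r"(?:under|below|budget of)\s*\$?(\d+(?:\.\d+)?)", re.IGNORECASE)


def extract_price_cap(query: str) -> Optional[float]:
    """Step 1: pull a price ceiling out of the query, if one is stated."""
    match = PRICE_RE.search(query)
    return float(match.group(1)) if match else None


def detect_category(query: str) -> Optional[str]:
    """Step 2: map the first matching query term to a catalog category."""
    lowered = query.lower()
    for term, category in CATEGORY_TERMS.items():
        if term in lowered:
            return category
    return None


def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def hybrid_search(query: str, query_vec: list[float], top_k: int = 4) -> list[dict]:
    """Steps 3 and 4: drop products violating hard constraints, then rank the rest."""
    cap = extract_price_cap(query)
    category = detect_category(query)
    survivors = [
        p for p in CATALOG
        if (cap is None or p["price"] <= cap)            # inclusive cap (assumption)
        and (category is None or p["category"] == category)
    ]
    survivors.sort(key=lambda p: cosine(query_vec, p["vec"]), reverse=True)
    return survivors[:top_k]


results = hybrid_search("bags under $25", query_vec=[1.0, 0.0])
names = [p["name"] for p in results]   # the $200 satchel never reaches the ranking step
```

Filtering before ranking means the similarity scores only ever compare products that already satisfy the hard constraints.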
Demo Walkthrough
Run through the four scenario phases in order:
- Phase 1 (Normal): General product questions. System responds in balanced mode.
- Phase 2 (Black Friday): Budget queries. Watch the gold drift line spike above the threshold. Price-prioritisation rules auto-inject.
- Phase 3 (Summer Shift): Summer queries. Cyan line rises; system pivots to warm-weather products without being told.
- Phase 4 (Eco Trend): Sustainability queries. Green line triggers; system starts citing certifications and materials.
The telemetry panel shows exactly what's happening: which drift was detected, what prompt rules were injected, and why.
Quick Start
git clone https://github.com/hodfa840/-RetailMind-Self-Healing-LLM-for-Store-Intelligence.git
cd ./-RetailMind-Self-Healing-LLM-for-Store-Intelligence
pip install -r requirements.txt
HF_TOKEN=your_token python app.py
pytest tests/ -v
Tech Stack
| Component | Technology |
|---|---|
| UI | Gradio 5.x |
| LLM | Qwen2.5-72B-Instruct via HF Inference API |
| Embeddings | SentenceTransformers · all-MiniLM-L6-v2 |
| Retrieval | Hybrid (NumPy cosine + metadata pre-filter) |
| Drift Detection | EWMA over sentence embeddings |
| Charting | Plotly |
| Testing | pytest |
| CI/CD | GitHub Actions |
| Language | Python 3.10+ |
Key Design Decisions
| Decision | Rationale |
|---|---|
| EWMA over raw scores | Single-query similarity is noisy. EWMA smooths the signal so the system doesn't flip modes on every query. α = 0.35 balances reactivity with stability. |
| Hybrid retrieval over pure semantic | Semantic search alone can't enforce price constraints. Pre-filtering handles hard constraints before the expensive embedding step. |
| Prompt injection over fine-tuning | Dynamic prompt injection steers the model toward the behaviour a fine-tune would target, with zero training cost and instant reversibility. |
| Shared embedding singleton | Both the retriever and drift detector share one SentenceTransformer instance, and the query is encoded once per request, eliminating redundant computation. |
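The shared-singleton row can be illustrated with `functools.lru_cache`, which returns the same object on every call after the first. This is a sketch under assumptions: `StubEncoder`, `get_encoder`, and `handle_request` are hypothetical names, and the stub replaces the real model so the example runs without downloading weights. In the actual project the factory would presumably construct `SentenceTransformer("all-MiniLM-L6-v2")` instead.

```python
from functools import lru_cache


class StubEncoder:
    """Stand-in for a SentenceTransformer; counts how often encode() runs."""

    def __init__(self) -> None:
        self.calls = 0

    def encode(self, text: str) -> list[float]:
        self.calls += 1
        # Toy features only; a real model would return a dense embedding.
        return [float(len(text)), float(text.count(" "))]


@lru_cache(maxsize=1)
def get_encoder() -> StubEncoder:
    """Build the encoder on first use and hand back the same instance afterwards."""
    return StubEncoder()


def handle_request(query: str) -> dict[str, list[float]]:
    """Encode the query once and pass the same vector to both consumers."""
    vector = get_encoder().encode(query)
    return {"drift_input": vector, "retrieval_input": vector}


payload = handle_request("eco-friendly bag under $30")
```

Both downstream components receive the same vector object, so the encoder runs once per request no matter how many consumers need the embedding.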
