docs: update README for recruiters — demo link, correct tech stack, update tech stack and links
README.md
CHANGED
@@ -19,190 +19,149 @@ allow_api: false

 [](https://github.com/hodfa840/-RetailMind-Self-Healing-LLM-for-Store-Intelligence/actions)
 [](https://python.org)
-[](https://hodfa71-retailmind.hf.space/)

-**An autonomous e-commerce AI that detects semantic drift in user intent and self-heals its own

-[

 </div>

 ---

-

 | Skill | Implementation |
-|-------|---------------|
-| **MLOps / Observability** | Real-time EWMA
-| **RAG /
-| **Prompt Engineering** | Anti-hallucination grounding
 | **Self-Healing Systems** | Autonomous prompt rewriting when intent distribution shifts — zero human intervention |
-| **LLM Integration** |
-| **Software Engineering** | Type hints,

 ---

-##

 ```mermaid
 graph LR
 A["🛒 User Query"] --> B["📊 Drift Detector<br/><i>EWMA Semantic Analysis</i>"]
 A --> C["🔍 Hybrid Retriever<br/><i>Price Filter + Dense Search</i>"]
 B --> D["🔧 Self-Healing Adapter<br/><i>Dynamic Prompt Mutation</i>"]
-C --> E["🤖
 D --> E
 E --> F["💬 Grounded Response"]
 B --> G["📈 Telemetry Dashboard<br/><i>Live EWMA Charts</i>"]
 ```

-### Module Breakdown
-
 ```
 RetailMind/
 ├── app.py # Gradio UI — 3-panel dashboard
 ├── modules/
-│ ├──
 │ ├── retrieval.py # Hybrid retriever (price-filter → semantic re-rank)
 │ ├── drift.py # EWMA-based semantic drift detector
 │ ├── adaptation.py # Self-healing prompt adapter
-│ └── llm.py #
-├── tests/ # pytest suite
-├── .github/workflows/ci.yml # CI pipeline (
 └── requirements.txt
 ```

 ---

-##
-
-The system continuously monitors the **semantic similarity** between incoming queries and predefined concept anchors using an **Exponentially Weighted Moving Average (EWMA)**.

-
-Normal Mode Drift Detected!
-┌──────────┐ ┌──────────────┐
-User asks about │ Balanced │ EWMA crosses 0.38 → │ Auto-Inject │
-random products → │ Prompt │ ──────────────────────── │ New Rules │
-└──────────┘ └──────────────┘
-│
-┌──────────┐ ▼
-│ LLM now │ ◄─── Prompt mutated to prioritize
-│ focuses │ price / season / sustainability
-│ on drift │ based on detected pattern
-└──────────┘
-```
-
-### Concept Anchors

-| Concept |
-|---------|-----------------|------------|
-| 💰
-| ☀️
-| 🌿

-**Key insight:**

 ---

-##

-

-1. **
-2. **Category
-3. **Pre-
-4. **Semantic

 ```python
-#
-#
-#
-# Step 3: 200 products → ~8 candidates (eco + under $30)
-# Step 4: Rank 8 candidates by semantic similarity → top 4
 ```

 ---

-##

-
-- Python 3.10+
-- ~2 GB disk space (for model weights on first run)

-

 ```bash
 git clone https://github.com/hodfa840/-RetailMind-Self-Healing-LLM-for-Store-Intelligence.git
 cd -RetailMind-Self-Healing-LLM-for-Store-Intelligence
 pip install -r requirements.txt
 ```

-### Run
-
-```bash
-python app.py
-```
-
-The app launches at `http://localhost:7860` with a public share link.
-
-### Run Tests
-
 ```bash
-pip install pytest
 pytest tests/ -v
 ```

 ---

-##
-
-To see the self-healing system in action:
-
-1. **Phase 1 (Normal)** — Ask general product questions. The system responds in balanced mode.
-2. **Phase 2 (Black Friday)** — Click budget-oriented queries. Watch the drift chart's gold line spike above the threshold. The system auto-injects price-prioritization rules.
-3. **Phase 3 (Summer)** — Switch to summer queries. The cyan line rises, and the system pivots to warm-weather products — *without being told to*.
-4. **Phase 4 (Eco)** — Ask about sustainability. The green line triggers, and the system starts citing certifications and materials.
-
-> The telemetry panel on the right shows exactly what's happening under the hood — which drift was detected, what prompt rules were injected, and why.
-
----
-
-## 🧭 Technical Decisions
-
-| Decision | Rationale |
-|----------|-----------|
-| **Qwen2.5-0.5B on CPU** | Eliminates API dependency, runs on any machine, no token needed. Trades quality for reliability — acceptable since grounding handles accuracy. |
-| **EWMA over raw scores** | Single-query similarity is noisy. EWMA smooths the signal so the system doesn't flip between modes on every query. α=0.35 balances reactivity with stability. |
-| **Hybrid retrieval over pure semantic** | Semantic search alone can't handle price constraints. A $200 jacket and a $20 hat may both be semantically relevant to "winter gear under $25" — only the pre-filter catches this. |
-| **SentenceTransformers (all-MiniLM-L6-v2)** | 80MB model, runs on CPU in <50ms per query. Good enough for 200-product catalog. Would swap to a larger model for production scale. |
-| **200 curated products over 1,500 generated** | Quality embeddings require quality descriptions. 200 hand-authored products with unique specs outperform 1,500 template-generated items where retrieval can't distinguish between them. |
-| **Prompt injection over fine-tuning** | Fine-tuning a 0.5B model per drift state is impractical. Dynamic prompt injection achieves the same behavioral shift with zero training cost and instant reversibility. |
-
----
-
-## 🔮 Future Roadmap
-
-- [ ] **Multi-turn memory** — Track user preferences across conversation turns
-- [ ] **A/B testing framework** — Compare adapted vs. baseline responses
-- [ ] **Drift alerting** — Webhook notifications when drift exceeds critical thresholds
-- [ ] **Vector database** — Migrate from in-memory NumPy to FAISS/Qdrant for scale
-- [ ] **User feedback loop** — Incorporate thumbs-up/down into drift calibration
-
----
-
-## 🛠️ Tech Stack

 | Component | Technology |
 |-----------|-----------|
-| UI
-| LLM |
-| Embeddings | SentenceTransformers
 | Retrieval | Hybrid (NumPy cosine + metadata pre-filter) |
 | Charting | Plotly |
 | Testing | pytest |
 | CI/CD | GitHub Actions |
-| Language | Python 3.10+

 ---

 [](https://github.com/hodfa840/-RetailMind-Self-Healing-LLM-for-Store-Intelligence/actions)
 [](https://python.org)
+[](https://gradio.app)
+[](https://huggingface.co/spaces/Hodfa71/RetailMind)

+**An autonomous e-commerce AI that detects semantic drift in user intent and self-heals its own prompt in real time — no human in the loop.**

+[**▶ Try the live demo**](https://huggingface.co/spaces/Hodfa71/RetailMind)

 </div>

 ---

+<!-- Replace the line below with your recorded GIF -->
+<!-- To record: open the demo, use LICEcap / ScreenToGif / Kap, run through Phase 1→4, save as demo.gif, then: git add demo.gif && git commit -m "add demo gif" && git push -->
+
+
+---
+
+## What This Demonstrates

 | Skill | Implementation |
+|-------|----------------|
+| **MLOps / Observability** | Real-time EWMA drift detection with live telemetry chart |
+| **RAG / Retrieval** | Hybrid: metadata pre-filter (price, category) + dense semantic re-ranking |
+| **Prompt Engineering** | Anti-hallucination grounding; dynamic system prompt injection on drift |
 | **Self-Healing Systems** | Autonomous prompt rewriting when intent distribution shifts — zero human intervention |
+| **LLM Integration** | HF Inference API (Qwen2.5-72B) for fast, grounded product recommendations |
+| **Software Engineering** | Type hints, logging, pytest suite, CI/CD, modular architecture |

 ---

+## Architecture

 ```mermaid
 graph LR
 A["🛒 User Query"] --> B["📊 Drift Detector<br/><i>EWMA Semantic Analysis</i>"]
 A --> C["🔍 Hybrid Retriever<br/><i>Price Filter + Dense Search</i>"]
 B --> D["🔧 Self-Healing Adapter<br/><i>Dynamic Prompt Mutation</i>"]
+C --> E["🤖 LLM<br/><i>Qwen2.5-72B via HF API</i>"]
 D --> E
 E --> F["💬 Grounded Response"]
 B --> G["📈 Telemetry Dashboard<br/><i>Live EWMA Charts</i>"]
 ```

 ```
 RetailMind/
 ├── app.py # Gradio UI — 3-panel dashboard
 ├── modules/
+│ ├── shared.py # Shared SentenceTransformer singleton
+│ ├── data_simulation.py # Curated product catalog with rich metadata
 │ ├── retrieval.py # Hybrid retriever (price-filter → semantic re-rank)
 │ ├── drift.py # EWMA-based semantic drift detector
 │ ├── adaptation.py # Self-healing prompt adapter
+│ └── llm.py # HF Inference API client
+├── tests/ # pytest suite
+├── .github/workflows/ci.yml # CI pipeline (Python 3.10–3.12)
 └── requirements.txt
 ```

 ---

+## How the Self-Healing Loop Works

+The system monitors **semantic similarity** between incoming queries and concept anchors using an **Exponentially Weighted Moving Average (EWMA)**. When a concept's EWMA score crosses a threshold, the system rewrites its own instructions — instantly and autonomously.
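
A minimal sketch of the EWMA update described above, not the project's actual `drift.py`. It assumes the smoothing factor α=0.35 quoted in the design-decision table and the 0.38 threshold that appears in the earlier version of this README. The class name `EwmaDriftDetector`, the zero starting score, and the per-concept similarity dictionary are illustrative assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class EwmaDriftDetector:
    """Hypothetical per-concept EWMA tracker (names are illustrative)."""

    alpha: float = 0.35      # smoothing factor quoted in the README
    threshold: float = 0.38  # threshold from the earlier README; assumed here
    scores: dict = field(default_factory=dict)

    def update(self, similarities: dict) -> list:
        """Fold one query's concept similarities into the running scores.

        Returns the concepts whose smoothed score is above the threshold.
        """
        active = []
        for concept, similarity in similarities.items():
            previous = self.scores.get(concept, 0.0)  # assumed to start at 0
            smoothed = self.alpha * similarity + (1 - self.alpha) * previous
            self.scores[concept] = smoothed
            if smoothed > self.threshold:
                active.append(concept)
        return active
```

Under these assumed values, one query with similarity 0.8 lifts the score to 0.28, which stays below the threshold. A second such query lifts it to 0.462 and activates the concept, which is the smoothing behaviour the design-decision table argues for.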

+| Concept | Example Triggers | What Changes |
+|---------|-----------------|--------------|
+| 💰 Price Sensitive | *"cheapest", "under $30", "budget"* | Prioritise lowest-price items, highlight savings |
+| ☀️ Summer Shift | *"beach", "UV", "hot weather"* | Surface breathable/outdoor products |
+| 🌿 Eco Trend | *"sustainable", "recycled", "organic"* | Lead with eco-credentials and certifications |

+**Key insight:** Matching is semantic, not keyword-based. *"I care about the planet"* triggers the eco adaptation even though it contains no eco keywords — because it's semantically close to the concept anchor embedding.
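
The "What Changes" column of the table above could be wired into the system prompt roughly as follows. This is a hedged sketch, not the contents of `adaptation.py`: the base prompt wording, the concept keys, and the function name `build_system_prompt` are all assumptions made for illustration. Only the rule texts are taken from the table.

```python
# Hypothetical base prompt; the real wording is not shown in the README.
BASE_PROMPT = (
    "You are a retail assistant. Recommend only products from the provided catalog context."
)

# Rule texts follow the "What Changes" column; the keys are assumed names.
DRIFT_RULES = {
    "price_sensitive": "Prioritise the lowest-price items and highlight savings.",
    "summer_shift": "Surface breathable and outdoor products.",
    "eco_trend": "Lead with eco-credentials and certifications.",
}


def build_system_prompt(active_concepts: list) -> str:
    """Return the base prompt plus one injected rule per active concept."""
    rules = [DRIFT_RULES[c] for c in active_concepts if c in DRIFT_RULES]
    if not rules:
        # No drift detected: fall back to the unmodified prompt.
        return BASE_PROMPT
    bullet_list = "\n".join(f"- {rule}" for rule in rules)
    return f"{BASE_PROMPT}\n\nAdditional rules:\n{bullet_list}"
```

Because the prompt is rebuilt from the currently active concepts on each request, dropping a concept from the list restores the base prompt, which is one way to get the "instant reversibility" the README mentions.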

 ---

+## Hybrid Retrieval

+Pure semantic search fails on structured queries like *"bags under $25"* — a $200 bag and a $20 bag may be equally relevant semantically. RetailMind solves this with a two-stage pipeline:

+1. **NLU extraction** — regex parses price ceilings (`"under $50"`, `"budget of $30"`, `"cheapest"`)
+2. **Category detection** — maps query terms to catalog categories
+3. **Pre-filter** — removes violating products before any embedding work
+4. **Semantic re-rank** — cosine similarity on `all-MiniLM-L6-v2` embeddings ranks survivors

 ```python
+# "eco-friendly bag under $30"
+# → price_cap=30, category="eco-friendly"
+# → 68 products → 6 candidates → top 4 by semantic similarity
 ```
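
The pre-filter and re-rank steps above can be sketched in plain Python. This is an illustration under stated assumptions, not the code in `retrieval.py`: the regex covers only the `under $N` / `below $N` / `budget of $N` forms (not `"cheapest"`), category detection is omitted, the price cap is treated as inclusive, and the product dictionaries with precomputed `vec` embeddings are a made-up stand-in for the real catalog and `all-MiniLM-L6-v2` vectors.

```python
import math
import re
from typing import Optional

# Assumed pattern; the project's actual regex is not shown in the README.
PRICE_CAP = re.compile(r"(?:under|below|budget of)\s*\$?(\d+(?:\.\d+)?)", re.IGNORECASE)


def extract_price_cap(query: str) -> Optional[float]:
    """Return the price ceiling mentioned in the query, if any."""
    match = PRICE_CAP.search(query)
    return float(match.group(1)) if match else None


def cosine(a: list, b: list) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def hybrid_search(query: str, query_vec: list, products: list, top_k: int = 4) -> list:
    """Drop products over the price cap, then rank survivors by similarity."""
    cap = extract_price_cap(query)
    candidates = [p for p in products if cap is None or p["price"] <= cap]
    candidates.sort(key=lambda p: cosine(query_vec, p["vec"]), reverse=True)
    return candidates[:top_k]
```

With a toy catalog in which a $20 tote and a $200 designer bag share the same embedding, the query `"bags under $25"` keeps the tote and never scores the designer bag, which is the failure mode of pure semantic search that the section describes.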

 ---

+## Demo Walkthrough
+
+Run through the four scenario phases in order:
+
+1. **Phase 1 — Normal** General product questions. System responds in balanced mode.
+2. **Phase 2 — Black Friday** Budget queries. Watch the gold drift line spike above the threshold. Price-prioritisation rules auto-inject.
+3. **Phase 3 — Summer Shift** Summer queries. Cyan line rises; system pivots to warm-weather products without being told.
+4. **Phase 4 — Eco Trend** Sustainability queries. Green line triggers; system starts citing certifications and materials.

+The telemetry panel shows exactly what's happening: which drift was detected, what prompt rules were injected, and why.

+---
+
+## Quick Start

 ```bash
 git clone https://github.com/hodfa840/-RetailMind-Self-Healing-LLM-for-Store-Intelligence.git
 cd -RetailMind-Self-Healing-LLM-for-Store-Intelligence
 pip install -r requirements.txt
+HF_TOKEN=your_token python app.py
 ```

 ```bash
 pytest tests/ -v
 ```

 ---

+## Tech Stack

 | Component | Technology |
 |-----------|-----------|
+| UI | Gradio 5.x |
+| LLM | Qwen2.5-72B-Instruct via HF Inference API |
+| Embeddings | SentenceTransformers · all-MiniLM-L6-v2 |
 | Retrieval | Hybrid (NumPy cosine + metadata pre-filter) |
+| Drift Detection | EWMA over sentence embeddings |
 | Charting | Plotly |
 | Testing | pytest |
 | CI/CD | GitHub Actions |
+| Language | Python 3.10+ |
+
+---
+
+## Key Design Decisions
+
+| Decision | Rationale |
+|----------|-----------|
+| **EWMA over raw scores** | Single-query similarity is noisy. EWMA smooths the signal so the system doesn't flip modes on every query. α=0.35 balances reactivity with stability. |
+| **Hybrid retrieval over pure semantic** | Semantic search alone can't enforce price constraints. Pre-filtering handles hard constraints before the expensive embedding step. |
+| **Prompt injection over fine-tuning** | Dynamic prompt injection achieves the same behavioural shift as fine-tuning with zero training cost and instant reversibility. |
+| **Shared embedding singleton** | Both the retriever and drift detector share one `SentenceTransformer` instance, and the query is encoded once per request — eliminating redundant computation. |
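
One way the shared-singleton row could look in code, shown as a sketch rather than the actual `shared.py`. To keep it runnable without downloading model weights, a hypothetical `StubEncoder` stands in for the real `SentenceTransformer`; the names `get_encoder` and `handle_request` are also assumptions.

```python
from functools import lru_cache


class StubEncoder:
    """Stand-in for an embedding model; counts loads and encode calls."""

    instances = 0

    def __init__(self) -> None:
        StubEncoder.instances += 1
        self.encode_calls = 0

    def encode(self, text: str) -> list:
        # Toy "embedding" so the example runs without any model weights.
        self.encode_calls += 1
        return [float(len(text)), float(text.count(" "))]


@lru_cache(maxsize=1)
def get_encoder() -> StubEncoder:
    """Create the encoder on first use and return the same object afterwards."""
    return StubEncoder()


def handle_request(query: str) -> dict:
    """Encode the query once and hand the same vector to both consumers."""
    query_vec = get_encoder().encode(query)
    return {
        "retrieval_input": query_vec,  # would go to the retriever
        "drift_input": query_vec,      # would go to the drift detector
    }
```

Caching the factory with `lru_cache(maxsize=1)` is only one option; a module-level variable would give the same effect. Either way, two requests trigger one model load and two encode calls instead of two loads and four encodes.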

 ---