hodfa840 committed on
Commit 95ce559 · 1 Parent(s): a8820b1

docs: update README for recruiters — demo link, correct tech stack, update tech stack and links

Files changed (1)
  1. README.md +71 -112
README.md CHANGED
@@ -19,190 +19,149 @@ allow_api: false
19
 
20
  [![CI](https://github.com/hodfa840/-RetailMind-Self-Healing-LLM-for-Store-Intelligence/actions/workflows/ci.yml/badge.svg)](https://github.com/hodfa840/-RetailMind-Self-Healing-LLM-for-Store-Intelligence/actions)
21
  [![Python 3.10+](https://img.shields.io/badge/python-3.10%2B-blue?logo=python&logoColor=white)](https://python.org)
22
- [![Gradio](https://img.shields.io/badge/Gradio-4.0%2B-orange?logo=gradio)](https://gradio.app)
23
- [![Hugging Face Space](https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Space-blue)](https://huggingface.co/spaces/Hodfa71/RetailMind)
24
- [![Live Demo](https://img.shields.io/badge/Demo-Direct%20Link-green)](https://hodfa71-retailmind.hf.space/)
25
 
26
- **An autonomous e-commerce AI that detects semantic drift in user intent and self-heals its own behavior in real time — no human in the loop.**
27
 
28
- [Hugging Face Space](https://huggingface.co/spaces/Hodfa71/RetailMind) · [Direct App URL](https://hodfa71-retailmind.hf.space/) · [Architecture](#-architecture) · [How It Works](#-how-the-self-healing-loop-works)
29
 
30
  </div>
31
 
32
  ---
33
 
34
- ## 🎯 What This Project Demonstrates
35
 
36
  | Skill | Implementation |
37
- |-------|---------------|
38
- | **MLOps / Observability** | Real-time EWMA-based drift detection with live telemetry dashboard |
39
- | **RAG / Information Retrieval** | Hybrid retrieval: metadata pre-filtering (price, category) + dense semantic re-ranking |
40
- | **Prompt Engineering** | Anti-hallucination grounding, dynamic prompt injection based on detected drift |
41
  | **Self-Healing Systems** | Autonomous prompt rewriting when intent distribution shifts — zero human intervention |
42
- | **LLM Integration** | Local Qwen2.5-0.5B inference on CPU — no API keys, no GPU, fully offline-capable |
43
- | **Software Engineering** | Type hints, docstrings, logging, pytest suite, CI/CD, modular architecture |
44
 
45
  ---
46
 
47
- ## Architecture
48
 
49
  ```mermaid
50
  graph LR
51
  A["🛒 User Query"] --> B["📊 Drift Detector<br/><i>EWMA Semantic Analysis</i>"]
52
  A --> C["🔍 Hybrid Retriever<br/><i>Price Filter + Dense Search</i>"]
53
  B --> D["🔧 Self-Healing Adapter<br/><i>Dynamic Prompt Mutation</i>"]
54
- C --> E["🤖 Local LLM<br/><i>Qwen2.5-0.5B · CPU</i>"]
55
  D --> E
56
  E --> F["💬 Grounded Response"]
57
  B --> G["📈 Telemetry Dashboard<br/><i>Live EWMA Charts</i>"]
58
  ```
59
 
60
- ### Module Breakdown
61
-
62
  ```
63
  RetailMind/
64
  ├── app.py # Gradio UI — 3-panel dashboard
65
  ├── modules/
66
- │ ├── data_simulation.py # 200 curated products with rich metadata
 
67
  │ ├── retrieval.py # Hybrid retriever (price-filter → semantic re-rank)
68
  │ ├── drift.py # EWMA-based semantic drift detector
69
  │ ├── adaptation.py # Self-healing prompt adapter
70
- │ └── llm.py # Local Qwen2.5-0.5B inference engine
71
- ├── tests/ # pytest suite (catalog, retrieval, drift, adaptation)
72
- ├── .github/workflows/ci.yml # CI pipeline (lint + test on Python 3.10–3.12)
73
  └── requirements.txt
74
  ```
75
 
76
  ---
77
 
78
- ## 🔄 How the Self-Healing Loop Works
79
-
80
- The system continuously monitors the **semantic similarity** between incoming queries and predefined concept anchors using an **Exponentially Weighted Moving Average (EWMA)**.
81
 
82
- ```
83
- Normal Mode Drift Detected!
84
- ┌──────────┐ ┌──────────────┐
85
- User asks about │ Balanced │ EWMA crosses 0.38 → │ Auto-Inject │
86
- random products → │ Prompt │ ──────────────────────── │ New Rules │
87
- └──────────┘ └──────────────┘
88
-
89
- ┌──────────┐ ▼
90
- │ LLM now │ ◄─── Prompt mutated to prioritize
91
- │ focuses │ price / season / sustainability
92
- │ on drift │ based on detected pattern
93
- └──────────┘
94
- ```
95
-
96
- ### Concept Anchors
97
 
98
- | Concept | Trigger Keywords | Adaptation |
99
- |---------|-----------------|------------|
100
- | 💰 **Price Sensitive** | cheap, budget, under $X, deal | Prioritize lowest-price items, highlight savings |
101
- | ☀️ **Summer Shift** | beach, lightweight, UV, hot weather | Surface breathable/outdoor products, suppress winter |
102
- | 🌿 **Eco Trend** | sustainable, recycled, organic, plant-based | Lead with eco-credentials, cite certifications |
103
 
104
- **Key insight:** The system doesn't just match keywords — it uses **semantic similarity** via sentence embeddings. So even a query like *"I care about the planet"* (no eco keywords) will still trigger the eco adaptation because it's semantically close to the concept anchor.
105
 
106
  ---
107
 
108
- ## 🔍 Hybrid Retrieval Deep Dive
109
 
110
- Traditional RAG uses pure semantic similarity, which fails on structured queries like *"bags under $25"*. RetailMind combines:
111
 
112
- 1. **Price Extraction** — Regex-based NLU parses price ceilings from natural language (`"under $50"`, `"budget of $30"`, `"cheapest"`)
113
- 2. **Category Detection** — Maps query terms to catalog categories (`"eco-friendly"` → eco, `"gym"` → sports)
114
- 3. **Pre-Filtering** — Removes products that violate hard constraints *before* embedding search
115
- 4. **Semantic Re-Ranking** — Cosine similarity on SentenceTransformer embeddings ranks survivors
116
 
117
  ```python
118
- # Example: "eco-friendly bag under $30"
119
- # Step 1: price_cap = 30.0
120
- # Step 2: category = "eco-friendly"
121
- # Step 3: 200 products → ~8 candidates (eco + under $30)
122
- # Step 4: Rank 8 candidates by semantic similarity → top 4
123
  ```
124
 
125
  ---
126
 
127
- ## 🚀 Quick Start
128
 
129
- ### Prerequisites
130
- - Python 3.10+
131
- - ~2 GB disk space (for model weights on first run)
132
 
133
- ### Installation
 
 
134
 
135
  ```bash
136
  git clone https://github.com/hodfa840/-RetailMind-Self-Healing-LLM-for-Store-Intelligence.git
137
  cd -RetailMind-Self-Healing-LLM-for-Store-Intelligence
138
  pip install -r requirements.txt
 
139
  ```
140
 
141
- ### Run
142
-
143
- ```bash
144
- python app.py
145
- ```
146
-
147
- The app launches at `http://localhost:7860` with a public share link.
148
-
149
- ### Run Tests
150
-
151
  ```bash
152
- pip install pytest
153
  pytest tests/ -v
154
  ```
155
 
156
  ---
157
 
158
- ## 🧪 Demo Walkthrough
159
-
160
- To see the self-healing system in action:
161
-
162
- 1. **Phase 1 (Normal)** — Ask general product questions. The system responds in balanced mode.
163
- 2. **Phase 2 (Black Friday)** — Click budget-oriented queries. Watch the drift chart's gold line spike above the threshold. The system auto-injects price-prioritization rules.
164
- 3. **Phase 3 (Summer)** — Switch to summer queries. The cyan line rises, and the system pivots to warm-weather products — *without being told to*.
165
- 4. **Phase 4 (Eco)** — Ask about sustainability. The green line triggers, and the system starts citing certifications and materials.
166
-
167
- > The telemetry panel on the right shows exactly what's happening under the hood — which drift was detected, what prompt rules were injected, and why.
168
-
169
- ---
170
-
171
- ## 🧭 Technical Decisions
172
-
173
- | Decision | Rationale |
174
- |----------|-----------|
175
- | **Qwen2.5-0.5B on CPU** | Eliminates API dependency, runs on any machine, no token needed. Trades quality for reliability — acceptable since grounding handles accuracy. |
176
- | **EWMA over raw scores** | Single-query similarity is noisy. EWMA smooths the signal so the system doesn't flip between modes on every query. α=0.35 balances reactivity with stability. |
177
- | **Hybrid retrieval over pure semantic** | Semantic search alone can't handle price constraints. A $200 jacket and a $20 hat may both be semantically relevant to "winter gear under $25" — only the pre-filter catches this. |
178
- | **SentenceTransformers (all-MiniLM-L6-v2)** | 80MB model, runs on CPU in <50ms per query. Good enough for 200-product catalog. Would swap to a larger model for production scale. |
179
- | **200 curated products over 1,500 generated** | Quality embeddings require quality descriptions. 200 hand-authored products with unique specs outperform 1,500 template-generated items where retrieval can't distinguish between them. |
180
- | **Prompt injection over fine-tuning** | Fine-tuning a 0.5B model per drift state is impractical. Dynamic prompt injection achieves the same behavioral shift with zero training cost and instant reversibility. |
181
-
182
- ---
183
-
184
- ## 🔮 Future Roadmap
185
-
186
- - [ ] **Multi-turn memory** — Track user preferences across conversation turns
187
- - [ ] **A/B testing framework** — Compare adapted vs. baseline responses
188
- - [ ] **Drift alerting** — Webhook notifications when drift exceeds critical thresholds
189
- - [ ] **Vector database** — Migrate from in-memory NumPy to FAISS/Qdrant for scale
190
- - [ ] **User feedback loop** — Incorporate thumbs-up/down into drift calibration
191
-
192
- ---
193
-
194
- ## 🛠️ Tech Stack
195
 
196
  | Component | Technology |
197
  |-----------|-----------|
198
- | UI Framework | Gradio 4.x |
199
- | LLM | Qwen/Qwen2.5-0.5B-Instruct (local, CPU) |
200
- | Embeddings | SentenceTransformers (all-MiniLM-L6-v2) |
201
  | Retrieval | Hybrid (NumPy cosine + metadata pre-filter) |
 
202
  | Charting | Plotly |
203
  | Testing | pytest |
204
  | CI/CD | GitHub Actions |
205
- | Language | Python 3.10+ with type hints |
206
 
207
  ---
208
 
 
19
 
20
  [![CI](https://github.com/hodfa840/-RetailMind-Self-Healing-LLM-for-Store-Intelligence/actions/workflows/ci.yml/badge.svg)](https://github.com/hodfa840/-RetailMind-Self-Healing-LLM-for-Store-Intelligence/actions)
21
  [![Python 3.10+](https://img.shields.io/badge/python-3.10%2B-blue?logo=python&logoColor=white)](https://python.org)
22
+ [![Gradio](https://img.shields.io/badge/Gradio-5.x-orange)](https://gradio.app)
23
+ [![Live Demo](https://img.shields.io/badge/%F0%9F%A4%97%20Live%20Demo-RetailMind-blue)](https://huggingface.co/spaces/Hodfa71/RetailMind)
 
24
 
25
+ **An autonomous e-commerce AI that detects semantic drift in user intent and self-heals its own prompt in real time — no human in the loop.**
26
 
27
+ [**▶ Try the live demo**](https://huggingface.co/spaces/Hodfa71/RetailMind)
28
 
29
  </div>
30
 
31
  ---
32
 
33
+ <!-- Replace the line below with your recorded GIF -->
34
+ <!-- To record: open the demo, use LICEcap / ScreenToGif / Kap, run through Phase 1→4, save as demo.gif, then: git add demo.gif && git commit -m "add demo gif" && git push -->
35
+ ![RetailMind demo](demo.gif)
36
+
37
+ ---
38
+
39
+ ## What This Demonstrates
40
 
41
  | Skill | Implementation |
42
+ |-------|----------------|
43
+ | **MLOps / Observability** | Real-time EWMA drift detection with live telemetry chart |
44
+ | **RAG / Retrieval** | Hybrid: metadata pre-filter (price, category) + dense semantic re-ranking |
45
+ | **Prompt Engineering** | Anti-hallucination grounding; dynamic system prompt injection on drift |
46
  | **Self-Healing Systems** | Autonomous prompt rewriting when intent distribution shifts — zero human intervention |
47
+ | **LLM Integration** | HF Inference API (Qwen2.5-72B) for fast, grounded product recommendations |
48
+ | **Software Engineering** | Type hints, logging, pytest suite, CI/CD, modular architecture |
49
 
50
  ---
51
 
52
+ ## Architecture
53
 
54
  ```mermaid
55
  graph LR
56
  A["🛒 User Query"] --> B["📊 Drift Detector<br/><i>EWMA Semantic Analysis</i>"]
57
  A --> C["🔍 Hybrid Retriever<br/><i>Price Filter + Dense Search</i>"]
58
  B --> D["🔧 Self-Healing Adapter<br/><i>Dynamic Prompt Mutation</i>"]
59
+ C --> E["🤖 LLM<br/><i>Qwen2.5-72B via HF API</i>"]
60
  D --> E
61
  E --> F["💬 Grounded Response"]
62
  B --> G["📈 Telemetry Dashboard<br/><i>Live EWMA Charts</i>"]
63
  ```
64
 
 
 
65
  ```
66
  RetailMind/
67
  ├── app.py # Gradio UI — 3-panel dashboard
68
  ├── modules/
69
+ │ ├── shared.py # Shared SentenceTransformer singleton
70
+ │ ├── data_simulation.py # Curated product catalog with rich metadata
71
  │ ├── retrieval.py # Hybrid retriever (price-filter → semantic re-rank)
72
  │ ├── drift.py # EWMA-based semantic drift detector
73
  │ ├── adaptation.py # Self-healing prompt adapter
74
+ │ └── llm.py # HF Inference API client
75
+ ├── tests/ # pytest suite
76
+ ├── .github/workflows/ci.yml # CI pipeline (Python 3.10–3.12)
77
  └── requirements.txt
78
  ```
79
 
80
  ---
81
 
82
+ ## How the Self-Healing Loop Works
 
 
83
 
84
+ The system monitors **semantic similarity** between incoming queries and concept anchors using an **Exponentially Weighted Moving Average (EWMA)**. When a concept's EWMA score crosses a threshold, the system rewrites its own instructions — instantly and autonomously.
85
 
86
+ | Concept | Example Triggers | What Changes |
87
+ |---------|-----------------|--------------|
88
+ | 💰 Price Sensitive | *"cheapest", "under $30", "budget"* | Prioritise lowest-price items, highlight savings |
89
+ | ☀️ Summer Shift | *"beach", "UV", "hot weather"* | Surface breathable/outdoor products |
90
+ | 🌿 Eco Trend | *"sustainable", "recycled", "organic"* | Lead with eco-credentials and certifications |
91
 
92
+ **Key insight:** Matching is semantic, not keyword-based. *"I care about the planet"* triggers the eco adaptation even though it contains no eco keywords because it's semantically close to the concept anchor embedding.
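
The EWMA update described in the new README text can be sketched in a few lines of plain Python. This is an illustrative sketch, not the code in `drift.py`: the class name `EwmaDriftTracker`, the starting score of 0.0, and the concept key `"eco"` are assumptions. The smoothing factor 0.35 comes from the design-decisions table, and the threshold 0.38 comes from the ASCII diagram this commit removes, so the value in the code may differ.

```python
from dataclasses import dataclass, field
from typing import Dict, List

ALPHA = 0.35      # smoothing factor quoted in the README's design-decisions table
THRESHOLD = 0.38  # taken from the removed ASCII diagram; the real value may differ


@dataclass
class EwmaDriftTracker:
    """Hypothetical tracker that keeps one smoothed score per concept anchor."""

    alpha: float = ALPHA
    threshold: float = THRESHOLD
    scores: Dict[str, float] = field(default_factory=dict)

    def update(self, similarities: Dict[str, float]) -> List[str]:
        """Fold one query's per-concept similarities into the running scores.

        Returns the concepts whose smoothed score is above the threshold.
        """
        for concept, sim in similarities.items():
            previous = self.scores.get(concept, 0.0)  # assumed starting point
            self.scores[concept] = self.alpha * sim + (1 - self.alpha) * previous
        return [c for c, s in self.scores.items() if s > self.threshold]


tracker = EwmaDriftTracker()
first = tracker.update({"eco": 0.7})   # one eco-like query: score is still low
second = tracker.update({"eco": 0.7})  # a second one pushes it over the threshold
```

With these numbers a single eco-leaning query is not enough to switch modes, while two in a row are, which is the smoothing behaviour the README attributes to EWMA.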
93
 
94
  ---
95
 
96
+ ## Hybrid Retrieval
97
 
98
+ Pure semantic search fails on structured queries like *"bags under $25"* — a $200 bag and a $20 bag may be equally relevant semantically. RetailMind solves this with a two-stage pipeline:
99
 
100
+ 1. **NLU extraction** — regex parses price ceilings (`"under $50"`, `"budget of $30"`, `"cheapest"`)
101
+ 2. **Category detection** — maps query terms to catalog categories
102
+ 3. **Pre-filter** — removes violating products before any embedding work
103
+ 4. **Semantic re-rank** — cosine similarity on `all-MiniLM-L6-v2` embeddings ranks survivors
104
 
105
  ```python
106
+ # "eco-friendly bag under $30"
107
+ # price_cap=30, category="eco-friendly"
108
+ # 68 products → 6 candidates → top 4 by semantic similarity
 
 
109
  ```
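
The pre-filter-then-re-rank flow listed in the new README text can be sketched as follows. This is a minimal sketch under stated assumptions, not the contents of `retrieval.py`: the regex, the function names, and the product dictionary layout (`price`, `vec`) are hypothetical, toy two-dimensional vectors stand in for `all-MiniLM-L6-v2` embeddings, and category detection and the `"cheapest"` case are left out.

```python
import math
import re
from typing import List, Optional

# Hypothetical pattern covering phrases such as "under $50" or "budget of $30".
PRICE_CAP = re.compile(
    r"(?:under|below|budget of)\s*\$?\s*(\d+(?:\.\d+)?)", re.IGNORECASE
)


def extract_price_cap(query: str) -> Optional[float]:
    """Return the price ceiling mentioned in the query, or None if absent."""
    match = PRICE_CAP.search(query)
    return float(match.group(1)) if match else None


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def hybrid_search(query: str, query_vec: List[float],
                  products: List[dict], top_k: int = 4) -> List[dict]:
    """Drop products above the price cap, then rank the rest by similarity."""
    cap = extract_price_cap(query)
    candidates = [p for p in products if cap is None or p["price"] <= cap]
    candidates.sort(key=lambda p: cosine(query_vec, p["vec"]), reverse=True)
    return candidates[:top_k]


catalog = [
    {"name": "canvas tote", "price": 20.0, "vec": [1.0, 0.0]},
    {"name": "leather bag", "price": 200.0, "vec": [1.0, 0.0]},
    {"name": "sun hat", "price": 15.0, "vec": [0.0, 1.0]},
]
results = hybrid_search("bags under $25", [1.0, 0.1], catalog)
```

Here the leather bag is as similar to the query as the tote, so only the price filter keeps it out of the results.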
110
 
111
  ---
112
 
113
+ ## Demo Walkthrough
114
+
115
+ Run through the four scenario phases in order:
116
+
117
+ 1. **Phase 1 — Normal** &nbsp; General product questions. System responds in balanced mode.
118
+ 2. **Phase 2 — Black Friday** &nbsp; Budget queries. Watch the gold drift line spike above the threshold. Price-prioritisation rules auto-inject.
119
+ 3. **Phase 3 — Summer Shift** &nbsp; Summer queries. Cyan line rises; system pivots to warm-weather products without being told.
120
+ 4. **Phase 4 — Eco Trend** &nbsp; Sustainability queries. Green line triggers; system starts citing certifications and materials.
121
 
122
+ The telemetry panel shows exactly what's happening: which drift was detected, what prompt rules were injected, and why.
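
The prompt-rule injection that the walkthrough refers to might look roughly like the sketch below. The rule wording is adapted from the "What Changes" column of the concept table. Everything else (the base prompt text, the concept keys, and the function name `build_system_prompt`) is an assumption for illustration and is not taken from `adaptation.py`.

```python
from typing import Iterable

# Hypothetical base instructions; the real system prompt is not shown in the README.
BASE_PROMPT = (
    "You are a retail assistant. "
    "Recommend only products from the provided catalog context."
)

# Rule text adapted from the README's concept table; the keys are assumed names.
RULES = {
    "price_sensitive": "Prioritise lowest-price items and highlight savings.",
    "summer_shift": "Surface breathable and outdoor products.",
    "eco_trend": "Lead with eco-credentials and certifications.",
}


def build_system_prompt(active_concepts: Iterable[str]) -> str:
    """Append one rule per detected drift concept to the base prompt."""
    extra = [RULES[c] for c in active_concepts if c in RULES]
    if not extra:
        return BASE_PROMPT  # no drift detected: balanced mode
    bullet_list = "\n".join(f"- {rule}" for rule in extra)
    return f"{BASE_PROMPT}\n\nAdditional rules:\n{bullet_list}"


balanced = build_system_prompt([])
black_friday = build_system_prompt(["price_sensitive"])
```

Because the adapted prompt is rebuilt from the base prompt on every request, dropping a concept from the active list restores the earlier behaviour immediately, which matches the reversibility argument in the design-decisions table.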
 
 
123
 
124
+ ---
125
+
126
+ ## Quick Start
127
 
128
  ```bash
129
  git clone https://github.com/hodfa840/-RetailMind-Self-Healing-LLM-for-Store-Intelligence.git
130
  cd -RetailMind-Self-Healing-LLM-for-Store-Intelligence
131
  pip install -r requirements.txt
132
+ HF_TOKEN=your_token python app.py
133
  ```
134
 
135
  ```bash
 
136
  pytest tests/ -v
137
  ```
138
 
139
  ---
140
 
141
+ ## Tech Stack
142
 
143
  | Component | Technology |
144
  |-----------|-----------|
145
+ | UI | Gradio 5.x |
146
+ | LLM | Qwen2.5-72B-Instruct via HF Inference API |
147
+ | Embeddings | SentenceTransformers · all-MiniLM-L6-v2 |
148
  | Retrieval | Hybrid (NumPy cosine + metadata pre-filter) |
149
+ | Drift Detection | EWMA over sentence embeddings |
150
  | Charting | Plotly |
151
  | Testing | pytest |
152
  | CI/CD | GitHub Actions |
153
+ | Language | Python 3.10+ |
154
+
155
+ ---
156
+
157
+ ## Key Design Decisions
158
+
159
+ | Decision | Rationale |
160
+ |----------|-----------|
161
+ | **EWMA over raw scores** | Single-query similarity is noisy. EWMA smooths the signal so the system doesn't flip modes on every query. α=0.35 balances reactivity with stability. |
162
+ | **Hybrid retrieval over pure semantic** | Semantic search alone can't enforce price constraints. Pre-filtering handles hard constraints before the expensive embedding step. |
163
+ | **Prompt injection over fine-tuning** | Dynamic prompt injection achieves the same behavioural shift as fine-tuning with zero training cost and instant reversibility. |
164
+ | **Shared embedding singleton** | Both the retriever and drift detector share one `SentenceTransformer` instance, and the query is encoded once per request — eliminating redundant computation. |
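
One common way to get the shared-instance behaviour described in the last row is a cached factory function. The sketch below is an assumption about how `shared.py` could work, not a copy of it: `get_encoder` and `load_count` are hypothetical names, and a plain `object()` stands in for the model so the example runs without downloading weights.

```python
from functools import lru_cache

load_count = 0  # only here to make the sharing visible


@lru_cache(maxsize=1)
def get_encoder():
    """Build the encoder on the first call and return the cached one afterwards."""
    global load_count
    load_count += 1
    # Stand-in object. In the real app this would presumably be
    # SentenceTransformer("all-MiniLM-L6-v2") from the sentence_transformers package.
    return object()


retriever_encoder = get_encoder()  # first call constructs the instance
drift_encoder = get_encoder()      # second call reuses it
```

Both callers end up holding the same object, so the model is loaded once per process regardless of how many modules request it.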
165
 
166
  ---
167