---
title: RetailMind
emoji: 🛍️
colorFrom: blue
colorTo: indigo
sdk: gradio
sdk_version: 5.13.0
python_version: "3.10"
hf_transfer: true
app_file: app.py
pinned: false
allow_api: false
---
<div align="center">

# 🧠 RetailMind

### Self-Healing LLM for Store Intelligence

[![CI](https://github.com/hodfa840/-RetailMind-Self-Healing-LLM-for-Store-Intelligence/actions/workflows/ci.yml/badge.svg)](https://github.com/hodfa840/-RetailMind-Self-Healing-LLM-for-Store-Intelligence/actions)
[![Python 3.10+](https://img.shields.io/badge/python-3.10%2B-blue?logo=python&logoColor=white)](https://python.org)
[![Gradio](https://img.shields.io/badge/Gradio-5.x-orange)](https://gradio.app)
[![Live Demo](https://img.shields.io/badge/%F0%9F%A4%97%20Live%20Demo-RetailMind-blue)](https://huggingface.co/spaces/Hodfa71/RetailMind)

**An autonomous e-commerce AI that detects semantic drift in user intent and self-heals its own prompt in real time, with no human in the loop.**

[**▶ Try the live demo**](https://huggingface.co/spaces/Hodfa71/RetailMind)

</div>

---

![RetailMind demo](https://media.githubusercontent.com/media/hodfa840/-RetailMind-Self-Healing-LLM-for-Store-Intelligence/main/demo.gif)

---

## What This Demonstrates

| Skill | Implementation |
|-------|----------------|
| **MLOps / Observability** | Real-time EWMA drift detection with live telemetry chart |
| **RAG / Retrieval** | Hybrid: metadata pre-filter (price, category) + dense semantic re-ranking |
| **Prompt Engineering** | Anti-hallucination grounding; dynamic system prompt injection on drift |
| **Self-Healing Systems** | Autonomous prompt rewriting when intent distribution shifts, with zero human intervention |
| **LLM Integration** | HF Inference API (Qwen2.5-72B) for fast, grounded product recommendations |
| **Software Engineering** | Type hints, logging, pytest suite, CI/CD, modular architecture |

---

## Architecture

```mermaid
graph LR
    A["🛒 User Query"] --> B["📊 Drift Detector<br/><i>EWMA Semantic Analysis</i>"]
    A --> C["🔍 Hybrid Retriever<br/><i>Price Filter + Dense Search</i>"]
    B --> D["🔧 Self-Healing Adapter<br/><i>Dynamic Prompt Mutation</i>"]
    C --> E["🤖 LLM<br/><i>Qwen2.5-72B via HF API</i>"]
    D --> E
    E --> F["💬 Grounded Response"]
    B --> G["📈 Telemetry Dashboard<br/><i>Live EWMA Charts</i>"]
```

```
RetailMind/
├── app.py                    # Gradio UI: 3-panel dashboard
├── modules/
│   ├── shared.py             # Shared SentenceTransformer singleton
│   ├── data_simulation.py    # Curated product catalog with rich metadata
│   ├── retrieval.py          # Hybrid retriever (price-filter → semantic re-rank)
│   ├── drift.py              # EWMA-based semantic drift detector
│   ├── adaptation.py         # Self-healing prompt adapter
│   └── llm.py                # HF Inference API client
├── tests/                    # pytest suite
├── .github/workflows/ci.yml  # CI pipeline (Python 3.10–3.12)
└── requirements.txt
```

---

## How the Self-Healing Loop Works

The system monitors **semantic similarity** between incoming queries and concept anchors using an **Exponentially Weighted Moving Average (EWMA)**. When a concept's EWMA score crosses a threshold, the system rewrites its own instructions, instantly and autonomously.

| Concept | Example Triggers | What Changes |
|---------|-----------------|--------------|
| 💰 Price Sensitive | *"cheapest", "under $30", "budget"* | Prioritise lowest-price items, highlight savings |
| ☀️ Summer Shift | *"beach", "UV", "hot weather"* | Surface breathable/outdoor products |
| 🌿 Eco Trend | *"sustainable", "recycled", "organic"* | Lead with eco-credentials and certifications |

**Key insight:** Matching is semantic, not keyword-based. *"I care about the planet"* triggers the eco adaptation even though it contains no eco keywords, because it is semantically close to the concept anchor embedding.
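
The loop above can be sketched in a few lines of plain Python. This is a minimal illustration, not the project's `drift.py`: the `EwmaDrift` class, the `THRESHOLD` value, the zero starting score, and the toy 2-D vectors are all assumptions made for the example; only α=0.35 comes from this README.

```python
import math

ALPHA = 0.35      # smoothing factor quoted in this README
THRESHOLD = 0.5   # hypothetical trigger level, not taken from the project


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class EwmaDrift:
    """Tracks one EWMA similarity score per concept anchor (illustrative only)."""

    def __init__(self, anchors, alpha=ALPHA, threshold=THRESHOLD):
        self.anchors = anchors                          # concept name -> anchor vector
        self.alpha = alpha
        self.threshold = threshold
        self.scores = {name: 0.0 for name in anchors}   # assumed to start at zero

    def update(self, query_vec):
        """Fold one query embedding into every score; return concepts at or over threshold."""
        for name, anchor in self.anchors.items():
            sim = cosine(query_vec, anchor)
            self.scores[name] = self.alpha * sim + (1 - self.alpha) * self.scores[name]
        return [name for name, score in self.scores.items() if score >= self.threshold]


# Toy 2-D vectors stand in for real sentence embeddings.
detector = EwmaDrift({"eco": [1.0, 0.0], "price": [0.0, 1.0]})
active = []
for _ in range(3):
    active = detector.update([1.0, 0.0])   # three eco-like queries in a row
```

With these toy numbers a single eco-like query is not enough to cross the assumed threshold; the score needs a second consecutive query to get there, which is the smoothing behaviour EWMA is chosen for.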

---

## Hybrid Retrieval

Pure semantic search fails on structured queries like *"bags under $25"*: a $200 bag and a $20 bag may be equally relevant semantically. RetailMind solves this with a two-stage pipeline (hard-constraint filtering, then semantic ranking) that runs in four steps:

1. **NLU extraction:** regex parses price ceilings (`"under $50"`, `"budget of $30"`, `"cheapest"`)
2. **Category detection:** maps query terms to catalog categories
3. **Pre-filter:** removes products that violate the price or category constraints before any embedding work
4. **Semantic re-rank:** cosine similarity on `all-MiniLM-L6-v2` embeddings ranks survivors

```python
# "eco-friendly bag under $30"
# → price_cap=30, category="eco-friendly"
# → 68 products → 6 candidates → top 4 by semantic similarity
```
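A rough, self-contained sketch of the same filter-then-rank idea follows. It is not the code in `retrieval.py`: the `CATALOG` entries, the `PRICE_RE` pattern, the helper names, and the toy 2-D vectors (standing in for `all-MiniLM-L6-v2` embeddings) are hypothetical, the price cap is treated as inclusive, and the `"cheapest"` case is left out for brevity.

```python
import math
import re

# Hypothetical mini-catalog; "vec" is a toy stand-in for a sentence embedding.
CATALOG = [
    {"name": "Recycled tote", "price": 20.0, "category": "bags", "vec": [0.9, 0.1]},
    {"name": "Leather satchel", "price": 200.0, "category": "bags", "vec": [0.8, 0.2]},
    {"name": "Canvas backpack", "price": 24.0, "category": "bags", "vec": [0.5, 0.5]},
    {"name": "Sun hat", "price": 15.0, "category": "hats", "vec": [0.1, 0.9]},
]

# Assumed pattern covering phrasings such as "under $50" and "budget of $30".
PRICE_RE = re.compile(r"(?:under|below|budget of)\s*\$\s*(\d+(?:\.\d+)?)", re.IGNORECASE)


def extract_price_cap(query):
    """Return the price ceiling mentioned in the query, or None."""
    match = PRICE_RE.search(query)
    return float(match.group(1)) if match else None


def detect_category(query):
    """Naive whole-word match (singular or plural) against catalog categories."""
    words = set(re.findall(r"[a-z]+", query.lower()))
    for category in sorted({p["category"] for p in CATALOG}):
        if category in words or category.rstrip("s") in words:
            return category
    return None


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(query, query_vec, top_k=2):
    """Drop products that break hard constraints, then rank survivors by similarity."""
    cap = extract_price_cap(query)
    category = detect_category(query)
    survivors = [
        p for p in CATALOG
        if (cap is None or p["price"] <= cap)          # inclusive cap (assumption)
        and (category is None or p["category"] == category)
    ]
    survivors.sort(key=lambda p: cosine(query_vec, p["vec"]), reverse=True)
    return [p["name"] for p in survivors[:top_k]]


results = retrieve("bags under $25", [1.0, 0.0])
```

Here the $200 satchel never reaches the ranking step even though its vector is close to the query, which is the failure mode of pure semantic search that the pre-filter is meant to avoid.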

---

## Demo Walkthrough

Run through the four scenario phases in order:

1. **Phase 1: Normal** &nbsp; General product questions. System responds in balanced mode.
2. **Phase 2: Black Friday** &nbsp; Budget queries. Watch the gold drift line spike above the threshold. Price-prioritisation rules auto-inject.
3. **Phase 3: Summer Shift** &nbsp; Summer queries. Cyan line rises; system pivots to warm-weather products without being told.
4. **Phase 4: Eco Trend** &nbsp; Sustainability queries. Green line triggers; system starts citing certifications and materials.

The telemetry panel shows exactly what's happening: which drift was detected, what prompt rules were injected, and why.
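
To make "prompt rules were injected" concrete, here is one way the mapping from detected drift to system prompt could look. It is a sketch under assumptions, not the contents of `adaptation.py`: `BASE_PROMPT`, the `RULES` keys, and `build_system_prompt` are hypothetical names, and the rule text paraphrases the "What Changes" column of the concept table.

```python
BASE_PROMPT = (
    "You are a retail assistant. "
    "Recommend only products from the provided catalog context."
)

# Hypothetical rule text, paraphrasing the "What Changes" column of the concept table.
RULES = {
    "price_sensitive": "Prioritise the lowest-priced items and highlight savings.",
    "summer_shift": "Surface breathable and outdoor products.",
    "eco_trend": "Lead with eco-credentials and certifications.",
}


def build_system_prompt(active_concepts):
    """Append one rule per active drift concept; return the base prompt when none are active."""
    rules = [RULES[c] for c in active_concepts if c in RULES]
    if not rules:
        return BASE_PROMPT
    bullet_list = "\n".join(f"- {rule}" for rule in rules)
    return f"{BASE_PROMPT}\n\nActive adaptations:\n{bullet_list}"


prompt = build_system_prompt(["eco_trend"])
```

Because the prompt is rebuilt from the list of currently active concepts on every request, the adaptation disappears as soon as a concept drops out of that list; nothing has to be undone.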

---

## Quick Start

```bash
git clone https://github.com/hodfa840/-RetailMind-Self-Healing-LLM-for-Store-Intelligence.git
cd -RetailMind-Self-Healing-LLM-for-Store-Intelligence
pip install -r requirements.txt
HF_TOKEN=your_token python app.py
```

```bash
pytest tests/ -v
```

---

## Tech Stack

| Component | Technology |
|-----------|-----------|
| UI | Gradio 5.x |
| LLM | Qwen2.5-72B-Instruct via HF Inference API |
| Embeddings | SentenceTransformers · all-MiniLM-L6-v2 |
| Retrieval | Hybrid (NumPy cosine + metadata pre-filter) |
| Drift Detection | EWMA over sentence embeddings |
| Charting | Plotly |
| Testing | pytest |
| CI/CD | GitHub Actions |
| Language | Python 3.10+ |

---

## Key Design Decisions

| Decision | Rationale |
|----------|-----------|
| **EWMA over raw scores** | Single-query similarity is noisy. EWMA smooths the signal so the system doesn't flip modes on every query. α=0.35 balances reactivity with stability. |
| **Hybrid retrieval over pure semantic** | Semantic search alone can't enforce price constraints. Pre-filtering handles hard constraints before the expensive embedding step. |
| **Prompt injection over fine-tuning** | Dynamic prompt injection produces the behavioural shift this use case needs without fine-tuning, at zero training cost and with instant reversibility. |
| **Shared embedding singleton** | Both the retriever and drift detector share one `SentenceTransformer` instance, and the query is encoded once per request, eliminating redundant computation. |

---

<div align="center">
<sub>Built by <a href="https://github.com/hodfa840">hodfa840</a> · Linköping University</sub>
</div>