focustiki committed on
Commit 9bcadf3 · verified · 1 Parent(s): cb5d2bb

Upload 12 files

Files changed (13)
  1. .gitattributes +1 -0
  2. DEPLOY_TO_IOS.md +156 -0
  3. Dockerfile +37 -0
  4. agent.py +329 -0
  5. agent_notebook.py +273 -0
  6. app.py +190 -0
  7. data_engineering_patterns.pdf +3 -0
  8. index.html +555 -0
  9. manifest.json +37 -0
  10. rag.py +170 -0
  11. requirements.txt +31 -0
  12. setup.sh +56 -0
  13. sw.js +52 -0
.gitattributes CHANGED
@@ -33,3 +33,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
  *.zip filter=lfs diff=lfs merge=lfs -text
  *.zst filter=lfs diff=lfs merge=lfs -text
  *tfevents* filter=lfs diff=lfs merge=lfs -text
+ data_engineering_patterns.pdf filter=lfs diff=lfs merge=lfs -text
DEPLOY_TO_IOS.md ADDED
@@ -0,0 +1,156 @@
+ # 📱 Deploy DE Assistant to iOS: Step by Step ($0)
+
+ ## What you get
+ A PWA that installs on your iPhone home screen like a native app, with:
+ - Full chat interface with markdown rendering
+ - **Voice input** (speak your question)
+ - **Voice output** (AI reads the answer aloud)
+ - Works offline for the UI shell
+ - Connected to Groq's free-tier LLM (sub-500ms responses)
+
+ ---
+
+ ## Option A: Hugging Face Spaces (Recommended, 5 minutes)
+
+ HF Spaces gives you a free public HTTPS URL, which the PWA needs in order to work on iOS.
+
+ ### Step 1: Get a free Groq API key
+ 1. Go to [console.groq.com](https://console.groq.com)
+ 2. Sign up → API Keys → Create API Key
+ 3. Copy the key (starts with `gsk_`)
+
+ ### Step 2: Create a Hugging Face Space
+ 1. Go to [huggingface.co/new-space](https://huggingface.co/new-space)
+ 2. Space name: `de-knowledge-assistant`
+ 3. SDK: **Docker** (not Gradio/Streamlit)
+ 4. Visibility: Public
+
+ ### Step 3: Create a Dockerfile in the Space
+ ```dockerfile
+ FROM python:3.11-slim
+
+ WORKDIR /app
+ COPY requirements.txt .
+ RUN pip install --no-cache-dir -r requirements.txt
+
+ COPY . .
+
+ EXPOSE 7860
+ CMD ["python", "app.py"]
+ ```
+
+ ### Step 4: Upload all files
+ Push these files to the Space repo (via git or HF UI):
+ ```
+ app.py
+ rag.py
+ agent.py
+ requirements.txt
+ static/index.html
+ static/manifest.json
+ static/sw.js
+ knowledge/data_engineering_patterns.pdf
+ Dockerfile
+ ```
+
+ ```bash
+ # Using git
+ git clone https://huggingface.co/spaces/your-username/de-knowledge-assistant
+ cd de-knowledge-assistant
+ # copy all de-assistant/ files here
+ git add .
+ git commit -m "Initial deployment"
+ git push
+ ```
+
+ ### Step 5: Set the Groq API key as a secret
+ In HF Spaces → Settings → Repository Secrets:
+ - Name: `GROQ_API_KEY`
+ - Value: your `gsk_...` key
+
+ ### Step 6: Add to iPhone home screen
+ 1. Wait for the Space to build (~3 min)
+ 2. Open your Space URL in **Safari on iPhone**:
+    `https://your-username-de-knowledge-assistant.hf.space`
+ 3. Tap the **Share** button (box with arrow) → **Add to Home Screen**
+ 4. Name it "DE Assistant" → **Add**
+
+ Done! It now appears on your home screen like a native app. 🎉
+
+ ---
+
+ ## Option B: Local + Ngrok (instant, for testing)
+
+ Run locally and expose with a free ngrok tunnel so Safari can reach it:
+
+ ```bash
+ # Terminal 1: start the app
+ ./setup.sh
+
+ # Terminal 2: expose publicly
+ brew install ngrok # or download at ngrok.com
+ ngrok http 8000
+ ```
+
+ Copy the `https://xxxx.ngrok.io` URL → open in iPhone Safari → Add to Home Screen.
+
+ ---
+
+ ## Enabling Voice on iOS
+
+ Voice requires HTTPS (which HF Spaces provides). After installing the PWA:
+
+ 1. Open the app from your home screen
+ 2. Tap the **🎤 microphone button**
+ 3. iOS will ask for microphone permission → **Allow**
+ 4. Speak your question. The AI will reply and read the answer aloud
+
+ > **Tip**: The voice assistant also reads back the AI's answer using the device's
+ > built-in text-to-speech (no extra API needed).
+
+ ---
+
+ ## Architecture Overview
+
+ ```
+ iPhone (Safari PWA)
+       │
+       │  HTTPS / SSE streaming
+       ▼
+ Hugging Face Spaces (free)
+   │  FastAPI app.py
+   │   ├─ /api/chat   → agent.py (streaming)
+   │   └─ /api/search → rag.py (vector search)
+   │
+   ├─ RAG pipeline
+   │   ├─ PDF → PyPDF2 → 800-char chunks
+   │   ├─ sentence-transformers/all-MiniLM-L6-v2 (free, CPU)
+   │   └─ ChromaDB in-memory (MMR retrieval)
+   │
+   └─ Groq API (free tier)
+       └─ llama-3.1-8b-instant (< 500ms latency)
+ ```
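
The `800-char chunks` step in the diagram can be pictured as a sliding window over the text extracted from the PDF. What follows is a minimal sketch, not the code in `rag.py` (which is not shown on this page): the function name `chunk_text` is hypothetical, and the 160-character overlap is an assumption borrowed from the `chunk_overlap` value logged in `agent_notebook.py`.

```python
def chunk_text(text: str, size: int = 800, overlap: int = 160) -> list[str]:
    """Split text into windows of `size` characters that share `overlap` characters.

    Hypothetical helper for illustration; the real splitter may differ.
    """
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    step = size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            break  # this window already reached the end of the text
    return chunks


if __name__ == "__main__":
    pieces = chunk_text("x" * 2000)
    print([len(p) for p in pieces])
```

The overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk, at the cost of embedding some text twice.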
+
+ ---
+
+ ## Connecting to Databricks (optional upgrade)
+
+ Once you have a Databricks workspace:
+
+ 1. Run `databricks/agent_notebook.py` to register the MLflow model
+ 2. Create a Model Serving endpoint (free on Databricks trial)
+ 3. Add `DATABRICKS_ENDPOINT_URL` and `DATABRICKS_TOKEN` to HF Spaces secrets
+ 4. The agent automatically routes to Databricks when those vars are set
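
Step 4 describes routing that depends on two environment variables. One way such a check could look is sketched below. `select_backend` is a hypothetical helper rather than a function from `agent.py`, and it assumes both variables must be non-empty before the Databricks path is chosen.

```python
import os
from typing import Mapping, Optional


def select_backend(env: Optional[Mapping[str, str]] = None) -> str:
    """Return "databricks" when both Databricks settings are present, else "groq".

    Hypothetical helper; the variable names come from the steps above.
    """
    env = os.environ if env is None else env
    url = env.get("DATABRICKS_ENDPOINT_URL", "").strip()
    token = env.get("DATABRICKS_TOKEN", "").strip()
    return "databricks" if url and token else "groq"


if __name__ == "__main__":
    print(select_backend())  # depends on the current environment
```

Passing the environment in as a mapping keeps the decision easy to test without touching real secrets.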
+
+ ---
+
+ ## Free Tier Limits (as of 2024)
+
+ | Service | Free Limit |
+ |---------|------------|
+ | Groq API | 14,400 requests/day, 30 req/min |
+ | HF Spaces | 2 vCPU, 16 GB RAM, always-on |
+ | ChromaDB | Unlimited (in-memory) |
+ | sentence-transformers | Unlimited (local) |
+
+ The embedding model (~90 MB) is downloaded on first start and cached in HF Spaces.
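
The two Groq limits in the table interact: 30 requests per minute allows short bursts, while 14,400 requests per day averages out to 10 per minute. A small sketch of that arithmetic follows, using the table's figures as inputs. `min_interval_seconds` is a hypothetical helper, and the limits themselves may have changed since 2024.

```python
def min_interval_seconds(per_day: int, per_minute: int) -> float:
    """Smallest steady gap between requests that respects both limits."""
    daily_gap = 86_400 / per_day   # seconds per request allowed by the daily cap
    minute_gap = 60 / per_minute   # seconds per request allowed by the per-minute cap
    return max(daily_gap, minute_gap)


if __name__ == "__main__":
    gap = min_interval_seconds(per_day=14_400, per_minute=30)
    print(f"steady-state gap: {gap:.1f} s")
```

With these inputs the daily cap is the binding one: a client that runs all day would need to space requests about six seconds apart on average.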
Dockerfile ADDED
@@ -0,0 +1,37 @@
+ # ── Hugging Face Spaces / Docker deployment ───────────────────────────────────
+ # Port 7860 is required for HF Spaces
+
+ FROM python:3.11-slim
+
+ # System deps for chromadb and sentence-transformers
+ RUN apt-get update && apt-get install -y --no-install-recommends \
+     gcc g++ libgomp1 && \
+     rm -rf /var/lib/apt/lists/*
+
+ WORKDIR /app
+
+ # Install Python deps first (cached layer)
+ COPY requirements.txt .
+ RUN pip install --no-cache-dir -r requirements.txt
+
+ # Copy all flat-uploaded files
+ COPY . .
+
+ # Reorganise flat upload into the directory structure the app expects:
+ #   static/    → index.html, manifest.json, sw.js
+ #   knowledge/ → data_engineering_patterns.pdf
+ RUN mkdir -p static knowledge && \
+     mv index.html manifest.json sw.js static/ 2>/dev/null || true && \
+     mv data_engineering_patterns.pdf knowledge/ 2>/dev/null || true
+
+ # HF Spaces runs as non-root
+ RUN useradd -m -u 1000 user
+ USER user
+ ENV HOME=/home/user PATH=/home/user/.local/bin:$PATH
+
+ EXPOSE 7860
+
+ # PORT env var is set automatically by HF Spaces to 7860
+ ENV PORT=7860
+
+ CMD ["python", "app.py"]
agent.py ADDED
@@ -0,0 +1,329 @@
+ """
+ Databricks-Compatible MLflow Agent: Data Engineering Knowledge Assistant
+ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
+ • Structured as an MLflow PyFunc model so it can be logged + served on Databricks
+ • Uses Groq (llama-3.1-8b-instant) for ultra-low-latency responses
+ • Tools: search_knowledge_base, generate_code_example
+ • Streaming-first for perceived sub-200ms UI responses
+ """
+ from __future__ import annotations
+
+ import os
+ import json
+ from typing import AsyncIterator, List, Dict, Optional
+
+ from rag import DataEngineeringRAG
+
+ # ──────────────────────────────────────────────────────────────────────────────
+ # System prompt
+ # ──────────────────────────────────────────────────────────────────────────────
+
+ SYSTEM_PROMPT = """You are an elite Data Engineering Knowledge Assistant, \
+ specializing in production-grade data pipelines, architecture patterns, and Databricks.
+
+ Your knowledge comes from "Data Engineering Design Patterns", a comprehensive guide \
+ to solving real data engineering problems.
+
+ Guidelines:
+ 1. Always ground answers in retrieved context from the knowledge base.
+ 2. Give concrete, code-inclusive answers when relevant.
+ 3. Reference specific patterns by name (Lambda, Kappa, Medallion, Lakehouse, etc.).
+ 4. When asked for code, produce clean Python/PySpark/SQL: working examples only.
+ 5. Be direct and technical; this user is a practising data engineer.
+ 6. If unsure, say so rather than hallucinating.
+
+ Format your responses with:
+ - A direct answer first
+ - Code blocks when applicable (```python or ```sql)
+ - Pattern references in **bold**
+ - A "💡 Pro tip" line when you know a non-obvious insight
+ """
+
+
+ # ──────────────────────────────────────────────────────────────────────────────
+ # Tool definitions (sent to Groq as JSON tool schemas)
+ # ──────────────────────────────────────────────────────────────────────────────
+
+ TOOLS = [
+     {
+         "type": "function",
+         "function": {
+             "name": "search_knowledge_base",
+             "description": (
+                 "Retrieve relevant chunks from the Data Engineering Design Patterns book. "
+                 "Always call this first before answering any technical question."
+             ),
+             "parameters": {
+                 "type": "object",
+                 "properties": {
+                     "query": {
+                         "type": "string",
+                         "description": "Semantic search query, e.g. 'CDC pattern with Kafka'",
+                     },
+                     "k": {
+                         "type": "integer",
+                         "description": "Number of chunks to retrieve (default 5)",
+                         "default": 5,
+                     },
+                 },
+                 "required": ["query"],
+             },
+         },
+     },
+     {
+         "type": "function",
+         "function": {
+             "name": "generate_code_example",
+             "description": "Generate a working PySpark / Python / SQL code example for a DE pattern.",
+             "parameters": {
+                 "type": "object",
+                 "properties": {
+                     "pattern_name": {"type": "string"},
+                     "language": {
+                         "type": "string",
+                         "enum": ["python", "pyspark", "sql", "scala"],
+                     },
+                     "context": {"type": "string", "description": "What the code should do"},
+                 },
+                 "required": ["pattern_name", "language", "context"],
+             },
+         },
+     },
+ ]
+
+
+ # ──────────────────────────────────────────────────────────────────────────────
+ # Agent
+ # ──────────────────────────────────────────────────────────────────────────────
+
+ class DataEngineeringAgent:
+     """
+     Agentic wrapper around Groq + RAG.
+
+     Compatible with MLflow PyFunc interface for Databricks deployment.
+     See databricks/agent_notebook.py for registration instructions.
+     """
+
+     def __init__(self, rag: DataEngineeringRAG, groq_api_key: str):
+         self.rag = rag
+         self.groq_api_key = groq_api_key
+         self._client = None
+
+     # ── Groq client (lazy init) ───────────────────────────────────────────────
+
+     def _get_client(self):
+         if self._client is None:
+             from groq import Groq
+             self._client = Groq(api_key=self.groq_api_key)
+         return self._client
+
+     # ── Tool execution ────────────────────────────────────────────────────────
+
+     def _execute_tool(self, tool_name: str, tool_args: Dict) -> str:
+         if tool_name == "search_knowledge_base":
+             results = self.rag.search(
+                 query=tool_args.get("query", ""),
+                 k=tool_args.get("k", 5),
+             )
+             if not results:
+                 return "No relevant content found in the knowledge base."
+
+             formatted = []
+             for i, r in enumerate(results, 1):
+                 formatted.append(
+                     f"[Source {i} | Page {r['page']} | Relevance {r['score']:.2f}]\n"
+                     f"{r['content']}"
+                 )
+             return "\n\n---\n\n".join(formatted)
+
+         elif tool_name == "generate_code_example":
+             # Return structured prompt for the LLM to fill in
+             return (
+                 f"Generate a {tool_args.get('language', 'python')} code example "
+                 f"for the '{tool_args.get('pattern_name')}' pattern. "
+                 f"Context: {tool_args.get('context', '')}. "
+                 "Include comments explaining each step."
+             )
+
+         return f"Tool '{tool_name}' not recognised."
+
+     # ── Sync invoke ───────────────────────────────────────────────────────────
+
+     def invoke(self, message: str, history: Optional[List[Dict]] = None) -> str:
+         """Single-turn or multi-turn (history) invocation."""
+         messages = self._build_messages(message, history or [])
+         client = self._get_client()
+
+         # First call: agent decides whether to use tools
+         response = client.chat.completions.create(
+             model="llama-3.1-8b-instant",
+             messages=messages,
+             tools=TOOLS,
+             tool_choice="auto",
+             temperature=0.2,
+             max_tokens=2048,
+         )
+
+         msg = response.choices[0].message
+
+         # Agentic loop: execute tool calls until the model stops requesting them
+         while msg.tool_calls:
+             messages.append(msg)  # add assistant message with tool_calls
+
+             for tc in msg.tool_calls:
+                 tool_result = self._execute_tool(
+                     tc.function.name,
+                     json.loads(tc.function.arguments),
+                 )
+                 messages.append(
+                     {
+                         "role": "tool",
+                         "tool_call_id": tc.id,
+                         "content": tool_result,
+                     }
+                 )
+
+             # Next iteration
+             response = client.chat.completions.create(
+                 model="llama-3.1-8b-instant",
+                 messages=messages,
+                 tools=TOOLS,
+                 tool_choice="auto",
+                 temperature=0.2,
+                 max_tokens=2048,
+             )
+             msg = response.choices[0].message
+
+         return msg.content or ""
+
+     # ── Async streaming invoke ────────────────────────────────────────────────
+
+     async def astream(
+         self, message: str, history: Optional[List[Dict]] = None
+     ) -> AsyncIterator[str]:
+         """
+         Streaming variant: yields text chunks as soon as they arrive from Groq.
+         Latency: first token typically < 200 ms on Groq's free tier.
+         """
+         import asyncio
+
+         # Run tool-use loop synchronously (tool calls are fast), then stream final answer
+         messages = self._build_messages(message, history or [])
+         client = self._get_client()
+
+         # Tool resolution (non-streaming)
+         response = await asyncio.to_thread(
+             client.chat.completions.create,
+             model="llama-3.1-8b-instant",
+             messages=messages,
+             tools=TOOLS,
+             tool_choice="auto",
+             temperature=0.2,
+             max_tokens=2048,  # room for a full answer if the model replies without a tool call
+         )
+         msg = response.choices[0].message
+
+         while msg.tool_calls:
+             messages.append(msg)
+             for tc in msg.tool_calls:
+                 tool_result = self._execute_tool(
+                     tc.function.name,
+                     json.loads(tc.function.arguments),
+                 )
+                 messages.append(
+                     {"role": "tool", "tool_call_id": tc.id, "content": tool_result}
+                 )
+             response = await asyncio.to_thread(
+                 client.chat.completions.create,
+                 model="llama-3.1-8b-instant",
+                 messages=messages,
+                 tools=TOOLS,
+                 tool_choice="auto",
+                 temperature=0.2,
+                 max_tokens=2048,  # same limit: this reply may be the final answer
+             )
+             msg = response.choices[0].message
+
+         # Now emit the final answer
+         if msg.content:
+             # The last tool-resolution call already produced an answer: yield it word by word
+             for word in msg.content.split(" "):
+                 yield word + " "
+                 await asyncio.sleep(0)
+             return
+
+         # Otherwise stream a fresh completion (no empty assistant turn is appended first)
+         stream = await asyncio.to_thread(
+             client.chat.completions.create,
+             model="llama-3.1-8b-instant",
+             messages=messages,
+             temperature=0.3,
+             max_tokens=2048,
+             stream=True,
+         )
+         for chunk in stream:
+             delta = chunk.choices[0].delta.content
+             if delta:
+                 yield delta
+
+     # ── MLflow PyFunc interface (Databricks) ──────────────────────────────────
+
+     def predict(self, context, model_input) -> str:
+         """
+         MLflow-compatible predict method.
+         Allows the agent to be logged and served via Databricks Model Serving.
+
+         model_input: pandas DataFrame with columns ["message", "history"]
+         """
+         import pandas as pd
+
+         if isinstance(model_input, pd.DataFrame):
+             row = model_input.iloc[0]
+             message = row.get("message", "")
+             history = row.get("history", [])
+             if isinstance(history, str):
+                 history = json.loads(history)
+         else:
+             message = model_input.get("message", "")
+             history = model_input.get("history", [])
+
+         return self.invoke(message=message, history=history)
+
+     # ── Helpers ───────────────────────────────────────────────────────────────
+
+     def _build_messages(self, user_message: str, history: List[Dict]) -> List[Dict]:
+         messages = [{"role": "system", "content": SYSTEM_PROMPT}]
+
+         for turn in history[-6:]:  # keep last 3 exchanges
+             messages.append({"role": turn["role"], "content": turn["content"]})
+
+         messages.append({"role": "user", "content": user_message})
+         return messages
+
+
+ # ──────────────────────────────────────────────────────────────────────────────
+ # MLflow wrapper for Databricks registration
+ # ──────────────────────────────────────────────────────────────────────────────
+
+ class DEAgentPyFunc:
+     """
+     Thin MLflow PyFunc wrapper. Log with:
+
+         import mlflow
+         mlflow.pyfunc.log_model(
+             artifact_path="de_agent",
+             python_model=DEAgentPyFunc(),
+             pip_requirements=["groq", "langchain", "chromadb", ...],
+         )
+     """
+
+     def load_context(self, context):
+         pdf_path = context.artifacts.get("pdf_path", "knowledge/data_engineering_patterns.pdf")
+         groq_key = os.environ.get("GROQ_API_KEY", "")
+         self.rag = DataEngineeringRAG(pdf_path=pdf_path, groq_api_key=groq_key)
+         self.rag.initialize()
+         self.agent = DataEngineeringAgent(rag=self.rag, groq_api_key=groq_key)
+
+     def predict(self, context, model_input):
+         return self.agent.predict(context, model_input)
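
The tool-calling loop in `invoke` can be hard to follow with the network calls in the way. Below is a minimal, self-contained sketch of the same control flow, using a scripted stand-in for the model. Every name here (`FakeReply`, `run_tool_loop`, `scripted_model`, `fake_tool`) is hypothetical and exists only for illustration. The `max_rounds` guard is an assumption added for the sketch; the committed code loops until the model stops requesting tools.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class FakeReply:
    """Stand-in for a chat completion message: either text or tool requests."""
    content: str = ""
    tool_calls: List[Dict] = field(default_factory=list)


def run_tool_loop(
    ask_model: Callable[[List[Dict]], FakeReply],
    run_tool: Callable[[str, Dict], str],
    messages: List[Dict],
    max_rounds: int = 4,
) -> str:
    """Call the model, execute any requested tools, and repeat until it answers."""
    reply = ask_model(messages)
    rounds = 0
    while reply.tool_calls and rounds < max_rounds:
        # Record the assistant turn that requested the tools
        messages.append({"role": "assistant", "tool_calls": reply.tool_calls})
        for call in reply.tool_calls:
            result = run_tool(call["name"], call["args"])
            messages.append(
                {"role": "tool", "tool_call_id": call["id"], "content": result}
            )
        reply = ask_model(messages)
        rounds += 1
    return reply.content


def scripted_model(messages: List[Dict]) -> FakeReply:
    """Request one search first, then answer once a tool result is present."""
    if any(m["role"] == "tool" for m in messages):
        return FakeReply(content="Answer based on retrieved context.")
    return FakeReply(
        tool_calls=[
            {"id": "call_1", "name": "search_knowledge_base", "args": {"query": "medallion"}}
        ]
    )


def fake_tool(name: str, args: Dict) -> str:
    """Pretend to run a tool and describe what was asked."""
    return f"{name} results for {args['query']}"


if __name__ == "__main__":
    history = [{"role": "user", "content": "What is Medallion?"}]
    print(run_tool_loop(scripted_model, fake_tool, history))
```

If the round limit is reached while the model is still requesting tools, this sketch returns whatever text the last reply carried, which may be empty; a real implementation would need to decide how to surface that case.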
agent_notebook.py ADDED
@@ -0,0 +1,273 @@
+ # Databricks notebook source
+ # MAGIC %md
+ # MAGIC # 🗄️ Data Engineering Knowledge Agent: Databricks Deployment
+ # MAGIC
+ # MAGIC This notebook deploys the DE Knowledge Assistant as a **Databricks Model Serving endpoint**.
+ # MAGIC
+ # MAGIC Architecture:
+ # MAGIC ```
+ # MAGIC [PDF Knowledge Base] → [ChromaDB Vectors] → [MLflow PyFunc Agent] → [Databricks Model Serving] → [FastAPI PWA]
+ # MAGIC ```
+ # MAGIC
+ # MAGIC Prerequisites (all free on Databricks Community Edition or trial):
+ # MAGIC - Databricks workspace (community.cloud.databricks.com)
+ # MAGIC - GROQ_API_KEY stored in Databricks Secrets
+ # MAGIC - Unity Catalog enabled (optional but recommended)
+
+ # COMMAND ----------
+ # MAGIC %pip install groq langchain langchain-community chromadb sentence-transformers pypdf mlflow fastapi uvicorn
+ # MAGIC dbutils.library.restartPython()
+
+ # COMMAND ----------
+
+ import os
+ import mlflow
+ import mlflow.pyfunc
+ from mlflow.models import infer_signature
+ import pandas as pd
+
+ # ── 1. Configuration ──────────────────────────────────────────────────────────
+ EXPERIMENT_NAME = "/Users/your-email@domain.com/de-knowledge-assistant"
+ MODEL_NAME = "de_knowledge_agent"
+ PDF_VOLUME_PATH = "/Volumes/main/default/knowledge/data_engineering_patterns.pdf"
+ # ^ Upload the PDF to a Unity Catalog Volume first:
+ #   databricks fs cp data_engineering_patterns.pdf dbfs:/Volumes/main/default/knowledge/
+
+ # Retrieve API key from Databricks secrets (safe: never hardcode)
+ GROQ_API_KEY = dbutils.secrets.get(scope="de-assistant", key="groq-api-key")
+ # Create the secret scope first:
+ #   databricks secrets create-scope --scope de-assistant
+ #   databricks secrets put --scope de-assistant --key groq-api-key
+
+ # COMMAND ----------
+ # MAGIC %md ## 2. Define the MLflow PyFunc Model
+
+ # COMMAND ----------
+
+ import sys
+ sys.path.insert(0, "/Workspace/Repos/your-repo/de-assistant")  # adjust to your repo path
+
+ from rag import DataEngineeringRAG
+ from agent import DataEngineeringAgent, DEAgentPyFunc
+
+
+ class DEKnowledgeAssistant(mlflow.pyfunc.PythonModel):
+     """
+     MLflow PyFunc wrapper that:
+     1. Loads the PDF → builds ChromaDB vectors on model load
+     2. Exposes a predict() method compatible with Databricks Model Serving
+     3. Supports chat history for multi-turn conversations
+     """
+
+     def load_context(self, context: mlflow.pyfunc.PythonModelContext):
+         """Called once when the model is loaded into serving."""
+         import os
+
+         pdf_path = context.artifacts.get("pdf_path", PDF_VOLUME_PATH)
+         groq_key = os.environ.get("GROQ_API_KEY", GROQ_API_KEY)
+
+         self.rag = DataEngineeringRAG(pdf_path=pdf_path, groq_api_key=groq_key)
+         self.rag.initialize()
+         self.agent = DataEngineeringAgent(rag=self.rag, groq_api_key=groq_key)
+
+         print("✅ DE Knowledge Agent loaded and ready")
+
+     def predict(
+         self,
+         context: mlflow.pyfunc.PythonModelContext,
+         model_input: pd.DataFrame,
+         params: dict = None,
+     ) -> pd.Series:
+         """
+         Input DataFrame columns:
+         - message (str): user question
+         - history (str, JSON): previous conversation turns
+
+         Returns: pd.Series of string responses
+         """
+         import json
+
+         def process_row(row):
+             history = []
+             if row.get("history"):
+                 try:
+                     history = json.loads(row["history"])
+                 except Exception:
+                     history = []
+             return self.agent.invoke(message=row["message"], history=history)
+
+         return model_input.apply(process_row, axis=1)
+
+
+ # COMMAND ----------
+ # MAGIC %md ## 3. Log the model to MLflow
+
+ # COMMAND ----------
+
+ mlflow.set_experiment(EXPERIMENT_NAME)
+
+ # Example input/output for signature inference
+ sample_input = pd.DataFrame([{
+     "message": "What is the Medallion architecture?",
+     "history": "[]",
+ }])
+
+ with mlflow.start_run(run_name="de_knowledge_agent_v1") as run:
+
+     # Log hyperparameters
+     mlflow.log_params({
+         "llm_model": "llama-3.1-8b-instant",
+         "embedding_model": "all-MiniLM-L6-v2",
+         "chunk_size": 800,
+         "chunk_overlap": 160,
+         "retrieval_strategy": "mmr",
+         "top_k": 5,
+     })
+
+     # Infer signature from sample data
+     model = DEKnowledgeAssistant()
+
+     signature = infer_signature(
+         model_input=sample_input,
+         model_output=pd.Series(["Sample response from DE agent"]),
+     )
+
+     # Log the model
+     mlflow.pyfunc.log_model(
+         artifact_path="de_agent",
+         python_model=model,
+         artifacts={"pdf_path": PDF_VOLUME_PATH},
+         signature=signature,
+         pip_requirements=[
+             "groq>=0.9.0",
+             "langchain>=0.2.0",
+             "langchain-community>=0.2.0",
+             "chromadb>=0.5.0",
+             "sentence-transformers>=3.0.0",
+             "pypdf>=4.0.0",
+             "fastapi>=0.111.0",
+             "uvicorn>=0.30.0",
+         ],
+         registered_model_name=MODEL_NAME,
+     )
+
+     print(f"✅ Model logged. Run ID: {run.info.run_id}")
+
+ # COMMAND ----------
+ # MAGIC %md ## 4. Register and deploy to Model Serving
+
+ # COMMAND ----------
+
+ from mlflow.tracking import MlflowClient
+
+ client = MlflowClient()
+
+ # Get the latest version
+ latest = client.get_latest_versions(MODEL_NAME, stages=["None"])[0]
+ version = latest.version
+ print(f"Latest model version: {version}")
+
+ # Transition to Production
+ client.transition_model_version_stage(
+     name=MODEL_NAME,
+     version=version,
+     stage="Production",
+     archive_existing_versions=True,
+ )
+ print(f"✅ Model v{version} promoted to Production")
+
+ # COMMAND ----------
+ # MAGIC %md
+ # MAGIC ## 5. Create a Databricks Model Serving endpoint
+ # MAGIC
+ # MAGIC Run this via the Databricks SDK or UI:
+ # MAGIC
+ # MAGIC **UI path**: Machine Learning → Serving → Create Serving Endpoint
+ # MAGIC - Name: `de-knowledge-assistant`
+ # MAGIC - Model: `de_knowledge_agent` (Production)
+ # MAGIC - Compute: Small (CPU), sufficient for this workload
+ # MAGIC - Environment variables: `GROQ_API_KEY` = your Groq key
+
+ # COMMAND ----------
+ # MAGIC # (Optional) SDK deployment
+
+ try:
+     from databricks.sdk import WorkspaceClient
+     from databricks.sdk.service.serving import (
+         EndpointCoreConfigInput,
+         ServedModelInput,
+         ServedModelInputWorkloadSize,
+     )
+
+     w = WorkspaceClient()
+
+     endpoint_config = EndpointCoreConfigInput(
+         name="de-knowledge-assistant",
+         served_models=[
+             ServedModelInput(
+                 model_name=MODEL_NAME,
+                 model_version=str(version),
+                 workload_size=ServedModelInputWorkloadSize.SMALL,
+                 scale_to_zero_enabled=True,  # cost-saving: scale down when idle
+                 environment_vars={"GROQ_API_KEY": "{{secrets/de-assistant/groq-api-key}}"},
+             )
+         ],
+     )
+
+     # create() takes the endpoint name as its own argument in addition to the config
+     w.serving_endpoints.create(name="de-knowledge-assistant", config=endpoint_config)
+     print("✅ Serving endpoint created. Check the Databricks UI for status")
+
+ except ImportError:
+     print("databricks-sdk not installed. Create the endpoint via the Databricks UI instead")
+
+ # COMMAND ----------
+ # MAGIC %md
+ # MAGIC ## 6. Test the endpoint
+
+ # COMMAND ----------
+
+ import requests
+ import json
+
+ ENDPOINT_URL = "https://<your-workspace>.azuredatabricks.net/serving-endpoints/de-knowledge-assistant/invocations"
+ TOKEN = dbutils.notebook.entry_point.getDbutils().notebook().getContext().apiToken().get()
+
+ test_payload = {
+     "dataframe_records": [
+         {
+             "message": "Explain the Medallion architecture and give a PySpark example",
+             "history": "[]",
+         }
+     ]
+ }
+
+ response = requests.post(
+     ENDPOINT_URL,
+     headers={"Authorization": f"Bearer {TOKEN}", "Content-Type": "application/json"},
+     data=json.dumps(test_payload),
+     timeout=60,
+ )
+
+ print("Status:", response.status_code)
+ print("Response:", response.json()["predictions"][0][:500])
+
+ # COMMAND ----------
+ # MAGIC %md
+ # MAGIC ## 7. Connect the FastAPI PWA to your Databricks endpoint
+ # MAGIC
+ # MAGIC To route chat requests through Databricks instead of the Groq streaming call, add a method like this to the agent and call it from `app.py`:
+ # MAGIC
+ # MAGIC ```python
+ # MAGIC # In agent.py, add this alternative invoke method:
+ # MAGIC def invoke_via_databricks(self, message: str, history: list) -> str:
+ # MAGIC     import requests, json
+ # MAGIC     payload = {"dataframe_records": [{"message": message, "history": json.dumps(history)}]}
+ # MAGIC     r = requests.post(
+ # MAGIC         os.environ["DATABRICKS_ENDPOINT_URL"],
+ # MAGIC         headers={"Authorization": f"Bearer {os.environ['DATABRICKS_TOKEN']}"},
+ # MAGIC         json=payload, timeout=30,
+ # MAGIC     )
+ # MAGIC     return r.json()["predictions"][0]
+ # MAGIC ```
+ # MAGIC
+ # MAGIC Set `DATABRICKS_ENDPOINT_URL` and `DATABRICKS_TOKEN` in your Hugging Face Spaces secrets.
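
Both the test cell and `invoke_via_databricks` send the same request shape: a `dataframe_records` list whose `history` field is a JSON string rather than a nested list. The sketch below builds and unpacks that shape without calling a real endpoint. The helper names `build_payload` and `first_prediction` are hypothetical, and the response layout is assumed to match the `{"predictions": [...]}` form the notebook reads.

```python
import json
from typing import Any, Dict, List


def build_payload(message: str, history: List[Dict[str, str]]) -> Dict[str, Any]:
    """Wrap one chat turn in the dataframe_records format used by the notebook."""
    return {
        "dataframe_records": [
            # history is serialised to a JSON string, matching predict()'s json.loads
            {"message": message, "history": json.dumps(history)}
        ]
    }


def first_prediction(response_body: Dict[str, Any]) -> str:
    """Pull the first answer out of a body shaped like {"predictions": [...]}."""
    predictions = response_body.get("predictions") or []
    if not predictions:
        raise ValueError("response contained no predictions")
    return str(predictions[0])


if __name__ == "__main__":
    payload = build_payload("What is CDC?", [{"role": "user", "content": "hi"}])
    print(json.dumps(payload, indent=2))
    # Hand-written sample body standing in for a real endpoint response
    print(first_prediction({"predictions": ["CDC captures row-level changes."]}))
```

Checking for an empty `predictions` list before indexing gives a clearer error than the bare `["predictions"][0]` lookup when the endpoint returns an error body.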
app.py ADDED
@@ -0,0 +1,190 @@
+ """
+ Data Engineering Knowledge Assistant: FastAPI Server
+ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
+ Serves:
+   POST /api/chat    → streaming SSE chat
+   GET  /api/health  → readiness probe
+   POST /api/search  → raw vector search
+   *    /            → PWA frontend (static/)
+
+ Deploy targets (all free):
+   • Local        : python app.py
+   • Hugging Face : Set GROQ_API_KEY in Spaces secrets, port 7860
+   • Databricks   : See databricks/agent_notebook.py
+ """
+ from __future__ import annotations
+
+ import os
+ import json
+ from contextlib import asynccontextmanager
+ from typing import List, Optional
+
+ from fastapi import FastAPI, HTTPException, Query
+ from fastapi.middleware.cors import CORSMiddleware
+ from fastapi.responses import StreamingResponse
+ from fastapi.staticfiles import StaticFiles
+ from pydantic import BaseModel
+
+ # ──────────────────────────────────────────────────────────────────────────────
+ # Global state
+ # ──────────────────────────────────────────────────────────────────────────────
+
+ rag_pipeline = None
+ agent = None
+
+
+ # ──────────────────────────────────────────────────────────────────────────────
+ # Lifespan: init on startup
+ # ──────────────────────────────────────────────────────────────────────────────
+
+ @asynccontextmanager
+ async def lifespan(app: FastAPI):
+     global rag_pipeline, agent
+
+     from rag import DataEngineeringRAG
+     from agent import DataEngineeringAgent
+
+     pdf_path = os.environ.get(
+         "PDF_PATH", "knowledge/data_engineering_patterns.pdf"
+     )
+     groq_key = os.environ.get("GROQ_API_KEY", "")
+
+     if not groq_key:
+         print(
+             "⚠️ GROQ_API_KEY not set. Get a free key at https://console.groq.com"
+         )
+
+     print("🚀 Starting Data Engineering Knowledge Assistant …")
+     rag_pipeline = DataEngineeringRAG(pdf_path=pdf_path, groq_api_key=groq_key)
+     rag_pipeline.initialize()
+
+     agent = DataEngineeringAgent(rag=rag_pipeline, groq_api_key=groq_key)
+     print("✅ Agent ready, listening for requests")
+
+     yield
+
+     print("👋 Shutting down")
+
+
69
+ # ──────────────────────────────────────────────────────────────────────────────
70
+ # App
71
+ # ──────────────────────────────────────────────────────────────────────────────
72
+
73
+ app = FastAPI(
74
+ title="DE Knowledge Assistant",
75
+ description="Low-latency Databricks-style RAG agent for Data Engineering",
76
+ version="1.0.0",
77
+ lifespan=lifespan,
78
+ )
79
+
80
+ app.add_middleware(
81
+ CORSMiddleware,
82
+ allow_origins=["*"],
83
+ allow_methods=["*"],
84
+ allow_headers=["*"],
85
+ )
86
+
87
+
88
+ # ──────────────────────────────────────────────────────────────────────────────
89
+ # Schemas
90
+ # ──────────────────────────────────────────────────────────────────────────────
91
+
92
+ class ChatMessage(BaseModel):
93
+ role: str # "user" | "assistant"
94
+ content: str
95
+
96
+
97
+ class ChatRequest(BaseModel):
98
+ message: str
99
+ history: Optional[List[ChatMessage]] = []
100
+ stream: bool = True
101
+
102
+
103
+ class SearchRequest(BaseModel):
104
+ query: str
105
+ k: int = 5
106
+
107
+
108
+ # ──────────────────────────────────────────────────────────────────────────────
109
+ # Routes
110
+ # ──────────────────────────────────────────────────────────────────────────────
111
+
112
+ @app.get("/api/health")
113
+ async def health():
114
+ return {
115
+ "status": "healthy",
116
+ "model": "llama-3.1-8b-instant (Groq)",
117
+ "vectorstore_docs": rag_pipeline.get_doc_count() if rag_pipeline else 0,
118
+ "agent_type": "Databricks-compatible MLflow Agent",
119
+ "version": "1.0.0",
120
+ }
121
+
122
+
123
+ @app.post("/api/chat")
124
+ async def chat(req: ChatRequest):
125
+ """
126
+ Chat endpoint.
127
+
128
+ • stream=true → Server-Sent Events (SSE), lowest perceived latency
129
+ • stream=false → JSON response (simpler, for testing)
130
+ """
131
+ if not agent:
132
+ raise HTTPException(503, "Agent not initialised; check server logs")
133
+
134
+ history = [m.model_dump() for m in (req.history or [])] # history is Optional, so guard against null
135
+
136
+ if req.stream:
137
+ async def event_stream():
138
+ try:
139
+ async for chunk in agent.astream(message=req.message, history=history):
140
+ payload = json.dumps({"chunk": chunk})
141
+ yield f"data: {payload}\n\n"
142
+ yield "data: [DONE]\n\n"
143
+ except Exception as exc:
144
+ err = json.dumps({"error": str(exc)})
145
+ yield f"data: {err}\n\n"
146
+
147
+ return StreamingResponse(
148
+ event_stream(),
149
+ media_type="text/event-stream",
150
+ headers={
151
+ "Cache-Control": "no-cache",
152
+ "X-Accel-Buffering": "no", # disable nginx buffering
153
+ },
154
+ )
155
+ else:
156
+ response = agent.invoke(message=req.message, history=history)
157
+ return {"response": response}
158
+
159
+
160
+ @app.post("/api/search")
161
+ async def search(req: SearchRequest):
162
+ """Raw semantic search, useful for debugging retrieval quality."""
163
+ if not rag_pipeline:
164
+ raise HTTPException(503, "RAG not initialised")
165
+ results = rag_pipeline.search(req.query, k=req.k)
166
+ return {"query": req.query, "results": results}
167
+
168
+
169
+ # ──────────────────────────────────────────────────────────────────────────────
170
+ # Static frontend: mount LAST so API routes take priority
171
+ # ──────────────────────────────────────────────────────────────────────────────
172
+
173
+ app.mount("/", StaticFiles(directory="static", html=True), name="static")
174
+
175
+
176
+ # ──────────────────────────────────────────────────────────────────────────────
177
+ # Entry point
178
+ # ──────────────────────────────────────────────────────────────────────────────
179
+
180
+ if __name__ == "__main__":
181
+ import uvicorn
182
+
183
+ port = int(os.environ.get("PORT", 7860)) # 7860 = HuggingFace Spaces default
184
+ uvicorn.run(
185
+ "app:app",
186
+ host="0.0.0.0",
187
+ port=port,
188
+ reload=False,
189
+ log_level="info",
190
+ )
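The `/api/chat` stream above frames each piece of the reply as a `data: {"chunk": ...}` line followed by a blank line, and ends with `data: [DONE]`. The following is a minimal sketch of that framing in plain Python, useful for testing a client without a running server. The helper names `encode_sse` and `decode_sse` are hypothetical and are not part of `app.py`.

```python
import json
from typing import Iterable, Iterator


def encode_sse(chunks: Iterable[str]) -> Iterator[str]:
    """Frame text chunks the same way the event_stream() generator does."""
    for chunk in chunks:
        yield f"data: {json.dumps({'chunk': chunk})}\n\n"
    # Sentinel frame that tells the client the reply is complete.
    yield "data: [DONE]\n\n"


def decode_sse(frames: Iterable[str]) -> str:
    """Reassemble the full reply from SSE frames; raise on an error frame."""
    parts = []
    for frame in frames:
        for line in frame.splitlines():
            if not line.startswith("data: "):
                continue  # ignore blank separator lines
            payload = line[len("data: "):].strip()
            if payload == "[DONE]":
                return "".join(parts)
            event = json.loads(payload)
            if "error" in event:
                # The server reports failures as {"error": "..."} frames.
                raise RuntimeError(event["error"])
            parts.append(event.get("chunk", ""))
    return "".join(parts)
```

Calling `decode_sse(encode_sse(["Medallion ", "architecture"]))` round-trips the two chunks into a single string. Unlike the bundled frontend, this sketch surfaces `{"error": ...}` frames as exceptions rather than skipping them.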
data_engineering_patterns.pdf ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:bfe5db39e0d8edc192a683ec92953e95605edf621595be85f4f5e105c80423e8
3
+ size 8372689
index.html ADDED
@@ -0,0 +1,555 @@
1
+ <!DOCTYPE html>
2
+ <html lang="en">
3
+ <head>
4
+ <meta charset="UTF-8" />
5
+ <meta name="viewport" content="width=device-width, initial-scale=1, viewport-fit=cover" />
6
+ <meta name="apple-mobile-web-app-capable" content="yes" />
7
+ <meta name="apple-mobile-web-app-status-bar-style" content="black-translucent" />
8
+ <meta name="apple-mobile-web-app-title" content="DE Assistant" />
9
+ <meta name="theme-color" content="#1a1a2e" />
10
+ <link rel="manifest" href="/manifest.json" />
11
+ <link rel="apple-touch-icon" href="/icon-192.png" />
12
+ <title>DE Knowledge Assistant</title>
13
+ <style>
14
+ /* ── Reset & tokens ─────────────────────────────────────── */
15
+ *, *::before, *::after { box-sizing: border-box; margin: 0; padding: 0; }
16
+
17
+ :root {
18
+ --bg: #0d0d1a;
19
+ --surface: #1a1a2e;
20
+ --surface2: #16213e;
21
+ --accent: #e94560;
22
+ --accent2: #0f3460;
23
+ --user-bg: #0f3460;
24
+ --bot-bg: #1a1a2e;
25
+ --text: #e0e0e0;
26
+ --text-dim: #888;
27
+ --border: #2a2a4a;
28
+ --green: #00c896;
29
+ --radius: 18px;
30
+ --safe-top: env(safe-area-inset-top, 0px);
31
+ --safe-bot: env(safe-area-inset-bottom, 0px);
32
+ }
33
+
34
+ html, body { height: 100%; font-family: -apple-system, BlinkMacSystemFont,
35
+ "Segoe UI", Roboto, sans-serif; background: var(--bg); color: var(--text); }
36
+
37
+ /* ── Layout ─────────────────────────────────────────────── */
38
+ #app { display: flex; flex-direction: column; height: 100dvh;
39
+ padding-top: var(--safe-top); }
40
+
41
+ /* ── Header ─────────────────────────────────────────────── */
42
+ header {
43
+ display: flex; align-items: center; gap: 12px;
44
+ padding: 14px 18px;
45
+ background: var(--surface);
46
+ border-bottom: 1px solid var(--border);
47
+ backdrop-filter: blur(20px);
48
+ -webkit-backdrop-filter: blur(20px);
49
+ }
50
+ .logo { width: 36px; height: 36px; border-radius: 10px;
51
+ background: linear-gradient(135deg, var(--accent), var(--accent2));
52
+ display: grid; place-items: center; font-size: 18px; flex-shrink: 0; }
53
+ .header-text h1 { font-size: 15px; font-weight: 700; }
54
+ .header-text p { font-size: 11px; color: var(--text-dim); }
55
+ #status-dot { width: 8px; height: 8px; border-radius: 50%;
56
+ background: #555; margin-left: auto; flex-shrink: 0; transition: background .3s; }
57
+ #status-dot.ready { background: var(--green); box-shadow: 0 0 6px var(--green); }
58
+ #status-dot.loading { background: #f5a623; animation: pulse 1s infinite; }
59
+
60
+ @keyframes pulse { 0%,100%{opacity:1} 50%{opacity:.4} }
61
+
62
+ /* ── Messages ───────────────────────────────────────────── */
63
+ #messages { flex: 1; overflow-y: auto; padding: 16px 14px;
64
+ scroll-behavior: smooth; }
65
+ #messages::-webkit-scrollbar { width: 4px; }
66
+ #messages::-webkit-scrollbar-track { background: transparent; }
67
+ #messages::-webkit-scrollbar-thumb { background: var(--border); border-radius: 2px; }
68
+
69
+ .msg { display: flex; gap: 10px; margin-bottom: 16px; animation: fadeIn .25s ease; }
70
+ @keyframes fadeIn { from{opacity:0;transform:translateY(8px)} to{opacity:1;transform:none} }
71
+
72
+ .msg.user { flex-direction: row-reverse; }
73
+ .msg.bot { flex-direction: row; }
74
+
75
+ .avatar { width: 32px; height: 32px; border-radius: 50%; display: grid;
76
+ place-items: center; font-size: 14px; flex-shrink: 0; align-self: flex-end; }
77
+ .msg.user .avatar { background: var(--accent); }
78
+ .msg.bot .avatar { background: var(--accent2); }
79
+
80
+ .bubble { max-width: min(75vw, 520px); padding: 12px 16px;
81
+ border-radius: var(--radius); line-height: 1.6; font-size: 14px;
82
+ word-wrap: break-word; }
83
+ .msg.user .bubble { background: var(--user-bg); border-bottom-right-radius: 4px; }
84
+ .msg.bot .bubble { background: var(--bot-bg); border: 1px solid var(--border);
85
+ border-bottom-left-radius: 4px; }
86
+
87
+ /* markdown-ish inside bubbles */
88
+ .bubble code { background: rgba(255,255,255,.08); padding: 2px 6px;
89
+ border-radius: 4px; font-family: "SF Mono", Menlo, monospace;
90
+ font-size: 12px; }
91
+ .bubble pre { background: #0a0a1a; border: 1px solid var(--border);
92
+ border-radius: 10px; padding: 12px; overflow-x: auto;
93
+ margin: 8px 0; }
94
+ .bubble pre code { background: none; padding: 0; font-size: 12px; }
95
+ .bubble strong { color: #fff; }
96
+ .bubble p { margin-bottom: 8px; }
97
+ .bubble p:last-child { margin-bottom: 0; }
98
+ .bubble ul { padding-left: 18px; }
99
+ .bubble li { margin-bottom: 4px; }
100
+ .bubble blockquote { border-left: 3px solid var(--accent);
101
+ padding-left: 12px; color: var(--text-dim); margin: 8px 0; }
102
+
103
+ /* sources badge */
104
+ .sources { margin-top: 10px; display: flex; flex-wrap: wrap; gap: 6px; }
105
+ .source-chip { font-size: 10px; background: rgba(15,52,96,.6);
106
+ border: 1px solid var(--accent2); border-radius: 20px;
107
+ padding: 3px 9px; color: #a0b4d0; }
108
+
109
+ /* typing indicator */
110
+ .typing-dots span { display: inline-block; width: 6px; height: 6px;
111
+ border-radius: 50%; background: var(--text-dim);
112
+ margin: 0 2px; animation: bounce .9s infinite; }
113
+ .typing-dots span:nth-child(2) { animation-delay: .15s; }
114
+ .typing-dots span:nth-child(3) { animation-delay: .3s; }
115
+ @keyframes bounce { 0%,60%,100%{transform:translateY(0)} 30%{transform:translateY(-6px)} }
116
+
117
+ /* ── Welcome card ───────────────────────────────────────── */
118
+ #welcome { text-align: center; padding: 40px 20px; }
119
+ #welcome .icon { font-size: 52px; margin-bottom: 12px; }
120
+ #welcome h2 { font-size: 20px; margin-bottom: 8px; }
121
+ #welcome p { color: var(--text-dim); font-size: 13px; max-width: 280px; margin: 0 auto 20px; }
122
+ .starters { display: flex; flex-direction: column; gap: 8px; max-width: 340px; margin: 0 auto; }
123
+ .starter-btn { background: var(--surface); border: 1px solid var(--border);
124
+ border-radius: 12px; padding: 11px 16px; color: var(--text);
125
+ font-size: 13px; cursor: pointer; text-align: left; transition: border-color .2s, background .2s; }
126
+ .starter-btn:hover { border-color: var(--accent); background: var(--surface2); }
127
+
128
+ /* ── Input bar ──────────────────────────────────────────── */
129
+ #input-bar {
130
+ display: flex; align-items: flex-end; gap: 8px;
131
+ padding: 10px 14px calc(10px + var(--safe-bot));
132
+ background: var(--surface);
133
+ border-top: 1px solid var(--border);
134
+ }
135
+
136
+ #msg-input {
137
+ flex: 1; background: var(--surface2); border: 1px solid var(--border);
138
+ border-radius: 22px; padding: 10px 16px; color: var(--text);
139
+ font-size: 15px; resize: none; min-height: 44px; max-height: 120px;
140
+ line-height: 1.4; outline: none; font-family: inherit;
141
+ transition: border-color .2s;
142
+ }
143
+ #msg-input:focus { border-color: var(--accent); }
144
+ #msg-input::placeholder { color: var(--text-dim); }
145
+
146
+ .icon-btn {
147
+ width: 44px; height: 44px; border-radius: 50%; border: none;
148
+ cursor: pointer; display: grid; place-items: center;
149
+ font-size: 18px; flex-shrink: 0; transition: transform .15s, background .2s;
150
+ }
151
+ .icon-btn:active { transform: scale(.9); }
152
+
153
+ #send-btn { background: var(--accent); }
154
+ #send-btn:disabled { background: #555; cursor: not-allowed; }
155
+
156
+ #voice-btn { background: var(--surface2); border: 1px solid var(--border); }
157
+ #voice-btn.recording {
158
+ background: var(--accent) !important;
159
+ animation: ripple 1.2s ease-out infinite;
160
+ }
161
+ @keyframes ripple {
162
+ 0% { box-shadow: 0 0 0 0 rgba(233,69,96,.6); }
163
+ 70% { box-shadow: 0 0 0 16px rgba(233,69,96,0); }
164
+ 100%{ box-shadow: 0 0 0 0 rgba(233,69,96,0); }
165
+ }
166
+
167
+ /* ── Voice overlay ──────────────────────────────────────── */
168
+ #voice-overlay {
169
+ display: none; position: fixed; inset: 0; background: rgba(0,0,0,.8);
170
+ backdrop-filter: blur(10px); flex-direction: column;
171
+ align-items: center; justify-content: center; gap: 20px; z-index: 100;
172
+ }
173
+ #voice-overlay.active { display: flex; }
174
+ #voice-wave { font-size: 64px; animation: pulse 1s infinite; }
175
+ #voice-transcript { color: #ddd; font-size: 16px; max-width: 300px;
176
+ text-align: center; min-height: 40px; }
177
+ #voice-cancel { background: var(--surface); border: 1px solid var(--border);
178
+ color: var(--text); border-radius: 50px; padding: 12px 28px;
179
+ font-size: 15px; cursor: pointer; }
180
+
181
+ /* ── Toast ──────────────────────────────────────────────── */
182
+ #toast { position: fixed; bottom: calc(90px + var(--safe-bot)); left: 50%;
183
+ transform: translateX(-50%) translateY(20px);
184
+ background: #222; border: 1px solid var(--border); border-radius: 20px;
185
+ padding: 8px 18px; font-size: 13px; opacity: 0; pointer-events: none;
186
+ transition: opacity .3s, transform .3s; white-space: nowrap; z-index: 50; }
187
+ #toast.show { opacity: 1; transform: translateX(-50%) translateY(0); }
188
+ </style>
189
+ </head>
190
+ <body>
191
+ <div id="app">
192
+ <!-- Header -->
193
+ <header>
194
+ <div class="logo">🗄️</div>
195
+ <div class="header-text">
196
+ <h1>DE Knowledge Assistant</h1>
197
+ <p id="header-sub">Connecting…</p>
198
+ </div>
199
+ <div id="status-dot" title="Agent status"></div>
200
+ </header>
201
+
202
+ <!-- Messages -->
203
+ <div id="messages">
204
+ <div id="welcome">
205
+ <div class="icon">⚡</div>
206
+ <h2>Ask me anything about<br>Data Engineering</h2>
207
+ <p>Powered by "Data Engineering Design Patterns" + Groq's ultra-fast inference</p>
208
+ <div class="starters">
209
+ <button class="starter-btn" onclick="sendStarter(this)">What is the Medallion architecture and when should I use it?</button>
210
+ <button class="starter-btn" onclick="sendStarter(this)">Show me a PySpark CDC (Change Data Capture) example</button>
211
+ <button class="starter-btn" onclick="sendStarter(this)">Compare Lambda vs Kappa architecture</button>
212
+ <button class="starter-btn" onclick="sendStarter(this)">How do I handle late-arriving data in streaming pipelines?</button>
213
+ </div>
214
+ </div>
215
+ </div>
216
+
217
+ <!-- Input bar -->
218
+ <div id="input-bar">
219
+ <button id="voice-btn" class="icon-btn" onclick="toggleVoice()" title="Voice input">🎤</button>
220
+ <textarea id="msg-input" rows="1" placeholder="Ask a data engineering question…"
221
+ onkeydown="handleKey(event)" oninput="autoResize(this)"></textarea>
222
+ <button id="send-btn" class="icon-btn" onclick="sendMessage()" title="Send">➤</button>
223
+ </div>
224
+ </div>
225
+
226
+ <!-- Voice overlay -->
227
+ <div id="voice-overlay">
228
+ <div id="voice-wave">🎙️</div>
229
+ <div id="voice-transcript">Listening…</div>
230
+ <button id="voice-cancel" onclick="stopVoice()">✕ Cancel</button>
231
+ </div>
232
+
233
+ <!-- Toast -->
234
+ <div id="toast"></div>
235
+
236
+ <script>
237
+ // ── State ────────────────────────────────────────────────────────────────────
238
+ const state = {
239
+ history: [],
240
+ isLoading: false,
241
+ recognition: null,
242
+ synthesis: window.speechSynthesis || null,
243
+ voiceActive: false,
244
+ currentUtterance: null,
245
+ };
246
+
247
+ const BASE_URL = window.location.origin;
248
+
249
+ // ── DOM helpers ──────────────────────────────────────────────────────────────
250
+ const $ = id => document.getElementById(id);
251
+ const msgInput = $('msg-input');
252
+ const messages = $('messages');
253
+ const sendBtn = $('send-btn');
254
+ const voiceBtn = $('voice-btn');
255
+ const statusDot = $('status-dot');
256
+ const headerSub = $('header-sub');
257
+ const welcome = $('welcome');
258
+ const voiceOverlay = $('voice-overlay');
259
+ const voiceTranscript = $('voice-transcript');
260
+
261
+ // ── Health check ─────────────────────────────────────────────────────────────
262
+ async function checkHealth() {
263
+ try {
264
+ const r = await fetch(`${BASE_URL}/api/health`);
265
+ const data = await r.json();
266
+ statusDot.className = 'ready';
267
+ headerSub.textContent = `${data.vectorstore_docs.toLocaleString()} chunks · Groq`;
268
+ } catch {
269
+ statusDot.className = 'loading';
270
+ headerSub.textContent = 'Agent initialising…';
271
+ setTimeout(checkHealth, 3000);
272
+ }
273
+ }
274
+ checkHealth();
275
+
276
+ // ── Markdown renderer (no deps) ───────────────────────────────────────────────
277
+ function renderMarkdown(text) {
278
+ return escHtml(text) // escape raw HTML first so model output cannot inject markup
279
+ // code blocks (content is already escaped above)
280
+ .replace(/```(\w*)\n?([\s\S]*?)```/g, (_, lang, code) =>
281
+ `<pre><code class="lang-${lang}">${code.trim()}</code></pre>`)
282
+ // inline code
283
+ .replace(/`([^`]+)`/g, (_, c) => `<code>${c}</code>`)
284
+ // bold
285
+ .replace(/\*\*(.+?)\*\*/g, '<strong>$1</strong>')
286
+ // italic
287
+ .replace(/\*(.+?)\*/g, '<em>$1</em>')
288
+ // headings
289
+ .replace(/^### (.+)$/gm, '<p><strong>$1</strong></p>')
290
+ .replace(/^## (.+)$/gm, '<p><strong>$1</strong></p>')
291
+ // bullets
292
+ .replace(/^[-•] (.+)$/gm, '<li>$1</li>')
293
+ .replace(/(<li>[\s\S]+?<\/li>)/g, '<ul>$1</ul>')
294
+ // numbered lists
295
+ .replace(/^\d+\. (.+)$/gm, '<li>$1</li>')
296
+ // emoji-prefixed tips
297
+ .replace(/^(💡.+)$/gm, '<blockquote>$1</blockquote>')
298
+ // newlines → paragraphs
299
+ .replace(/\n\n+/g, '</p><p>')
300
+ .replace(/^(?!<)(.+)/gm, '$1')
301
+ .replace(/^<p><\/p>$|^<\/p><p>$/gm, '')
302
+ .trim();
303
+ }
304
+ function escHtml(s) {
305
+ return s.replace(/&/g,'&amp;').replace(/</g,'&lt;').replace(/>/g,'&gt;');
306
+ }
307
+
308
+ // ── Message rendering ─────────────────────────────────────────────────────────
309
+ function appendMessage(role, content, streaming = false) {
310
+ if (welcome.style.display !== 'none' || document.getElementById('welcome')) {
311
+ if ($('welcome')) $('welcome').remove();
312
+ }
313
+
314
+ const wrap = document.createElement('div');
315
+ wrap.className = `msg ${role}`;
316
+ wrap.innerHTML = `
317
+ <div class="avatar">${role === 'user' ? '🧑' : '🤖'}</div>
318
+ <div class="bubble" id="bubble-${Date.now()}">${
319
+ role === 'user' ? escHtml(content) : renderMarkdown(content)
320
+ }</div>`;
321
+ messages.appendChild(wrap);
322
+ messages.scrollTop = messages.scrollHeight;
323
+ return wrap.querySelector('.bubble');
324
+ }
325
+
326
+ function showTyping() {
327
+ if ($('welcome')) $('welcome').remove();
328
+ const wrap = document.createElement('div');
329
+ wrap.className = 'msg bot';
330
+ wrap.id = 'typing-indicator';
331
+ wrap.innerHTML = `
332
+ <div class="avatar">🤖</div>
333
+ <div class="bubble"><div class="typing-dots"><span></span><span></span><span></span></div></div>`;
334
+ messages.appendChild(wrap);
335
+ messages.scrollTop = messages.scrollHeight;
336
+ }
337
+
338
+ function removeTyping() { $('typing-indicator')?.remove(); }
339
+
340
+ // ── Send flow ─────────────────────────────────────────────────────────────────
341
+ async function sendMessage(text) {
342
+ const msg = (text || msgInput.value).trim();
343
+ if (!msg || state.isLoading) return;
344
+
345
+ msgInput.value = '';
346
+ autoResize(msgInput);
347
+ state.isLoading = true;
348
+ sendBtn.disabled = true;
349
+ statusDot.className = 'loading';
350
+
351
+ appendMessage('user', msg);
352
+ showTyping();
353
+
354
+ // Stop any ongoing TTS
355
+ state.synthesis?.cancel();
356
+
357
+ try {
358
+ const res = await fetch(`${BASE_URL}/api/chat`, {
359
+ method: 'POST',
360
+ headers: { 'Content-Type': 'application/json' },
361
+ body: JSON.stringify({ message: msg, history: state.history, stream: true }),
362
+ });
363
+
364
+ removeTyping();
365
+
366
+ if (!res.ok) throw new Error(`Server error ${res.status}`);
367
+
368
+ const bubble = appendMessage('bot', '');
369
+ let full = '';
370
+
371
+ // Server-Sent Events streaming
372
+ const reader = res.body.getReader();
373
+ const decoder = new TextDecoder();
374
+ let buffer = '';
375
+
376
+ while (true) {
377
+ const { done, value } = await reader.read();
378
+ if (done) break;
379
+ buffer += decoder.decode(value, { stream: true });
380
+ const lines = buffer.split('\n');
381
+ buffer = lines.pop();
382
+
383
+ for (const line of lines) {
384
+ if (!line.startsWith('data: ')) continue;
385
+ const payload = line.slice(6).trim();
386
+ if (payload === '[DONE]') break;
387
+ try {
388
+ const { chunk } = JSON.parse(payload);
389
+ if (chunk) {
390
+ full += chunk;
391
+ bubble.innerHTML = renderMarkdown(full);
392
+ messages.scrollTop = messages.scrollHeight;
393
+ }
394
+ } catch { /* skip malformed */ }
395
+ }
396
+ }
397
+
398
+ // Update history
399
+ state.history.push({ role: 'user', content: msg });
400
+ state.history.push({ role: 'assistant', content: full });
401
+ if (state.history.length > 12) state.history = state.history.slice(-12);
402
+
403
+ // Auto-speak response if voice mode was used
404
+ if (state.voiceActive && state.synthesis) {
405
+ speakText(stripMarkdown(full));
406
+ }
407
+
408
+ } catch (err) {
409
+ removeTyping();
410
+ appendMessage('bot', `⚠️ Error: ${err.message}. Check that the server is running and GROQ_API_KEY is set.`);
411
+ } finally {
412
+ state.isLoading = false;
413
+ sendBtn.disabled = false;
414
+ statusDot.className = 'ready';
415
+ }
416
+ }
417
+
418
+ function sendStarter(btn) { sendMessage(btn.textContent.trim()); }
419
+
420
+ // ── Text-to-Speech ────────────────────────────────────────────────────────────
421
+ function speakText(text) {
422
+ if (!state.synthesis) return;
423
+ state.synthesis.cancel();
424
+ const utt = new SpeechSynthesisUtterance(text.slice(0, 800)); // limit TTS length
425
+ utt.rate = 1.05;
426
+ utt.pitch = 1;
427
+ // Prefer a natural English voice
428
+ const voices = state.synthesis.getVoices();
429
+ const preferred = voices.find(v => v.lang.startsWith('en') && v.localService)
430
+ || voices.find(v => v.lang.startsWith('en'))
431
+ || voices[0];
432
+ if (preferred) utt.voice = preferred;
433
+ state.synthesis.speak(utt);
434
+ }
435
+
436
+ function stripMarkdown(text) {
437
+ return text.replace(/```[\s\S]*?```/g, 'code block')
438
+ .replace(/`([^`]+)`/g, '$1')
439
+ .replace(/\*\*(.+?)\*\*/g, '$1')
440
+ .replace(/\*(.+?)\*/g, '$1')
441
+ .replace(/^#+\s/gm, '')
442
+ .replace(/^[-•] /gm, '')
443
+ .replace(/💡/g, 'Pro tip:');
444
+ }
445
+
446
+ // ── Voice Input ───────────────────────────────────────────────────────────────
447
+ function setupSpeechRecognition() {
448
+ const SR = window.SpeechRecognition || window.webkitSpeechRecognition;
449
+ if (!SR) return null;
450
+
451
+ const rec = new SR();
452
+ rec.continuous = false;
453
+ rec.interimResults = true;
454
+ rec.lang = 'en-US';
455
+
456
+ rec.onstart = () => {
457
+ voiceOverlay.classList.add('active');
458
+ voiceBtn.classList.add('recording');
459
+ voiceTranscript.textContent = 'Listening…';
460
+ };
461
+
462
+ rec.onresult = e => {
463
+ let interim = '', final = '';
464
+ for (let r of e.results) {
465
+ if (r.isFinal) final += r[0].transcript;
466
+ else interim += r[0].transcript;
467
+ }
468
+ voiceTranscript.textContent = final || interim || 'Listening…';
469
+ if (final) msgInput.value = final;
470
+ };
471
+
472
+ rec.onerror = err => {
473
+ stopVoice();
474
+ if (err.error === 'not-allowed') toast('🎤 Microphone permission denied');
475
+ else toast(`Voice error: ${err.error}`);
476
+ };
477
+
478
+ rec.onend = () => {
479
+ voiceOverlay.classList.remove('active');
480
+ voiceBtn.classList.remove('recording');
481
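+ const text = msgInput.value.trim();
482
+ // stay in voice mode until the reply has streamed, so it can be read aloud
483
+ (text ? sendMessage(text) : Promise.resolve()).finally(() => { state.voiceActive = false; });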
484
+ };
485
+
486
+ return rec;
487
+ }
488
+
489
+ function toggleVoice() {
490
+ if (state.voiceActive) { stopVoice(); return; }
491
+
492
+ if (!state.recognition) {
493
+ state.recognition = setupSpeechRecognition();
494
+ }
495
+
496
+ if (!state.recognition) {
497
+ toast('🎤 Voice not supported in this browser. Try Chrome on Android or Safari on iOS 14.5+.');
498
+ return;
499
+ }
500
+
501
+ state.voiceActive = true;
502
+ try { state.recognition.start(); }
503
+ catch { state.voiceActive = false; toast('Could not start microphone'); }
504
+ }
505
+
506
+ function stopVoice() {
507
+ state.voiceActive = false;
508
+ state.recognition?.abort();
509
+ voiceOverlay.classList.remove('active');
510
+ voiceBtn.classList.remove('recording');
511
+ }
512
+
513
+ // ── UI helpers ────────────────────────────────────────────────────────────────
514
+ function autoResize(el) {
515
+ el.style.height = 'auto';
516
+ el.style.height = Math.min(el.scrollHeight, 120) + 'px';
517
+ }
518
+
519
+ function handleKey(e) {
520
+ if (e.key === 'Enter' && !e.shiftKey) {
521
+ e.preventDefault();
522
+ sendMessage();
523
+ }
524
+ }
525
+
526
+ function toast(msg, duration = 3000) {
527
+ const t = $('toast');
528
+ t.textContent = msg;
529
+ t.classList.add('show');
530
+ setTimeout(() => t.classList.remove('show'), duration);
531
+ }
532
+
533
+ // ── PWA install banner ────────────────────────────────────────────────────────
534
+ let deferredPrompt;
535
+ window.addEventListener('beforeinstallprompt', e => {
536
+ e.preventDefault();
537
+ deferredPrompt = e;
538
+ setTimeout(() => {
539
+ toast('📲 Add to Home Screen for the best experience!', 5000);
540
+ }, 3000);
541
+ });
542
+
543
+ // ── Service worker ────────────────────────────────────────────────────────────
544
+ if ('serviceWorker' in navigator) {
545
+ navigator.serviceWorker.register('/sw.js').catch(() => {});
546
+ }
547
+
548
+ // Load voices async (required for some browsers)
549
+ if (state.synthesis) {
550
+ state.synthesis.onvoiceschanged = () => state.synthesis.getVoices();
551
+ state.synthesis.getVoices();
552
+ }
553
+ </script>
554
+ </body>
555
+ </html>
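The streaming loop in `sendMessage()` above keeps the last, possibly incomplete line in `buffer` because a network chunk can end in the middle of an SSE line or even in the middle of a multi-byte character. The same technique can be sketched in Python as follows. The function name `iter_lines` is hypothetical and is used only for illustration.

```python
import codecs
from typing import Iterable, Iterator


def iter_lines(byte_chunks: Iterable[bytes]) -> Iterator[str]:
    """Yield complete text lines from arbitrarily split UTF-8 byte chunks."""
    # The incremental decoder holds back partial multi-byte sequences,
    # playing the role of TextDecoder.decode(value, { stream: true }).
    decoder = codecs.getincrementaldecoder("utf-8")()
    buffer = ""
    for raw in byte_chunks:
        buffer += decoder.decode(raw)
        # Everything before the last newline is complete; the tail may not be.
        *complete, buffer = buffer.split("\n")
        yield from complete
    # Flush whatever is left once the stream ends.
    buffer += decoder.decode(b"", final=True)
    if buffer:
        yield buffer
```

For example, `list(iter_lines([b"data: a\nda", b"ta: b\n"]))` produces the two lines `data: a` and `data: b`, even though the second line arrived split across two chunks.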
manifest.json ADDED
@@ -0,0 +1,37 @@
1
+ {
2
+ "name": "DE Knowledge Assistant",
3
+ "short_name": "DE Assistant",
4
+ "description": "Low-latency AI agent for Data Engineering Design Patterns: voice-enabled, Databricks-compatible",
5
+ "start_url": "/",
6
+ "display": "standalone",
7
+ "background_color": "#0d0d1a",
8
+ "theme_color": "#1a1a2e",
9
+ "orientation": "portrait-primary",
10
+ "scope": "/",
11
+ "lang": "en",
12
+ "categories": ["productivity", "education", "developer tools"],
13
+ "icons": [
14
+ {
15
+ "src": "data:image/svg+xml,<svg xmlns='http://www.w3.org/2000/svg' viewBox='0 0 192 192'><rect width='192' height='192' rx='40' fill='%231a1a2e'/><text x='50%25' y='55%25' font-size='100' text-anchor='middle' dominant-baseline='middle'>🗄️</text></svg>",
16
+ "sizes": "192x192",
17
+ "type": "image/svg+xml",
18
+ "purpose": "any maskable"
19
+ },
20
+ {
21
+ "src": "data:image/svg+xml,<svg xmlns='http://www.w3.org/2000/svg' viewBox='0 0 512 512'><rect width='512' height='512' rx='100' fill='%231a1a2e'/><text x='50%25' y='55%25' font-size='280' text-anchor='middle' dominant-baseline='middle'>🗄️</text></svg>",
22
+ "sizes": "512x512",
23
+ "type": "image/svg+xml",
24
+ "purpose": "any maskable"
25
+ }
26
+ ],
27
+ "screenshots": [],
28
+ "shortcuts": [
29
+ {
30
+ "name": "Ask a Question",
31
+ "short_name": "Ask",
32
+ "description": "Open the chat interface",
33
+ "url": "/"
34
+ }
35
+ ],
36
+ "prefer_related_applications": false
37
+ }
rag.py ADDED
@@ -0,0 +1,170 @@
1
+ """
2
+ RAG Pipeline: Data Engineering Knowledge Assistant
3
+ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
4
+ Strategy : PDF → chunked → HuggingFace MiniLM embeddings → ChromaDB (in-memory)
5
+ LLM : Groq llama-3.1-8b-instant (sub-500ms response, free tier)
6
+ Compat : Works standalone OR registered as an MLflow PyFunc model on Databricks
7
+ """
8
+ from __future__ import annotations
9
+
10
+ import os
11
+ from pathlib import Path
12
+ from typing import List, Dict
13
+
14
+
15
+ # ──────────────────────────────────────────────────────────────────────────────
16
+ # Core RAG class
17
+ # ──────────────────────────────────────────────────────────────────────────────
18
+
19
+ class DataEngineeringRAG:
20
+ """
21
+ Retrieval-Augmented Generation pipeline tuned for data-engineering content.
22
+
23
+ Usage (standalone):
24
+ rag = DataEngineeringRAG(pdf_path="knowledge/data_engineering_patterns.pdf",
25
+ groq_api_key=os.environ["GROQ_API_KEY"])
26
+ rag.initialize()
27
+ print(rag.search("What is the Lambda architecture?"))
28
+
29
+ Usage (Databricks):
30
+ Register via mlflow.pyfunc.log_model; see databricks/agent_notebook.py
31
+ """
32
+
33
+ def __init__(self, pdf_path: str, groq_api_key: str):
34
+ self.pdf_path = Path(pdf_path)
35
+ self.groq_api_key = groq_api_key
36
+ self.vectorstore = None
37
+ self.retriever = None
38
+ self._doc_count = 0
39
+ self._initialized = False
40
+
41
+ # ── public ────────────────────────────────────────────────────────────────
42
+
43
+ def initialize(self) -> None:
44
+ """Load PDF → embed → store. Safe to call multiple times (idempotent)."""
45
+ if self._initialized:
46
+ return
47
+
48
+ if not self.pdf_path.exists():
49
+ print(f"⚠️ PDF not found at '{self.pdf_path}'; running in demo mode.")
50
+ self._demo_mode()
51
+ return
52
+
53
+ self._build_vectorstore()
54
+ self._initialized = True
55
+
56
+ def search(self, query: str, k: int = 5) -> List[Dict]:
57
+ """Return ranked chunks relevant to *query*."""
58
+ if not self.vectorstore:
59
+ return []
60
+
61
+ docs_scores = self.vectorstore.similarity_search_with_score(query, k=k)
62
+ return [
63
+ {
64
+ "content": doc.page_content,
65
+ "source": doc.metadata.get("source", "pdf"),
66
+ "page": doc.metadata.get("page", 0),
67
+ "score": round(1 - float(score), 4), # convert distance → similarity
68
+ }
69
+ for doc, score in docs_scores
70
+ ]
71
+
72
+ def get_retriever(self):
73
+ return self.retriever
74
+
75
+ def get_doc_count(self) -> int:
76
+ return self._doc_count
77
+
78
+ # ── private ───────────────────────────────────────────────────────────────
79
+
80
+ def _build_vectorstore(self) -> None:
81
+ from langchain_community.document_loaders import PyPDFLoader
82
+ from langchain.text_splitter import RecursiveCharacterTextSplitter
83
+ from langchain_community.vectorstores import Chroma
84
+ from langchain_community.embeddings import HuggingFaceEmbeddings
85
+
86
+ print(f"📚 Loading '{self.pdf_path.name}' …")
87
+ loader = PyPDFLoader(str(self.pdf_path))
88
+ documents = loader.load()
89
+ print(f" β†’ {len(documents)} pages loaded")
90
+
91
+ # ── Chunk ──────────────────────────────────────────────────────────
92
+ # Smaller chunks (800 chars) with generous overlap keep context intact
93
+ # for technical patterns that often span several paragraphs.
94
+ splitter = RecursiveCharacterTextSplitter(
95
+ chunk_size=800,
96
+ chunk_overlap=160,
97
+ separators=["\n\n", "\n", ". ", "! ", "? ", ", ", " "],
98
+ )
99
+ chunks = splitter.split_documents(documents)
100
+ print(f" β†’ {len(chunks)} chunks created")
+
+        # ── Embed ──────────────────────────────────────────────────────────
+        # all-MiniLM-L6-v2: ~22M parameters (~90 MB download), CPU-friendly,
+        # strong semantic accuracy
+        print("🔒 Embedding chunks (CPU, ~30–60 s on first run) …")
+        embeddings = HuggingFaceEmbeddings(
+            model_name="sentence-transformers/all-MiniLM-L6-v2",
+            model_kwargs={"device": "cpu"},
+            encode_kwargs={"normalize_embeddings": True},
+        )
+
+        # ── Store ──────────────────────────────────────────────────────────
+        # Chroma in-memory: no disk I/O, works on HF Spaces free tier
+        self.vectorstore = Chroma.from_documents(
+            documents=chunks,
+            embedding=embeddings,
+            collection_name="de_patterns",
+        )
+
+        # MMR retriever: diversity + relevance
+        self.retriever = self.vectorstore.as_retriever(
+            search_type="mmr",
+            search_kwargs={"k": 5, "fetch_k": 20, "lambda_mult": 0.6},
+        )
+
+        self._doc_count = len(chunks)
+        print(f"✅ Vector store ready - {self._doc_count} chunks indexed")
+
+    def _demo_mode(self) -> None:
+        """Lightweight fallback when PDF is missing (useful for CI / testing)."""
+        from langchain_community.vectorstores import Chroma
+        from langchain_community.embeddings import HuggingFaceEmbeddings
+        from langchain.schema import Document
+
+        demo_docs = [
+            Document(
+                page_content=(
+                    "The Lambda Architecture splits processing into three layers: "
+                    "batch, speed, and serving. The batch layer reprocesses all historical "
+                    "data; the speed layer handles real-time incremental updates; the serving "
+                    "layer merges both for query."
+                ),
+                metadata={"source": "demo", "page": 0},
+            ),
+            Document(
+                page_content=(
+                    "The Kappa Architecture simplifies Lambda by removing the batch layer. "
+                    "All data flows through a single streaming path. Historical reprocessing "
+                    "is done by replaying the event log."
+                ),
+                metadata={"source": "demo", "page": 1},
+            ),
+            Document(
+                page_content=(
+                    "A Data Lakehouse combines the flexibility of a data lake with the "
+                    "structure and ACID guarantees of a data warehouse. Formats like Delta Lake, "
+                    "Apache Iceberg, and Apache Hudi implement this pattern."
+                ),
+                metadata={"source": "demo", "page": 2},
+            ),
+        ]
+
+        embeddings = HuggingFaceEmbeddings(
+            model_name="sentence-transformers/all-MiniLM-L6-v2",
+            model_kwargs={"device": "cpu"},
+        )
+        self.vectorstore = Chroma.from_documents(demo_docs, embedding=embeddings)
+        self.retriever = self.vectorstore.as_retriever(search_kwargs={"k": 3})
+        self._doc_count = len(demo_docs)
+        self._initialized = True
+        print("✅ Demo mode active - 3 built-in DE patterns loaded")
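
Review note on `rag.py`: the `search` method turns the distance returned by `similarity_search_with_score` into a similarity score. The sketch below illustrates that conversion under two assumptions that the diff does not state explicitly: that the Chroma collection uses its default squared-L2 metric, and that the embeddings are unit-length. Under those assumptions the cosine similarity is `1 - distance / 2`. The helper names here are hypothetical and are not part of the uploaded files.

```python
import math


def normalize(vec):
    """Scale a vector to unit length (what normalize_embeddings=True is assumed to do)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def dot(a, b):
    """Dot product; for unit vectors this equals the cosine similarity."""
    return sum(x * y for x, y in zip(a, b))


def squared_l2(a, b):
    """Squared Euclidean distance, the metric assumed for the vector store."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def cosine_from_squared_l2(distance):
    """For unit vectors, ||a - b||^2 = 2 - 2 * cos(a, b), so cos = 1 - d / 2."""
    return 1.0 - distance / 2.0


# Two arbitrary example vectors, normalized to unit length.
a = normalize([1.0, 2.0, 3.0])
b = normalize([2.0, 1.0, 0.5])

distance = squared_l2(a, b)
similarity = cosine_from_squared_l2(distance)
```

If the collection were configured with a cosine metric instead (where the reported distance is `1 - cos`), a plain `1 - distance` would be the matching conversion, so the formula should follow whichever metric the store actually uses.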
requirements.txt ADDED
@@ -0,0 +1,31 @@
+# Data Engineering Knowledge Assistant - Dependencies
+# All open-source, all free tier compatible
+
+# ── Web framework ─────────────────────────────────────────────────────────────
+fastapi>=0.111.0
+uvicorn[standard]>=0.30.0
+python-multipart>=0.0.9
+
+# ── LLM (Groq: free tier, ultra-low latency) ──────────────────────────────────
+groq>=0.9.0
+
+# ── RAG pipeline ──────────────────────────────────────────────────────────────
+langchain>=0.2.0
+langchain-community>=0.2.0
+pypdf>=4.0.0
+
+# ── Vector store ──────────────────────────────────────────────────────────────
+chromadb>=0.5.0
+
+# ── Embeddings (free, runs on CPU) ────────────────────────────────────────────
+sentence-transformers>=3.0.0
+torch>=2.0.0  # CPU-only; HF Spaces free tier doesn't need GPU
+transformers>=4.40.0
+
+# ── MLflow (Databricks compatibility) ─────────────────────────────────────────
+mlflow>=2.12.0
+
+# ── Utilities ─────────────────────────────────────────────────────────────────
+python-dotenv>=1.0.0
+pydantic>=2.0.0
+httpx>=0.27.0  # async HTTP client for testing
setup.sh ADDED
@@ -0,0 +1,56 @@
+#!/bin/bash
+# ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
+# DE Knowledge Assistant - One-command Local Setup
+# Usage: chmod +x setup.sh && ./setup.sh
+# ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
+set -e
+
+echo ""
+echo "🗄️ Data Engineering Knowledge Assistant Setup"
+echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━"
+echo ""
+
+# ── 1. Python check ───────────────────────────────────────
+PYTHON=$(python3 --version 2>&1 | awk '{print $2}')
+echo "✓ Python $PYTHON found"
+
+# ── 2. Virtual environment ────────────────────────────────
+if [ ! -d ".venv" ]; then
+  echo "→ Creating virtual environment…"
+  python3 -m venv .venv
+fi
+source .venv/bin/activate
+echo "✓ Virtual environment activated"
+
+# ── 3. Install dependencies ───────────────────────────────
+echo "→ Installing dependencies (this takes ~2 min on first run)…"
+pip install -q --upgrade pip
+pip install -q -r requirements.txt
+echo "✓ Dependencies installed"
+
+# ── 4. Environment variables ──────────────────────────────
+if [ ! -f ".env" ]; then
+  cp .env.example .env
+  echo ""
+  echo "⚠️ ACTION REQUIRED:"
+  echo " Edit .env and add your free Groq API key."
+  echo " Get one at: https://console.groq.com (takes 30 seconds)"
+  echo ""
+  if command -v open &>/dev/null; then open https://console.groq.com; fi
+  read -p " Press Enter after you've added your GROQ_API_KEY to .env…" -r
+fi
+echo "✓ Environment configured"
+
+# ── 5. Start server ───────────────────────────────────────
+echo ""
+echo "🚀 Starting DE Knowledge Assistant…"
+echo " First run will download the embedding model (~90 MB) and index the PDF."
+ echo " This takes about 60 seconds. Subsequent starts are instant."
49
+ echo ""
50
+ echo " Open http://localhost:8000 in your browser"
51
+ echo " On iPhone: open Safari β†’ http://your-local-ip:8000 β†’ Share β†’ Add to Home Screen"
52
+ echo ""
53
+
54
+ export $(grep -v '^#' .env | xargs)
55
+ export PORT=8000
56
+ python app.py
sw.js ADDED
@@ -0,0 +1,52 @@
+/**
+ * Service Worker - DE Knowledge Assistant PWA
+ * Strategy: Cache-first for static assets, network-only for API calls
+ * This enables "Add to Home Screen" on iOS Safari and offline shell loading
+ */
+
+const CACHE = 'de-assistant-v1';
+const STATIC_ASSETS = ['/', '/index.html', '/manifest.json'];
+
+// ── Install ──────────────────────────────────────────────────────────────────
+self.addEventListener('install', event => {
+  event.waitUntil(
+    caches.open(CACHE).then(cache => cache.addAll(STATIC_ASSETS))
+  );
+  self.skipWaiting();
+});
+
+// ── Activate ─────────────────────────────────────────────────────────────────
+self.addEventListener('activate', event => {
+  event.waitUntil(
+    caches.keys().then(keys =>
+      Promise.all(keys.filter(k => k !== CACHE).map(k => caches.delete(k)))
+    )
+  );
+  self.clients.claim();
+});
+
+// ── Fetch ────────────────────────────────────────────────────────────────────
+self.addEventListener('fetch', event => {
+  const { request } = event;
+  const url = new URL(request.url);
+
+  // API calls → always network (never cache LLM responses)
+  if (url.pathname.startsWith('/api/')) {
+    event.respondWith(fetch(request));
+    return;
+  }
+
+  // Static assets → cache-first, fall back to network
+  event.respondWith(
+    caches.match(request).then(cached => {
+      if (cached) return cached;
+      return fetch(request).then(response => {
+        if (response.ok) {
+          const clone = response.clone();
+          caches.open(CACHE).then(cache => cache.put(request, clone));
+        }
+        return response;
+      });
+    }).catch(() => caches.match('/index.html'))
+  );
+});