Space: focustiki/Data_eng_designer


focustiki committed on Apr 14

Commit 9bcadf3 (verified) · 1 Parent(s): cb5d2bb

Upload 12 files

Files changed (13)

.gitattributes +1 -0
DEPLOY_TO_IOS.md +156 -0
Dockerfile +37 -0
agent.py +329 -0
agent_notebook.py +273 -0
app.py +190 -0
data_engineering_patterns.pdf +3 -0
index.html +555 -0
manifest.json +37 -0
rag.py +170 -0
requirements.txt +31 -0
setup.sh +56 -0
sw.js +52 -0

.gitattributes CHANGED

@@ -33,3 +33,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.zip filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text

+data_engineering_patterns.pdf filter=lfs diff=lfs merge=lfs -text

DEPLOY_TO_IOS.md ADDED

	@@ -0,0 +1,156 @@

+# 📱 Deploy DE Assistant to iOS — Step by Step ($0)
+## What you get
+A PWA that installs on your iPhone home screen like a native app, with:
+- Full chat interface with markdown rendering
+- **Voice input** (speak your question)
+- **Voice output** (AI reads the answer aloud)
+- Works offline for the UI shell
+- Connected to Groq's free-tier LLM (sub-500ms responses)
+---
+## Option A: Hugging Face Spaces (Recommended — 5 minutes)
+HF Spaces gives you a free public HTTPS URL — required for the PWA to work on iOS.
+### Step 1 — Get a free Groq API key
+1. Go to [console.groq.com](https://console.groq.com)
+2. Sign up → API Keys → Create API Key
+3. Copy the key (starts with `gsk_`)
+### Step 2 — Create a Hugging Face Space
+1. Go to [huggingface.co/new-space](https://huggingface.co/new-space)
+2. Space name: `de-knowledge-assistant`
+3. SDK: **Docker** (not Gradio/Streamlit)
+4. Visibility: Public
+### Step 3 — Create a Dockerfile in the Space
+```dockerfile
+FROM python:3.11-slim
+WORKDIR /app
+COPY requirements.txt .
+RUN pip install --no-cache-dir -r requirements.txt
+COPY . .
+EXPOSE 7860
+CMD ["python", "app.py"]
+```
+### Step 4 — Upload all files
+Push these files to the Space repo (via git or HF UI):
+```
+app.py
+rag.py
+agent.py
+requirements.txt
+static/index.html
+static/manifest.json
+static/sw.js
+knowledge/data_engineering_patterns.pdf
+Dockerfile
+```
+```bash
+# Using git
+git clone https://huggingface.co/spaces/your-username/de-knowledge-assistant
+cd de-knowledge-assistant
+# copy all de-assistant/ files here
+git add .
+git commit -m "Initial deployment"
+git push
+```
+### Step 5 — Set the Groq API key as a secret
+In HF Spaces → Settings → Repository Secrets:
+- Name: `GROQ_API_KEY`
+- Value: your `gsk_...` key
+### Step 6 — Add to iPhone home screen
+1. Wait for the Space to build (~3 min)
+2. Open your Space URL in **Safari on iPhone**:
+   `https://your-username-de-knowledge-assistant.hf.space`
+3. Tap the **Share** button (box with arrow) → **Add to Home Screen**
+4. Name it "DE Assistant" → **Add**
+Done! It now appears on your home screen like a native app. 🎉
+---
+## Option B: Local + Ngrok (instant, for testing)
+Run locally and expose with a free ngrok tunnel so Safari can reach it:
+```bash
+# Terminal 1 — start the app
+./setup.sh
+# Terminal 2 — expose publicly
+brew install ngrok  # or download at ngrok.com
+ngrok http 8000
+```
+Copy the `https://xxxx.ngrok.io` URL → open in iPhone Safari → Add to Home Screen.
+---
+## Enabling Voice on iOS
+Voice requires HTTPS (which HF Spaces provides). After installing the PWA:
+1. Open the app from your home screen
+2. Tap the **🎤 microphone button**
+3. iOS will ask for microphone permission → **Allow**
+4. Speak your question — the AI will reply and read the answer aloud
+> **Tip**: The voice assistant also reads back the AI's answer using the device's
+> built-in text-to-speech (no extra API needed).
+---
+## Architecture Overview
+```
+iPhone (Safari PWA)
+    │
+    │  HTTPS / SSE streaming
+    ▼
+Hugging Face Spaces (free)
+    │  FastAPI  app.py
+    │     ├─ /api/chat  → agent.py (streaming)
+    │     └─ /api/search → rag.py (vector search)
+    │
+    ├─ RAG pipeline
+    │     ├─ PDF → PyPDF2 → 800-char chunks
+    │     ├─ sentence-transformers/all-MiniLM-L6-v2 (free, CPU)
+    │     └─ ChromaDB in-memory (MMR retrieval)
+    │
+    └─ Groq API (free tier)
+          └─ llama-3.1-8b-instant (< 500ms latency)
+```
+---
+## Connecting to Databricks (optional upgrade)
+Once you have a Databricks workspace:
+1. Run `databricks/agent_notebook.py` to register the MLflow model
+2. Create a Model Serving endpoint (free on Databricks trial)
+3. Add `DATABRICKS_ENDPOINT_URL` and `DATABRICKS_TOKEN` to HF Spaces secrets
+4. Route requests to Databricks when those variables are set; `agent.py` does not do this on its own, so add the `invoke_via_databricks` method shown in the notebook
+---
+## Free Tier Limits (as of 2024)
+| Service | Free Limit |
+|---------|------------|
+| Groq API | 14,400 requests/day, 30 req/min |
+| HF Spaces | 2 vCPU, 16 GB RAM, sleeps after inactivity |
+| ChromaDB | Unlimited (in-memory) |
+| sentence-transformers | Unlimited (local) |
+The embedding model (~90 MB) is downloaded on first start and cached in HF Spaces.
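
Reviewer sketch: the architecture overview mentions 800-character chunks, and the notebook logs a `chunk_overlap` of 160. Since `rag.py` is not shown in this part of the diff, the following is only a minimal illustration of fixed-size chunking with overlap; the function name `chunk_text` and the plain character-based split are assumptions, not the project's actual splitter.

```python
from typing import List


def chunk_text(text: str, size: int = 800, overlap: int = 160) -> List[str]:
    """Split text into windows of `size` characters that share `overlap` characters.

    Hypothetical helper for illustration; the real rag.py may split differently.
    """
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("size must be positive and 0 <= overlap < size")
    step = size - overlap  # how far the window advances each time
    chunks: List[str] = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            break  # this window already reached the end of the text
        start += step
    return chunks


if __name__ == "__main__":
    sample = "".join(str(i % 10) for i in range(2000))
    parts = chunk_text(sample)
    print(len(parts), [len(p) for p in parts])
```

The overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk, at the cost of some duplicated text in the index.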

Dockerfile ADDED

	@@ -0,0 +1,37 @@

+# ── Hugging Face Spaces / Docker deployment ───────────────────────────────────
+# Port 7860 is required for HF Spaces
+FROM python:3.11-slim
+# System deps for chromadb and sentence-transformers
+RUN apt-get update && apt-get install -y --no-install-recommends \
+    gcc g++ libgomp1 && \
+    rm -rf /var/lib/apt/lists/*
+WORKDIR /app
+# Install Python deps first (cached layer)
+COPY requirements.txt .
+RUN pip install --no-cache-dir -r requirements.txt
+# Copy all flat-uploaded files
+COPY . .
+# Reorganise flat upload into the directory structure the app expects:
+#   static/  → index.html, manifest.json, sw.js
+#   knowledge/ → data_engineering_patterns.pdf
+RUN mkdir -p static knowledge && \
+    mv index.html manifest.json sw.js static/ 2>/dev/null || true && \
+    mv data_engineering_patterns.pdf knowledge/ 2>/dev/null || true
+# HF Spaces runs as non-root
+RUN useradd -m -u 1000 user
+USER user
+ENV HOME=/home/user PATH=/home/user/.local/bin:$PATH
+EXPOSE 7860
+# PORT env var is set automatically by HF Spaces to 7860
+ENV PORT=7860
+CMD ["python", "app.py"]

agent.py ADDED

	@@ -0,0 +1,329 @@

+"""
+Databricks-Compatible MLflow Agent — Data Engineering Knowledge Assistant
+━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
+• Structured as an MLflow PyFunc model so it can be logged + served on Databricks
+• Uses Groq (llama-3.1-8b-instant) for ultra-low-latency responses
+• Tools: search_knowledge_base, generate_code_example
+• Streaming-first for perceived sub-200ms UI responses
+"""
+from __future__ import annotations
+import os
+import json
+from typing import AsyncIterator, List, Dict, Optional
+from rag import DataEngineeringRAG
+# ──────────────────────────────────────────────────────────────────────────────
+# System prompt
+# ──────────────────────────────────────────────────────────────────────────────
+SYSTEM_PROMPT = """You are an elite Data Engineering Knowledge Assistant, \
+specializing in production-grade data pipelines, architecture patterns, and Databricks.
+Your knowledge comes from "Data Engineering Design Patterns" — a comprehensive guide \
+to solving real data engineering problems.
+Guidelines:
+1. Always ground answers in retrieved context from the knowledge base.
+2. Give concrete, code-inclusive answers when relevant.
+3. Reference specific patterns by name (Lambda, Kappa, Medallion, Lakehouse, etc.).
+4. When asked for code, produce clean Python/PySpark/SQL — working examples only.
+5. Be direct and technical — this user is a practising data engineer.
+6. If unsure, say so rather than hallucinating.
+Format your responses with:
+- A direct answer first
+- Code blocks when applicable (```python or ```sql)
+- Pattern references in **bold**
+- A "💡 Pro tip" line when you know a non-obvious insight
+"""
+# ──────────────────────────────────────────────────────────────────────────────
+# Tool definitions (sent to Groq as JSON tool schemas)
+# ──────────────────────────────────────────────────────────────────────────────
+TOOLS = [
+    {
+        "type": "function",
+        "function": {
+            "name": "search_knowledge_base",
+            "description": (
+                "Retrieve relevant chunks from the Data Engineering Design Patterns book. "
+                "Always call this first before answering any technical question."
+            ),
+            "parameters": {
+                "type": "object",
+                "properties": {
+                    "query": {
+                        "type": "string",
+                        "description": "Semantic search query, e.g. 'CDC pattern with Kafka'",
+                    },
+                    "k": {
+                        "type": "integer",
+                        "description": "Number of chunks to retrieve (default 5)",
+                        "default": 5,
+                    },
+                },
+                "required": ["query"],
+            },
+        },
+    },
+    {
+        "type": "function",
+        "function": {
+            "name": "generate_code_example",
+            "description": "Generate a working PySpark / Python / SQL code example for a DE pattern.",
+            "parameters": {
+                "type": "object",
+                "properties": {
+                    "pattern_name": {"type": "string"},
+                    "language": {
+                        "type": "string",
+                        "enum": ["python", "pyspark", "sql", "scala"],
+                    },
+                    "context": {"type": "string", "description": "What the code should do"},
+                },
+                "required": ["pattern_name", "language", "context"],
+            },
+        },
+    },
+]
+# ──────────────────────────────────────────────────────────────────────────────
+# Agent
+# ──────────────────────────────────────────────────────────────────────────────
+class DataEngineeringAgent:
+    """
+    Agentic wrapper around Groq + RAG.
+    Compatible with MLflow PyFunc interface for Databricks deployment.
+    See databricks/agent_notebook.py for registration instructions.
+    """
+    def __init__(self, rag: DataEngineeringRAG, groq_api_key: str):
+        self.rag = rag
+        self.groq_api_key = groq_api_key
+        self._client = None
+    # ── Groq client (lazy init) ───────────────────────────────────────────────
+    def _get_client(self):
+        if self._client is None:
+            from groq import Groq
+            self._client = Groq(api_key=self.groq_api_key)
+        return self._client
+    # ── Tool execution ────────────────────────────────────────────────────────
+    def _execute_tool(self, tool_name: str, tool_args: Dict) -> str:
+        if tool_name == "search_knowledge_base":
+            results = self.rag.search(
+                query=tool_args.get("query", ""),
+                k=tool_args.get("k", 5),
+            )
+            if not results:
+                return "No relevant content found in the knowledge base."
+            formatted = []
+            for i, r in enumerate(results, 1):
+                formatted.append(
+                    f"[Source {i} | Page {r['page']} | Relevance {r['score']:.2f}]\n"
+                    f"{r['content']}"
+                )
+            return "\n\n---\n\n".join(formatted)
+        elif tool_name == "generate_code_example":
+            # Return structured prompt for the LLM to fill in
+            return (
+                f"Generate a {tool_args.get('language', 'python')} code example "
+                f"for the '{tool_args.get('pattern_name')}' pattern. "
+                f"Context: {tool_args.get('context', '')}. "
+                "Include comments explaining each step."
+            )
+        return f"Tool '{tool_name}' not recognised."
+    # ── Sync invoke ───────────────────────────────────────────────────────────
+    def invoke(self, message: str, history: Optional[List[Dict]] = None) -> str:
+        """Single-turn or multi-turn (history) invocation."""
+        messages = self._build_messages(message, history or [])
+        client = self._get_client()
+        # First call — agent decides whether to use tools
+        response = client.chat.completions.create(
+            model="llama-3.1-8b-instant",
+            messages=messages,
+            tools=TOOLS,
+            tool_choice="auto",
+            temperature=0.2,
+            max_tokens=2048,
+        )
+        msg = response.choices[0].message
+        # Agentic loop: execute tool calls until the model stops requesting them
+        while msg.tool_calls:
+            messages.append(msg)  # add assistant message with tool_calls
+            for tc in msg.tool_calls:
+                tool_result = self._execute_tool(
+                    tc.function.name,
+                    json.loads(tc.function.arguments),
+                )
+                messages.append(
+                    {
+                        "role": "tool",
+                        "tool_call_id": tc.id,
+                        "content": tool_result,
+                    }
+                )
+            # Next iteration
+            response = client.chat.completions.create(
+                model="llama-3.1-8b-instant",
+                messages=messages,
+                tools=TOOLS,
+                tool_choice="auto",
+                temperature=0.2,
+                max_tokens=2048,
+            )
+            msg = response.choices[0].message
+        return msg.content or ""
+    # ── Async streaming invoke ────────────────────────────────────────────────
+    async def astream(
+        self, message: str, history: Optional[List[Dict]] = None
+    ) -> AsyncIterator[str]:
+        """
+        Streaming variant — yields text chunks as soon as they arrive from Groq.
+        Latency: first token typically < 200 ms on Groq's free tier.
+        """
+        import asyncio
+        # Run tool-use loop synchronously (tool calls are fast), then stream final answer
+        messages = self._build_messages(message, history or [])
+        client = self._get_client()
+        # Tool resolution (non-streaming)
+        response = await asyncio.to_thread(
+            client.chat.completions.create,
+            model="llama-3.1-8b-instant",
+            messages=messages,
+            tools=TOOLS,
+            tool_choice="auto",
+            temperature=0.2,
+            max_tokens=64,   # small limit — we just need tool selection
+        )
+        msg = response.choices[0].message
+        while msg.tool_calls:
+            messages.append(msg)
+            for tc in msg.tool_calls:
+                tool_result = self._execute_tool(
+                    tc.function.name,
+                    json.loads(tc.function.arguments),
+                )
+                messages.append(
+                    {"role": "tool", "tool_call_id": tc.id, "content": tool_result}
+                )
+            response = await asyncio.to_thread(
+                client.chat.completions.create,
+                model="llama-3.1-8b-instant",
+                messages=messages,
+                tools=TOOLS,
+                tool_choice="auto",
+                temperature=0.2,
+                max_tokens=64,
+            )
+            msg = response.choices[0].message
+        # Now produce the final answer. The tool-resolution calls cap max_tokens at 64,
+        # so msg.content is only complete when the model stopped on its own.
+        if msg.content and response.choices[0].finish_reason == "stop":
+            for word in msg.content.split(" "):
+                yield word + " "
+                await asyncio.sleep(0)
+            return
+        # Otherwise (empty or truncated answer) stream a fresh, full-length completion
+        stream = await asyncio.to_thread(
+            client.chat.completions.create,
+            model="llama-3.1-8b-instant",
+            messages=messages,
+            temperature=0.3,
+            max_tokens=2048,
+            stream=True,
+        )
+        for chunk in stream:
+            delta = chunk.choices[0].delta.content
+            if delta:
+                yield delta
+    # ── MLflow PyFunc interface (Databricks) ──────────────────────────────────
+    def predict(self, context, model_input) -> str:
+        """
+        MLflow-compatible predict method.
+        Allows the agent to be logged and served via Databricks Model Serving.
+        model_input: pandas DataFrame with columns ["message", "history"]
+        """
+        import pandas as pd
+        if isinstance(model_input, pd.DataFrame):
+            row = model_input.iloc[0]
+            message = row.get("message", "")
+            history = row.get("history", [])
+            if isinstance(history, str):
+                history = json.loads(history)
+        else:
+            message = model_input.get("message", "")
+            history = model_input.get("history", [])
+        return self.invoke(message=message, history=history)
+    # ── Helpers ───────────────────────────────────────────────────────────────
+    def _build_messages(self, user_message: str, history: List[Dict]) -> List[Dict]:
+        messages = [{"role": "system", "content": SYSTEM_PROMPT}]
+        for turn in history[-6:]:   # keep last 3 exchanges
+            messages.append({"role": turn["role"], "content": turn["content"]})
+        messages.append({"role": "user", "content": user_message})
+        return messages
+# ──────────────────────────────────────────────────────────────────────────────
+# MLflow wrapper for Databricks registration
+# ──────────────────────────────────────────────────────────────────────────────
+class DEAgentPyFunc:
+    """
+    Thin MLflow PyFunc wrapper.  Log with:
+        import mlflow
+        mlflow.pyfunc.log_model(
+            artifact_path="de_agent",
+            python_model=DEAgentPyFunc(),
+            pip_requirements=["groq", "langchain", "chromadb", ...],
+        )
+    """
+    def load_context(self, context):
+        pdf_path = context.artifacts.get("pdf_path", "knowledge/data_engineering_patterns.pdf")
+        groq_key = os.environ.get("GROQ_API_KEY", "")
+        self.rag = DataEngineeringRAG(pdf_path=pdf_path, groq_api_key=groq_key)
+        self.rag.initialize()
+        self.agent = DataEngineeringAgent(rag=self.rag, groq_api_key=groq_key)
+    def predict(self, context, model_input):
+        return self.agent.predict(context, model_input)
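
Reviewer sketch: the tool loops in `invoke` and `astream` run until the model stops requesting tools and pass `tc.function.arguments` straight to `json.loads`. One way to guard against an endless loop or malformed arguments is shown below. It uses plain dicts and injected callables instead of the Groq client, and the names `parse_tool_args`, `run_tool_rounds`, and `max_rounds` are hypothetical, introduced only for illustration.

```python
import json
from typing import Callable, Dict, List


def parse_tool_args(raw: str) -> Dict:
    """Decode tool-call arguments, falling back to {} on malformed JSON."""
    try:
        parsed = json.loads(raw or "{}")
    except json.JSONDecodeError:
        return {}
    return parsed if isinstance(parsed, dict) else {}


def run_tool_rounds(
    ask_model: Callable[[List[Dict]], Dict],
    run_tool: Callable[[str, Dict], str],
    messages: List[Dict],
    max_rounds: int = 4,
) -> str:
    """Call the model until it stops requesting tools or max_rounds is reached."""
    reply = ask_model(messages)
    for _ in range(max_rounds):
        calls = reply.get("tool_calls") or []
        if not calls:
            break
        messages.append(reply)  # keep the assistant turn that requested the tools
        for call in calls:
            result = run_tool(call["name"], parse_tool_args(call["arguments"]))
            messages.append(
                {"role": "tool", "tool_call_id": call["id"], "content": result}
            )
        reply = ask_model(messages)
    return reply.get("content") or ""
```

With a cap in place, a model that keeps asking for tools costs at most `max_rounds + 1` completions per request, which matters on a rate-limited free tier.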

agent_notebook.py ADDED

	@@ -0,0 +1,273 @@

+# Databricks notebook source
+# MAGIC %md
+# MAGIC # 🗄️ Data Engineering Knowledge Agent — Databricks Deployment
+# MAGIC
+# MAGIC This notebook deploys the DE Knowledge Assistant as a **Databricks Model Serving endpoint**.
+# MAGIC
+# MAGIC Architecture:
+# MAGIC ```
+# MAGIC [PDF Knowledge Base] → [ChromaDB Vectors] → [MLflow PyFunc Agent] → [Databricks Model Serving] → [FastAPI PWA]
+# MAGIC ```
+# MAGIC
+# MAGIC Prerequisites (all free on Databricks Community Edition or trial):
+# MAGIC - Databricks workspace (community.cloud.databricks.com)
+# MAGIC - GROQ_API_KEY stored in Databricks Secrets
+# MAGIC - Unity Catalog enabled (optional but recommended)
+# COMMAND ----------
+# MAGIC %pip install groq langchain langchain-community chromadb sentence-transformers pypdf mlflow fastapi uvicorn
+# COMMAND ----------
+dbutils.library.restartPython()
+# COMMAND ----------
+import os
+import mlflow
+import mlflow.pyfunc
+from mlflow.models import infer_signature
+import pandas as pd
+# ── 1. Configuration ──────────────────────────────────────────────────────────
+EXPERIMENT_NAME = "/Users/your-email@domain.com/de-knowledge-assistant"
+MODEL_NAME      = "de_knowledge_agent"
+PDF_VOLUME_PATH = "/Volumes/main/default/knowledge/data_engineering_patterns.pdf"
+# ^ Upload the PDF to a Unity Catalog Volume first:
+# databricks fs cp data_engineering_patterns.pdf dbfs:/Volumes/main/default/knowledge/
+# Retrieve API key from Databricks secrets (safe — never hardcode)
+GROQ_API_KEY = dbutils.secrets.get(scope="de-assistant", key="groq-api-key")
+# Create the secret scope first:
+# databricks secrets create-scope --scope de-assistant
+# databricks secrets put --scope de-assistant --key groq-api-key
+# COMMAND ----------
+# MAGIC %md ## 2. Define the MLflow PyFunc Model
+# COMMAND ----------
+import sys
+sys.path.insert(0, "/Workspace/Repos/your-repo/de-assistant")  # adjust to your repo path
+from rag import DataEngineeringRAG
+from agent import DataEngineeringAgent, DEAgentPyFunc
+class DEKnowledgeAssistant(mlflow.pyfunc.PythonModel):
+    """
+    MLflow PyFunc wrapper that:
+    1. Loads the PDF → builds ChromaDB vectors on model load
+    2. Exposes a predict() method compatible with Databricks Model Serving
+    3. Supports chat history for multi-turn conversations
+    """
+    def load_context(self, context: mlflow.pyfunc.PythonModelContext):
+        """Called once when the model is loaded into serving."""
+        import os
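+        # Read config from the serving environment only: notebook globals such as
+        # GROQ_API_KEY may otherwise be pickled into the logged model.
+        pdf_path = context.artifacts["pdf_path"]
+        groq_key = os.environ.get("GROQ_API_KEY", "")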
+        self.rag = DataEngineeringRAG(pdf_path=pdf_path, groq_api_key=groq_key)
+        self.rag.initialize()
+        self.agent = DataEngineeringAgent(rag=self.rag, groq_api_key=groq_key)
+        print("✅ DE Knowledge Agent loaded and ready")
+    def predict(
+        self,
+        context: mlflow.pyfunc.PythonModelContext,
+        model_input: pd.DataFrame,
+        params: dict = None,
+    ) -> pd.Series:
+        """
+        Input DataFrame columns:
+          - message (str): user question
+          - history (str, JSON): previous conversation turns
+        Returns: pd.Series of string responses
+        """
+        import json
+        def process_row(row):
+            history = []
+            if row.get("history"):
+                try:
+                    history = json.loads(row["history"])
+                except Exception:
+                    history = []
+            return self.agent.invoke(message=row["message"], history=history)
+        return model_input.apply(process_row, axis=1)
+# COMMAND ----------
+# MAGIC %md ## 3. Log the model to MLflow
+# COMMAND ----------
+mlflow.set_experiment(EXPERIMENT_NAME)
+# Example input/output for signature inference
+sample_input = pd.DataFrame([{
+    "message": "What is the Medallion architecture?",
+    "history": "[]",
+}])
+with mlflow.start_run(run_name="de_knowledge_agent_v1") as run:
+    # Log hyperparameters
+    mlflow.log_params({
+        "llm_model": "llama-3.1-8b-instant",
+        "embedding_model": "all-MiniLM-L6-v2",
+        "chunk_size": 800,
+        "chunk_overlap": 160,
+        "retrieval_strategy": "mmr",
+        "top_k": 5,
+    })
+    # Infer signature from sample data
+    model = DEKnowledgeAssistant()
+    signature = infer_signature(
+        model_input=sample_input,
+        model_output=pd.Series(["Sample response from DE agent"]),
+    )
+    # Log the model
+    mlflow.pyfunc.log_model(
+        artifact_path="de_agent",
+        python_model=model,
+        artifacts={"pdf_path": PDF_VOLUME_PATH},
+        signature=signature,
+        pip_requirements=[
+            "groq>=0.9.0",
+            "langchain>=0.2.0",
+            "langchain-community>=0.2.0",
+            "chromadb>=0.5.0",
+            "sentence-transformers>=3.0.0",
+            "pypdf>=4.0.0",
+            "fastapi>=0.111.0",
+            "uvicorn>=0.30.0",
+        ],
+        registered_model_name=MODEL_NAME,
+    )
+    print(f"✅ Model logged — Run ID: {run.info.run_id}")
+# COMMAND ----------
+# MAGIC %md ## 4. Register and deploy to Model Serving
+# COMMAND ----------
+from mlflow.tracking import MlflowClient
+client = MlflowClient()
+# Get the latest version
+latest = client.get_latest_versions(MODEL_NAME, stages=["None"])[0]
+version = latest.version
+print(f"Latest model version: {version}")
+# Transition to Production
+client.transition_model_version_stage(
+    name=MODEL_NAME,
+    version=version,
+    stage="Production",
+    archive_existing_versions=True,
+)
+print(f"✅ Model v{version} promoted to Production")
+# COMMAND ----------
+# MAGIC %md
+# MAGIC ## 5. Create a Databricks Model Serving endpoint
+# MAGIC
+# MAGIC Run this via the Databricks SDK or UI:
+# MAGIC
+# MAGIC **UI path**: Machine Learning → Serving → Create Serving Endpoint
+# MAGIC   - Name: `de-knowledge-assistant`
+# MAGIC   - Model: `de_knowledge_agent` (Production)
+# MAGIC   - Compute: Small (CPU) — sufficient for this workload
+# MAGIC   - Environment variables: `GROQ_API_KEY` = your Groq key
+# COMMAND ----------
+# (Optional) SDK deployment
+try:
+    from databricks.sdk import WorkspaceClient
+    from databricks.sdk.service.serving import (
+        EndpointCoreConfigInput,
+        ServedModelInput,
+        ServedModelInputWorkloadSize,
+    )
+    w = WorkspaceClient()
+    endpoint_config = EndpointCoreConfigInput(
+        name="de-knowledge-assistant",
+        served_models=[
+            ServedModelInput(
+                model_name=MODEL_NAME,
+                model_version=str(version),
+                workload_size=ServedModelInputWorkloadSize.SMALL,
+                scale_to_zero_enabled=True,   # cost-saving: scale down when idle
+                environment_vars={"GROQ_API_KEY": "{{secrets/de-assistant/groq-api-key}}"},
+            )
+        ],
+    )
+    w.serving_endpoints.create(name="de-knowledge-assistant", config=endpoint_config)
+    print("✅ Serving endpoint created — check Databricks UI for status")
+except ImportError:
+    print("databricks-sdk not installed — create the endpoint via Databricks UI instead")
+# COMMAND ----------
+# MAGIC %md
+# MAGIC ## 6. Test the endpoint
+# COMMAND ----------
+import requests
+import json
+ENDPOINT_URL = "https://<your-workspace>.azuredatabricks.net/serving-endpoints/de-knowledge-assistant/invocations"
+TOKEN        = dbutils.notebook.entry_point.getDbutils().notebook().getContext().apiToken().get()
+test_payload = {
+    "dataframe_records": [
+        {
+            "message": "Explain the Medallion architecture and give a PySpark example",
+            "history": "[]",
+        }
+    ]
+}
+response = requests.post(
+    ENDPOINT_URL,
+    headers={"Authorization": f"Bearer {TOKEN}", "Content-Type": "application/json"},
+    data=json.dumps(test_payload),
+    timeout=60,
+)
+print("Status:", response.status_code)
+print("Response:", response.json()["predictions"][0][:500])
+# COMMAND ----------
+# MAGIC %md
+# MAGIC ## 7. Connect the FastAPI PWA to your Databricks endpoint
+# MAGIC
+# MAGIC Add an alternative invoke method to `agent.py`, then call it from `app.py` in place of the Groq call:
+# MAGIC
+# MAGIC ```python
+# MAGIC # In agent.py, add this alternative invoke method:
+# MAGIC def invoke_via_databricks(self, message: str, history: list) -> str:
+# MAGIC     import requests, json
+# MAGIC     payload = {"dataframe_records": [{"message": message, "history": json.dumps(history)}]}
+# MAGIC     r = requests.post(
+# MAGIC         os.environ["DATABRICKS_ENDPOINT_URL"],
+# MAGIC         headers={"Authorization": f"Bearer {os.environ['DATABRICKS_TOKEN']}"},
+# MAGIC         json=payload, timeout=30,
+# MAGIC     )
+# MAGIC     return r.json()["predictions"][0]
+# MAGIC ```
+# MAGIC
+# MAGIC Set `DATABRICKS_ENDPOINT_URL` and `DATABRICKS_TOKEN` in your Hugging Face Spaces secrets.
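
Reviewer sketch: both the endpoint test and `invoke_via_databricks` build the same `dataframe_records` payload, with `history` sent as a JSON string, and read the answer from `predictions[0]`. A small pair of helpers could keep that shape in one place. The names `build_serving_payload` and `extract_answer` are hypothetical, and the response shape is assumed from the notebook's own test cell rather than verified against a live endpoint.

```python
import json
from typing import Dict, List, Optional


def build_serving_payload(message: str, history: Optional[List[Dict]] = None) -> Dict:
    """Build the request body used by the notebook's endpoint test."""
    return {
        "dataframe_records": [
            # history is JSON-encoded because the model's predict() json.loads() it
            {"message": message, "history": json.dumps(history or [])}
        ]
    }


def extract_answer(body: Dict) -> str:
    """Return the first prediction, or raise if the response has none."""
    predictions = body.get("predictions") or []
    if not predictions:
        raise ValueError("serving response contained no predictions")
    return str(predictions[0])


if __name__ == "__main__":
    payload = build_serving_payload("What is the Medallion architecture?")
    print(json.dumps(payload, indent=2))
```

Raising on an empty `predictions` list surfaces endpoint errors explicitly instead of failing later with a `KeyError` or `IndexError`.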

app.py ADDED

	@@ -0,0 +1,190 @@

+"""
+Data Engineering Knowledge Assistant — FastAPI Server
+━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
+Serves:
+  POST /api/chat        → streaming SSE chat
+  GET  /api/health      → readiness probe
+  POST /api/search      → raw vector search
+  *    /                → PWA frontend (static/)
+Deploy targets (all free):
+  • Local            : python app.py
+  • Hugging Face     : Set GROQ_API_KEY in Spaces secrets, port 7860
+  • Databricks       : See databricks/agent_notebook.py
+"""
+from __future__ import annotations
+import os
+import json
+from contextlib import asynccontextmanager
+from typing import List, Optional
+from fastapi import FastAPI, HTTPException
+from fastapi.middleware.cors import CORSMiddleware
+from fastapi.responses import StreamingResponse
+from fastapi.staticfiles import StaticFiles
+from pydantic import BaseModel
+# ──────────────────────────────────────────────────────────────────────────────
+# Global state
+# ──────────────────────────────────────────────────────────────────────────────
+rag_pipeline = None
+agent = None
+# ──────────────────────────────────────────────────────────────────────────────
+# Lifespan — init on startup
+# ──────────────────────────────────────────────────────────────────────────────
+@asynccontextmanager
+async def lifespan(app: FastAPI):
+    global rag_pipeline, agent
+    from rag import DataEngineeringRAG
+    from agent import DataEngineeringAgent
+    pdf_path = os.environ.get(
+        "PDF_PATH", "knowledge/data_engineering_patterns.pdf"
+    )
+    groq_key = os.environ.get("GROQ_API_KEY", "")
+    if not groq_key:
+        print(
+            "⚠️  GROQ_API_KEY not set — get a free key at https://console.groq.com"
+        )
+    print("🚀 Starting Data Engineering Knowledge Assistant …")
+    rag_pipeline = DataEngineeringRAG(pdf_path=pdf_path, groq_api_key=groq_key)
+    rag_pipeline.initialize()
+    agent = DataEngineeringAgent(rag=rag_pipeline, groq_api_key=groq_key)
+    print("✅ Agent ready — listening for requests")
+    yield
+    print("👋 Shutting down")
+# ──────────────────────────────────────────────────────────────────────────────
+# App
+# ──────────────────────────────────────────────────────────────────────────────
+app = FastAPI(
+    title="DE Knowledge Assistant",
+    description="Low-latency Databricks-style RAG agent for Data Engineering",
+    version="1.0.0",
+    lifespan=lifespan,
+)
+app.add_middleware(
+    CORSMiddleware,
+    allow_origins=["*"],
+    allow_methods=["*"],
+    allow_headers=["*"],
+)
+# ──────────────────────────────────────────────────────────────────────────────
+# Schemas
+# ──────────────────────────────────────────────────────────────────────────────
+class ChatMessage(BaseModel):
+    role: str         # "user" | "assistant"
+    content: str
+class ChatRequest(BaseModel):
+    message: str
+    history: Optional[List[ChatMessage]] = []
+    stream: bool = True
+class SearchRequest(BaseModel):
+    query: str
+    k: int = 5
+# ──────────────────────────────────────────────────────────────────────────────
+# Routes
+# ──────────────────────────────────────────────────────────────────────────────
+@app.get("/api/health")
+async def health():
+    return {
+        "status": "healthy",
+        "model": "llama-3.1-8b-instant (Groq)",
+        "vectorstore_docs": rag_pipeline.get_doc_count() if rag_pipeline else 0,
+        "agent_type": "Databricks-compatible MLflow Agent",
+        "version": "1.0.0",
+    }
+@app.post("/api/chat")
+async def chat(req: ChatRequest):
+    """
+    Chat endpoint.
+    • stream=true  → Server-Sent Events (SSE) — lowest perceived latency
+    • stream=false → JSON response (simpler, for testing)
+    """
+    if not agent:
+        raise HTTPException(503, "Agent not initialised — check server logs")
+    history = [m.model_dump() for m in (req.history or [])]
+    if req.stream:
+        async def event_stream():
+            try:
+                async for chunk in agent.astream(message=req.message, history=history):
+                    payload = json.dumps({"chunk": chunk})
+                    yield f"data: {payload}\n\n"
+                yield "data: [DONE]\n\n"
+            except Exception as exc:
+                err = json.dumps({"error": str(exc)})
+                yield f"data: {err}\n\n"
+        return StreamingResponse(
+            event_stream(),
+            media_type="text/event-stream",
+            headers={
+                "Cache-Control": "no-cache",
+                "X-Accel-Buffering": "no",   # disable nginx buffering
+            },
+        )
+    else:
+        response = agent.invoke(message=req.message, history=history)
+        return {"response": response}
+@app.post("/api/search")
+async def search(req: SearchRequest):
+    """Raw semantic search — useful for debugging retrieval quality."""
+    if not rag_pipeline:
+        raise HTTPException(503, "RAG not initialised")
+    results = rag_pipeline.search(req.query, k=req.k)
+    return {"query": req.query, "results": results}
+# ──────────────────────────────────────────────────────────────────────────────
+# Static frontend — mount LAST so API routes take priority
+# ──────────────────────────────────────────────────────────────────────────────
+app.mount("/", StaticFiles(directory="static", html=True), name="static")
+# ──────────────────────────────────────────────────────────────────────────────
+# Entry point
+# ──────────────────────────────────────────────────────────────────────────────
+if __name__ == "__main__":
+    import uvicorn
+    port = int(os.environ.get("PORT", 7860))   # 7860 = HuggingFace Spaces default
+    uvicorn.run(
+        "app:app",
+        host="0.0.0.0",
+        port=port,
+        reload=False,
+        log_level="info",
+    )
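
Note on the `/api/chat` stream format: with `stream=true` the endpoint frames each JSON payload as `data: <payload>` followed by a blank line, sends `{"error": ...}` if the agent raises, and finishes with `data: [DONE]`. The sketch below reproduces that framing and a matching line parser so the round trip can be checked without a running server. The helper names `format_sse` and `iter_chunks` are hypothetical and not part of app.py.

```python
import json
from typing import Iterable, Iterator

DONE = "[DONE]"  # sentinel the endpoint sends after the last chunk


def format_sse(payload: str) -> str:
    """Frame one payload as an SSE event: 'data: <payload>' plus a blank line."""
    return f"data: {payload}\n\n"


def iter_chunks(lines: Iterable[str]) -> Iterator[str]:
    """Yield text chunks from complete SSE lines and stop at the sentinel."""
    for line in lines:
        if not line.startswith("data: "):
            continue  # blank separators and anything that is not a data field
        payload = line[len("data: "):].strip()
        if payload == DONE:
            return
        event = json.loads(payload)
        if "error" in event:
            # Mirror the server's {"error": ...} event as an exception
            raise RuntimeError(event["error"])
        if event.get("chunk"):
            yield event["chunk"]


# Simulated server output: two chunks followed by the sentinel.
wire = "".join(
    [
        format_sse(json.dumps({"chunk": "Medallion "})),
        format_sse(json.dumps({"chunk": "architecture"})),
        format_sse(DONE),
    ]
)
answer = "".join(iter_chunks(wire.split("\n")))
print(answer)
```

Raising on the `error` event is a design choice of this sketch; a client could equally render the message inline.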

data_engineering_patterns.pdf ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:bfe5db39e0d8edc192a683ec92953e95605edf621595be85f4f5e105c80423e8
+size 8372689

index.html ADDED Viewed

	@@ -0,0 +1,555 @@

+<!DOCTYPE html>
+<html lang="en">
+<head>
+  <meta charset="UTF-8" />
+  <meta name="viewport" content="width=device-width, initial-scale=1, viewport-fit=cover" />
+  <meta name="apple-mobile-web-app-capable" content="yes" />
+  <meta name="apple-mobile-web-app-status-bar-style" content="black-translucent" />
+  <meta name="apple-mobile-web-app-title" content="DE Assistant" />
+  <meta name="theme-color" content="#1a1a2e" />
+  <link rel="manifest" href="/manifest.json" />
+  <link rel="apple-touch-icon" href="/icon-192.png" />
+  <title>DE Knowledge Assistant</title>
+  <style>
+    /* ── Reset & tokens ─────────────────────────────────────── */
+    *, *::before, *::after { box-sizing: border-box; margin: 0; padding: 0; }
+    :root {
+      --bg:        #0d0d1a;
+      --surface:   #1a1a2e;
+      --surface2:  #16213e;
+      --accent:    #e94560;
+      --accent2:   #0f3460;
+      --user-bg:   #0f3460;
+      --bot-bg:    #1a1a2e;
+      --text:      #e0e0e0;
+      --text-dim:  #888;
+      --border:    #2a2a4a;
+      --green:     #00c896;
+      --radius:    18px;
+      --safe-top:  env(safe-area-inset-top, 0px);
+      --safe-bot:  env(safe-area-inset-bottom, 0px);
+    }
+    html, body { height: 100%; font-family: -apple-system, BlinkMacSystemFont,
+      "Segoe UI", Roboto, sans-serif; background: var(--bg); color: var(--text); }
+    /* ── Layout ─────────────────────────────────────────────── */
+    #app { display: flex; flex-direction: column; height: 100dvh;
+           padding-top: var(--safe-top); }
+    /* ── Header ─────────────────────────────────────────────── */
+    header {
+      display: flex; align-items: center; gap: 12px;
+      padding: 14px 18px;
+      background: var(--surface);
+      border-bottom: 1px solid var(--border);
+      backdrop-filter: blur(20px);
+      -webkit-backdrop-filter: blur(20px);
+    }
+    .logo { width: 36px; height: 36px; border-radius: 10px;
+            background: linear-gradient(135deg, var(--accent), var(--accent2));
+            display: grid; place-items: center; font-size: 18px; flex-shrink: 0; }
+    .header-text h1 { font-size: 15px; font-weight: 700; }
+    .header-text p  { font-size: 11px; color: var(--text-dim); }
+    #status-dot { width: 8px; height: 8px; border-radius: 50%;
+                  background: #555; margin-left: auto; flex-shrink: 0; transition: background .3s; }
+    #status-dot.ready   { background: var(--green); box-shadow: 0 0 6px var(--green); }
+    #status-dot.loading { background: #f5a623; animation: pulse 1s infinite; }
+    @keyframes pulse { 0%,100%{opacity:1} 50%{opacity:.4} }
+    /* ── Messages ───────────────────────────────────────────── */
+    #messages { flex: 1; overflow-y: auto; padding: 16px 14px;
+                scroll-behavior: smooth; }
+    #messages::-webkit-scrollbar { width: 4px; }
+    #messages::-webkit-scrollbar-track { background: transparent; }
+    #messages::-webkit-scrollbar-thumb { background: var(--border); border-radius: 2px; }
+    .msg { display: flex; gap: 10px; margin-bottom: 16px; animation: fadeIn .25s ease; }
+    @keyframes fadeIn { from{opacity:0;transform:translateY(8px)} to{opacity:1;transform:none} }
+    .msg.user  { flex-direction: row-reverse; }
+    .msg.bot   { flex-direction: row; }
+    .avatar { width: 32px; height: 32px; border-radius: 50%; display: grid;
+              place-items: center; font-size: 14px; flex-shrink: 0; align-self: flex-end; }
+    .msg.user .avatar { background: var(--accent); }
+    .msg.bot  .avatar { background: var(--accent2); }
+    .bubble { max-width: min(75vw, 520px); padding: 12px 16px;
+              border-radius: var(--radius); line-height: 1.6; font-size: 14px;
+              word-wrap: break-word; }
+    .msg.user .bubble { background: var(--user-bg); border-bottom-right-radius: 4px; }
+    .msg.bot  .bubble { background: var(--bot-bg); border: 1px solid var(--border);
+                        border-bottom-left-radius: 4px; }
+    /* markdown-ish inside bubbles */
+    .bubble code { background: rgba(255,255,255,.08); padding: 2px 6px;
+                   border-radius: 4px; font-family: "SF Mono", Menlo, monospace;
+                   font-size: 12px; }
+    .bubble pre  { background: #0a0a1a; border: 1px solid var(--border);
+                   border-radius: 10px; padding: 12px; overflow-x: auto;
+                   margin: 8px 0; }
+    .bubble pre code { background: none; padding: 0; font-size: 12px; }
+    .bubble strong { color: #fff; }
+    .bubble p  { margin-bottom: 8px; }
+    .bubble p:last-child { margin-bottom: 0; }
+    .bubble ul { padding-left: 18px; }
+    .bubble li { margin-bottom: 4px; }
+    .bubble blockquote { border-left: 3px solid var(--accent);
+                         padding-left: 12px; color: var(--text-dim); margin: 8px 0; }
+    /* sources badge */
+    .sources { margin-top: 10px; display: flex; flex-wrap: wrap; gap: 6px; }
+    .source-chip { font-size: 10px; background: rgba(15,52,96,.6);
+                   border: 1px solid var(--accent2); border-radius: 20px;
+                   padding: 3px 9px; color: #a0b4d0; }
+    /* typing indicator */
+    .typing-dots span { display: inline-block; width: 6px; height: 6px;
+                         border-radius: 50%; background: var(--text-dim);
+                         margin: 0 2px; animation: bounce .9s infinite; }
+    .typing-dots span:nth-child(2) { animation-delay: .15s; }
+    .typing-dots span:nth-child(3) { animation-delay: .3s;  }
+    @keyframes bounce { 0%,60%,100%{transform:translateY(0)} 30%{transform:translateY(-6px)} }
+    /* ── Welcome card ───────────────────────────────────────── */
+    #welcome { text-align: center; padding: 40px 20px; }
+    #welcome .icon { font-size: 52px; margin-bottom: 12px; }
+    #welcome h2 { font-size: 20px; margin-bottom: 8px; }
+    #welcome p  { color: var(--text-dim); font-size: 13px; max-width: 280px; margin: 0 auto 20px; }
+    .starters { display: flex; flex-direction: column; gap: 8px; max-width: 340px; margin: 0 auto; }
+    .starter-btn { background: var(--surface); border: 1px solid var(--border);
+                   border-radius: 12px; padding: 11px 16px; color: var(--text);
+                   font-size: 13px; cursor: pointer; text-align: left; transition: border-color .2s, background .2s; }
+    .starter-btn:hover { border-color: var(--accent); background: var(--surface2); }
+    /* ── Input bar ──────────────────────────────────────────── */
+    #input-bar {
+      display: flex; align-items: flex-end; gap: 8px;
+      padding: 10px 14px calc(10px + var(--safe-bot));
+      background: var(--surface);
+      border-top: 1px solid var(--border);
+    }
+    #msg-input {
+      flex: 1; background: var(--surface2); border: 1px solid var(--border);
+      border-radius: 22px; padding: 10px 16px; color: var(--text);
+      font-size: 15px; resize: none; min-height: 44px; max-height: 120px;
+      line-height: 1.4; outline: none; font-family: inherit;
+      transition: border-color .2s;
+    }
+    #msg-input:focus { border-color: var(--accent); }
+    #msg-input::placeholder { color: var(--text-dim); }
+    .icon-btn {
+      width: 44px; height: 44px; border-radius: 50%; border: none;
+      cursor: pointer; display: grid; place-items: center;
+      font-size: 18px; flex-shrink: 0; transition: transform .15s, background .2s;
+    }
+    .icon-btn:active { transform: scale(.9); }
+    #send-btn { background: var(--accent); }
+    #send-btn:disabled { background: #555; cursor: not-allowed; }
+    #voice-btn { background: var(--surface2); border: 1px solid var(--border); }
+    #voice-btn.recording {
+      background: var(--accent) !important;
+      animation: ripple 1.2s ease-out infinite;
+    }
+    @keyframes ripple {
+      0%  { box-shadow: 0 0 0 0 rgba(233,69,96,.6); }
+      70% { box-shadow: 0 0 0 16px rgba(233,69,96,0); }
+      100%{ box-shadow: 0 0 0 0 rgba(233,69,96,0); }
+    }
+    /* ── Voice overlay ──────────────────────────────────────── */
+    #voice-overlay {
+      display: none; position: fixed; inset: 0; background: rgba(0,0,0,.8);
+      backdrop-filter: blur(10px); flex-direction: column;
+      align-items: center; justify-content: center; gap: 20px; z-index: 100;
+    }
+    #voice-overlay.active { display: flex; }
+    #voice-wave { font-size: 64px; animation: pulse 1s infinite; }
+    #voice-transcript { color: #ddd; font-size: 16px; max-width: 300px;
+                        text-align: center; min-height: 40px; }
+    #voice-cancel { background: var(--surface); border: 1px solid var(--border);
+                    color: var(--text); border-radius: 50px; padding: 12px 28px;
+                    font-size: 15px; cursor: pointer; }
+    /* ── Toast ──────────────────────────────────────────────── */
+    #toast { position: fixed; bottom: calc(90px + var(--safe-bot)); left: 50%;
+             transform: translateX(-50%) translateY(20px);
+             background: #222; border: 1px solid var(--border); border-radius: 20px;
+             padding: 8px 18px; font-size: 13px; opacity: 0; pointer-events: none;
+             transition: opacity .3s, transform .3s; white-space: nowrap; z-index: 50; }
+    #toast.show { opacity: 1; transform: translateX(-50%) translateY(0); }
+  </style>
+</head>
+<body>
+<div id="app">
+  <!-- Header -->
+  <header>
+    <div class="logo">🗄️</div>
+    <div class="header-text">
+      <h1>DE Knowledge Assistant</h1>
+      <p id="header-sub">Connecting…</p>
+    </div>
+    <div id="status-dot" title="Agent status"></div>
+  </header>
+  <!-- Messages -->
+  <div id="messages">
+    <div id="welcome">
+      <div class="icon">⚡</div>
+      <h2>Ask me anything about<br>Data Engineering</h2>
+      <p>Powered by "Data Engineering Design Patterns" + Groq's ultra-fast inference</p>
+      <div class="starters">
+        <button class="starter-btn" onclick="sendStarter(this)">What is the Medallion architecture and when should I use it?</button>
+        <button class="starter-btn" onclick="sendStarter(this)">Show me a PySpark CDC (Change Data Capture) example</button>
+        <button class="starter-btn" onclick="sendStarter(this)">Compare Lambda vs Kappa architecture</button>
+        <button class="starter-btn" onclick="sendStarter(this)">How do I handle late-arriving data in streaming pipelines?</button>
+      </div>
+    </div>
+  </div>
+  <!-- Input bar -->
+  <div id="input-bar">
+    <button id="voice-btn" class="icon-btn" onclick="toggleVoice()" title="Voice input">🎤</button>
+    <textarea id="msg-input" rows="1" placeholder="Ask a data engineering question…"
+      onkeydown="handleKey(event)" oninput="autoResize(this)"></textarea>
+    <button id="send-btn" class="icon-btn" onclick="sendMessage()" title="Send">➤</button>
+  </div>
+</div>
+<!-- Voice overlay -->
+<div id="voice-overlay">
+  <div id="voice-wave">🎙️</div>
+  <div id="voice-transcript">Listening…</div>
+  <button id="voice-cancel" onclick="stopVoice()">✕ Cancel</button>
+</div>
+<!-- Toast -->
+<div id="toast"></div>
+<script>
+// ── State ────────────────────────────────────────────────────────────────────
+const state = {
+  history: [],
+  isLoading: false,
+  recognition: null,
+  synthesis: window.speechSynthesis || null,
+  voiceActive: false,
+  currentUtterance: null,
+};
+const BASE_URL = window.location.origin;
+// ── DOM helpers ──────────────────────────────────────────────────────────────
+const $ = id => document.getElementById(id);
+const msgInput    = $('msg-input');
+const messages    = $('messages');
+const sendBtn     = $('send-btn');
+const voiceBtn    = $('voice-btn');
+const statusDot   = $('status-dot');
+const headerSub   = $('header-sub');
+const welcome     = $('welcome');
+const voiceOverlay = $('voice-overlay');
+const voiceTranscript = $('voice-transcript');
+// ── Health check ─────────────────────────────────────────────────────────────
+async function checkHealth() {
+  try {
+    const r = await fetch(`${BASE_URL}/api/health`);
+    const data = await r.json();
+    statusDot.className = 'ready';
+    headerSub.textContent = `${data.vectorstore_docs.toLocaleString()} chunks · Groq`;
+  } catch {
+    statusDot.className = 'loading';
+    headerSub.textContent = 'Agent initialising…';
+    setTimeout(checkHealth, 3000);
+  }
+}
+checkHealth();
+// ── Markdown renderer (no deps) ───────────────────────────────────────────────
+function renderMarkdown(text) {
+  // Escape first so model output cannot inject raw HTML via innerHTML
+  return escHtml(text)
+    // code blocks
+    .replace(/```(\w*)\n?([\s\S]*?)```/g, (_, lang, code) =>
+      `<pre><code class="lang-${lang}">${code.trim()}</code></pre>`)
+    // inline code
+    .replace(/`([^`]+)`/g, '<code>$1</code>')
+    // bold
+    .replace(/\*\*(.+?)\*\*/g, '<strong>$1</strong>')
+    // italic
+    .replace(/\*(.+?)\*/g, '<em>$1</em>')
+    // headings
+    .replace(/^### (.+)$/gm, '<p><strong>$1</strong></p>')
+    .replace(/^## (.+)$/gm,  '<p><strong>$1</strong></p>')
+    // bullets and numbered items
+    .replace(/^[-•] (.+)$/gm, '<li>$1</li>')
+    .replace(/^\d+\. (.+)$/gm, '<li>$1</li>')
+    // wrap each run of consecutive items in a single list
+    .replace(/(?:<li>.*<\/li>(?:\n(?=<li>))?)+/g, run => `<ul>${run}</ul>`)
+    // emoji-prefixed tips
+    .replace(/^(💡.+)$/gm, '<blockquote>$1</blockquote>')
+    // newlines → paragraphs
+    .replace(/\n\n+/g, '</p><p>')
+    .replace(/^(?!<)(.+)/gm, '$1')
+    .replace(/^<p><\/p>$|^<\/p><p>$/gm, '')
+    .trim();
+}
+function escHtml(s) {
+  return s.replace(/&/g,'&amp;').replace(/</g,'&lt;').replace(/>/g,'&gt;');
+}
+// ── Message rendering ─────────────────────────────────────────────────────────
+function appendMessage(role, content, streaming = false) {
+  $('welcome')?.remove();
+  const wrap = document.createElement('div');
+  wrap.className = `msg ${role}`;
+  wrap.innerHTML = `
+    <div class="avatar">${role === 'user' ? '🧑' : '🤖'}</div>
+    <div class="bubble" id="bubble-${Date.now()}">${
+      role === 'user' ? escHtml(content) : renderMarkdown(content)
+    }</div>`;
+  messages.appendChild(wrap);
+  messages.scrollTop = messages.scrollHeight;
+  return wrap.querySelector('.bubble');
+}
+function showTyping() {
+  if ($('welcome')) $('welcome').remove();
+  const wrap = document.createElement('div');
+  wrap.className = 'msg bot';
+  wrap.id = 'typing-indicator';
+  wrap.innerHTML = `
+    <div class="avatar">🤖</div>
+    <div class="bubble"><div class="typing-dots"><span></span><span></span><span></span></div></div>`;
+  messages.appendChild(wrap);
+  messages.scrollTop = messages.scrollHeight;
+}
+function removeTyping() { $('typing-indicator')?.remove(); }
+// ── Send flow ─────────────────────────────────────────────────────────────────
+async function sendMessage(text) {
+  const msg = (text || msgInput.value).trim();
+  if (!msg || state.isLoading) return;
+  msgInput.value = '';
+  autoResize(msgInput);
+  state.isLoading = true;
+  sendBtn.disabled = true;
+  statusDot.className = 'loading';
+  appendMessage('user', msg);
+  showTyping();
+  // Stop any ongoing TTS
+  state.synthesis?.cancel();
+  try {
+    const res = await fetch(`${BASE_URL}/api/chat`, {
+      method: 'POST',
+      headers: { 'Content-Type': 'application/json' },
+      body: JSON.stringify({ message: msg, history: state.history, stream: true }),
+    });
+    removeTyping();
+    if (!res.ok) throw new Error(`Server error ${res.status}`);
+    const bubble = appendMessage('bot', '');
+    let full = '';
+    // Server-Sent Events streaming
+    const reader = res.body.getReader();
+    const decoder = new TextDecoder();
+    let buffer = '';
+    while (true) {
+      const { done, value } = await reader.read();
+      if (done) break;
+      buffer += decoder.decode(value, { stream: true });
+      const lines = buffer.split('\n');
+      buffer = lines.pop();
+      for (const line of lines) {
+        if (!line.startsWith('data: ')) continue;
+        const payload = line.slice(6).trim();
+        if (payload === '[DONE]') break;
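+        let event;
+        try { event = JSON.parse(payload); } catch { continue; /* skip malformed */ }
+        // The server reports mid-stream failures as {"error": ...}; surface them
+        if (event.error) throw new Error(event.error);
+        if (event.chunk) {
+          full += event.chunk;
+          bubble.innerHTML = renderMarkdown(full);
+          messages.scrollTop = messages.scrollHeight;
+        }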
+      }
+    }
+    // Update history
+    state.history.push({ role: 'user', content: msg });
+    state.history.push({ role: 'assistant', content: full });
+    if (state.history.length > 12) state.history = state.history.slice(-12);
+    // Auto-speak response if voice mode was used
+    if (state.voiceActive && state.synthesis) {
+      speakText(stripMarkdown(full));
+    }
+  } catch (err) {
+    removeTyping();
+    appendMessage('bot', `⚠️ Error: ${err.message}. Check that the server is running and GROQ_API_KEY is set.`);
+  } finally {
+    state.isLoading = false;
+    sendBtn.disabled = false;
+    statusDot.className = 'ready';
+  }
+}
+function sendStarter(btn) { sendMessage(btn.textContent.trim()); }
+// ── Text-to-Speech ────────────────────────────────────────────────────────────
+function speakText(text) {
+  if (!state.synthesis) return;
+  state.synthesis.cancel();
+  const utt = new SpeechSynthesisUtterance(text.slice(0, 800)); // limit TTS length
+  utt.rate = 1.05;
+  utt.pitch = 1;
+  // Prefer a natural English voice
+  const voices = state.synthesis.getVoices();
+  const preferred = voices.find(v => v.lang.startsWith('en') && v.localService)
+                 || voices.find(v => v.lang.startsWith('en'))
+                 || voices[0];
+  if (preferred) utt.voice = preferred;
+  state.synthesis.speak(utt);
+}
+function stripMarkdown(text) {
+  return text.replace(/```[\s\S]*?```/g, 'code block')
+             .replace(/`([^`]+)`/g, '$1')
+             .replace(/\*\*(.+?)\*\*/g, '$1')
+             .replace(/\*(.+?)\*/g, '$1')
+             .replace(/^#+\s/gm, '')
+             .replace(/^[-•] /gm, '')
+             .replace(/💡/g, 'Pro tip:');
+}
+// ── Voice Input ───────────────────────────────────────────────────────────────
+function setupSpeechRecognition() {
+  const SR = window.SpeechRecognition || window.webkitSpeechRecognition;
+  if (!SR) return null;
+  const rec = new SR();
+  rec.continuous = false;
+  rec.interimResults = true;
+  rec.lang = 'en-US';
+  rec.onstart = () => {
+    voiceOverlay.classList.add('active');
+    voiceBtn.classList.add('recording');
+    voiceTranscript.textContent = 'Listening…';
+  };
+  rec.onresult = e => {
+    let interim = '', final = '';
+    for (let r of e.results) {
+      if (r.isFinal) final += r[0].transcript;
+      else interim += r[0].transcript;
+    }
+    voiceTranscript.textContent = final || interim || 'Listening…';
+    if (final) msgInput.value = final;
+  };
+  rec.onerror = err => {
+    stopVoice();
+    if (err.error === 'not-allowed') toast('🎤 Microphone permission denied');
+    else toast(`Voice error: ${err.error}`);
+  };
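+  rec.onend = async () => {
+    voiceOverlay.classList.remove('active');
+    voiceBtn.classList.remove('recording');
+    const text = msgInput.value.trim();
+    // Keep voiceActive set while the reply streams so sendMessage() can auto-speak it
+    try { if (text) await sendMessage(text); }
+    finally { state.voiceActive = false; }
+  };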
+  return rec;
+}
+function toggleVoice() {
+  if (state.voiceActive) { stopVoice(); return; }
+  if (!state.recognition) {
+    state.recognition = setupSpeechRecognition();
+  }
+  if (!state.recognition) {
+    toast('🎤 Voice not supported in this browser. Try Chrome on Android or Safari on iOS 14.5+.');
+    return;
+  }
+  state.voiceActive = true;
+  try { state.recognition.start(); }
+  catch { state.voiceActive = false; toast('Could not start microphone'); }
+}
+function stopVoice() {
+  state.voiceActive = false;
+  state.recognition?.abort();
+  voiceOverlay.classList.remove('active');
+  voiceBtn.classList.remove('recording');
+}
+// ── UI helpers ────────────────────────────────────────────────────────────────
+function autoResize(el) {
+  el.style.height = 'auto';
+  el.style.height = Math.min(el.scrollHeight, 120) + 'px';
+}
+function handleKey(e) {
+  if (e.key === 'Enter' && !e.shiftKey) {
+    e.preventDefault();
+    sendMessage();
+  }
+}
+function toast(msg, duration = 3000) {
+  const t = $('toast');
+  t.textContent = msg;
+  t.classList.add('show');
+  setTimeout(() => t.classList.remove('show'), duration);
+}
+// ── PWA install banner ────────────────────────────────────────────────────────
+let deferredPrompt;
+window.addEventListener('beforeinstallprompt', e => {
+  e.preventDefault();
+  deferredPrompt = e;
+  setTimeout(() => {
+    toast('📲 Add to Home Screen for the best experience!', 5000);
+  }, 3000);
+});
+// ── Service worker ────────────────────────────────────────────────────────────
+if ('serviceWorker' in navigator) {
+  navigator.serviceWorker.register('/sw.js').catch(() => {});
+}
+// Load voices async (required for some browsers)
+if (state.synthesis) {
+  state.synthesis.onvoiceschanged = () => state.synthesis.getVoices();
+  state.synthesis.getVoices();
+}
+</script>
+</body>
+</html>
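
Note on the streaming loop in `sendMessage()`: network reads can end anywhere, including in the middle of a line or of a multi-byte character, which is why the loop decodes with `{ stream: true }` and keeps the last, possibly incomplete, line in `buffer`. The following is a minimal Python sketch of the same reassembly logic, useful for testing it in isolation. The class name `LineBuffer` is hypothetical.

```python
import codecs
from typing import List


class LineBuffer:
    """Reassemble complete lines from byte chunks that may split anywhere."""

    def __init__(self) -> None:
        # An incremental decoder holds back incomplete multi-byte sequences,
        # comparable to TextDecoder.decode(value, {stream: true}).
        self._decoder = codecs.getincrementaldecoder("utf-8")()
        self._pending = ""

    def feed(self, data: bytes) -> List[str]:
        """Return the lines completed by this chunk; keep the tail for later."""
        self._pending += self._decoder.decode(data)
        *complete, self._pending = self._pending.split("\n")
        return complete


raw = 'data: {"chunk": "café"}\n\ndata: [DONE]\n\n'.encode("utf-8")
buf = LineBuffer()
lines: List[str] = []
# Feed 7 bytes at a time: this cuts lines mid-way and splits the two-byte "é".
for start in range(0, len(raw), 7):
    lines.extend(buf.feed(raw[start:start + 7]))

data_lines = [line for line in lines if line.startswith("data: ")]
print(data_lines)
```

Decoding each chunk independently instead would fail or produce replacement characters whenever a read boundary falls inside a character.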

manifest.json ADDED Viewed

	@@ -0,0 +1,37 @@

+{
+  "name": "DE Knowledge Assistant",
+  "short_name": "DE Assistant",
+  "description": "Low-latency AI agent for Data Engineering Design Patterns — voice-enabled, Databricks-compatible",
+  "start_url": "/",
+  "display": "standalone",
+  "background_color": "#0d0d1a",
+  "theme_color": "#1a1a2e",
+  "orientation": "portrait-primary",
+  "scope": "/",
+  "lang": "en",
+  "categories": ["productivity", "education", "developer tools"],
+  "icons": [
+    {
+      "src": "data:image/svg+xml,<svg xmlns='http://www.w3.org/2000/svg' viewBox='0 0 192 192'><rect width='192' height='192' rx='40' fill='%231a1a2e'/><text x='50%25' y='55%25' font-size='100' text-anchor='middle' dominant-baseline='middle'>🗄️</text></svg>",
+      "sizes": "192x192",
+      "type": "image/svg+xml",
+      "purpose": "any maskable"
+    },
+    {
+      "src": "data:image/svg+xml,<svg xmlns='http://www.w3.org/2000/svg' viewBox='0 0 512 512'><rect width='512' height='512' rx='100' fill='%231a1a2e'/><text x='50%25' y='55%25' font-size='280' text-anchor='middle' dominant-baseline='middle'>🗄️</text></svg>",
+      "sizes": "512x512",
+      "type": "image/svg+xml",
+      "purpose": "any maskable"
+    }
+  ],
+  "screenshots": [],
+  "shortcuts": [
+    {
+      "name": "Ask a Question",
+      "short_name": "Ask",
+      "description": "Open the chat interface",
+      "url": "/"
+    }
+  ],
+  "prefer_related_applications": false
+}

rag.py ADDED Viewed

	@@ -0,0 +1,170 @@

+"""
+RAG Pipeline — Data Engineering Knowledge Assistant
+━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
+Strategy : PDF → chunked → HuggingFace MiniLM embeddings → ChromaDB (in-memory)
+LLM      : Groq llama-3.1-8b-instant  (sub-500ms response, free tier)
+Compat   : Works standalone OR registered as an MLflow PyFunc model on Databricks
+"""
+from __future__ import annotations
+import os
+from pathlib import Path
+from typing import List, Dict
+# ──────────────────────────────────────────────────────────────────────────────
+# Core RAG class
+# ──────────────────────────────────────────────────────────────────────────────
+class DataEngineeringRAG:
+    """
+    Retrieval-Augmented Generation pipeline tuned for data-engineering content.
+    Usage (standalone):
+        rag = DataEngineeringRAG(pdf_path="knowledge/data_engineering_patterns.pdf",
+                                 groq_api_key=os.environ["GROQ_API_KEY"])
+        rag.initialize()
+        print(rag.search("What is the Lambda architecture?"))
+    Usage (Databricks):
+        Register via mlflow.pyfunc.log_model — see databricks/agent_notebook.py
+    """
+    def __init__(self, pdf_path: str, groq_api_key: str):
+        self.pdf_path = Path(pdf_path)
+        self.groq_api_key = groq_api_key
+        self.vectorstore = None
+        self.retriever = None
+        self._doc_count = 0
+        self._initialized = False
+    # ── public ────────────────────────────────────────────────────────────────
+    def initialize(self) -> None:
+        """Load PDF → embed → store. Safe to call multiple times (idempotent)."""
+        if self._initialized:
+            return
+        if not self.pdf_path.exists():
+            print(f"⚠️  PDF not found at '{self.pdf_path}' — running in demo mode.")
+            self._demo_mode()
+            return
+        self._build_vectorstore()
+        self._initialized = True
+    def search(self, query: str, k: int = 5) -> List[Dict]:
+        """Return ranked chunks relevant to *query*."""
+        if not self.vectorstore:
+            return []
+        docs_scores = self.vectorstore.similarity_search_with_score(query, k=k)
+        return [
+            {
+                "content": doc.page_content,
+                "source": doc.metadata.get("source", "pdf"),
+                "page": doc.metadata.get("page", 0),
+                "score": round(1 - float(score) / 2, 4),   # Chroma's default squared-L2 distance on unit vectors → cosine similarity
+            }
+            for doc, score in docs_scores
+        ]
+    def get_retriever(self):
+        return self.retriever
+    def get_doc_count(self) -> int:
+        return self._doc_count
+    # ── private ───────────────────────────────────────────────────────────────
+    def _build_vectorstore(self) -> None:
+        from langchain_community.document_loaders import PyPDFLoader
+        from langchain.text_splitter import RecursiveCharacterTextSplitter
+        from langchain_community.vectorstores import Chroma
+        from langchain_community.embeddings import HuggingFaceEmbeddings
+        print(f"📚 Loading '{self.pdf_path.name}' …")
+        loader = PyPDFLoader(str(self.pdf_path))
+        documents = loader.load()
+        print(f"   → {len(documents)} pages loaded")
+        # ── Chunk ──────────────────────────────────────────────────────────
+        # Smaller chunks (800 chars) with generous overlap keep context intact
+        # for technical patterns that often span several paragraphs.
+        splitter = RecursiveCharacterTextSplitter(
+            chunk_size=800,
+            chunk_overlap=160,
+            separators=["\n\n", "\n", ". ", "! ", "? ", ", ", " "],
+        )
+        chunks = splitter.split_documents(documents)
+        print(f"   → {len(chunks)} chunks created")
+        # ── Embed ──────────────────────────────────────────────────────────
+        # all-MiniLM-L6-v2 : 22 MB, CPU-friendly, strong semantic accuracy
+        print("🔢 Embedding chunks (CPU, ~30–60 s on first run) …")
+        embeddings = HuggingFaceEmbeddings(
+            model_name="sentence-transformers/all-MiniLM-L6-v2",
+            model_kwargs={"device": "cpu"},
+            encode_kwargs={"normalize_embeddings": True},
+        )
+        # ── Store ──────────────────────────────────────────────────────────
+        # Chroma in-memory — no disk I/O, works on HF Spaces free tier
+        self.vectorstore = Chroma.from_documents(
+            documents=chunks,
+            embedding=embeddings,
+            collection_name="de_patterns",
+        )
+        # MMR retriever: diversity + relevance
+        self.retriever = self.vectorstore.as_retriever(
+            search_type="mmr",
+            search_kwargs={"k": 5, "fetch_k": 20, "lambda_mult": 0.6},
+        )
+        self._doc_count = len(chunks)
+        print(f"✅ Vector store ready — {self._doc_count} chunks indexed")
+    def _demo_mode(self) -> None:
+        """Lightweight fallback when PDF is missing (useful for CI / testing)."""
+        from langchain_community.vectorstores import Chroma
+        from langchain_community.embeddings import HuggingFaceEmbeddings
+        from langchain.schema import Document
+        demo_docs = [
+            Document(
+                page_content=(
+                    "The Lambda Architecture splits processing into three layers: "
+                    "batch, speed, and serving. The batch layer reprocesses all historical "
+                    "data; the speed layer handles real-time incremental updates; the serving "
+                    "layer merges both for query."
+                ),
+                metadata={"source": "demo", "page": 0},
+            ),
+            Document(
+                page_content=(
+                    "The Kappa Architecture simplifies Lambda by removing the batch layer. "
+                    "All data flows through a single streaming path. Historical reprocessing "
+                    "is done by replaying the event log."
+                ),
+                metadata={"source": "demo", "page": 1},
+            ),
+            Document(
+                page_content=(
+                    "A Data Lakehouse combines the flexibility of a data lake with the "
+                    "structure and ACID guarantees of a data warehouse. Formats like Delta Lake, "
+                    "Apache Iceberg, and Apache Hudi implement this pattern."
+                ),
+                metadata={"source": "demo", "page": 2},
+            ),
+        ]
+        embeddings = HuggingFaceEmbeddings(
+            model_name="sentence-transformers/all-MiniLM-L6-v2",
+            model_kwargs={"device": "cpu"},
+        )
+        self.vectorstore = Chroma.from_documents(demo_docs, embedding=embeddings)
+        self.retriever = self.vectorstore.as_retriever(search_kwargs={"k": 3})
+        self._doc_count = len(demo_docs)
+        self._initialized = True
+        print("✅ Demo mode active — 3 built-in DE patterns loaded")
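
Note on the MMR retriever in rag.py: `search_type="mmr"` with `fetch_k=20`, `k=5` and `lambda_mult=0.6` first fetches the 20 nearest chunks and then picks 5 of them, scoring each candidate by relevance to the query minus its similarity to chunks already picked. The sketch below is a plain-Python illustration of that selection rule under the assumption of unit-length vectors. It is not LangChain's implementation, and `mmr` and `unit` are hypothetical helpers.

```python
import math
from typing import List, Sequence


def unit(degrees: float) -> List[float]:
    """2-D unit vector at the given angle, standing in for an embedding."""
    rad = math.radians(degrees)
    return [math.cos(rad), math.sin(rad)]


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Dot product; equals cosine similarity for unit vectors."""
    return sum(x * y for x, y in zip(a, b))


def mmr(query: Sequence[float], candidates: List[Sequence[float]],
        k: int, lambda_mult: float) -> List[int]:
    """Pick k candidate indices, trading relevance against redundancy."""
    selected: List[int] = []
    remaining = list(range(len(candidates)))
    while remaining and len(selected) < k:
        best, best_score = remaining[0], float("-inf")
        for i in remaining:
            relevance = dot(query, candidates[i])
            # Highest similarity to anything already selected (0 if none yet)
            redundancy = max(
                (dot(candidates[i], candidates[j]) for j in selected),
                default=0.0,
            )
            score = lambda_mult * relevance - (1 - lambda_mult) * redundancy
            if score > best_score:
                best, best_score = i, score
        selected.append(best)
        remaining.remove(best)
    return selected


query = unit(0)
candidates = [unit(30), unit(35), unit(-40)]  # index 1 nearly duplicates index 0

diverse = mmr(query, candidates, k=2, lambda_mult=0.6)
greedy = mmr(query, candidates, k=2, lambda_mult=1.0)
print(diverse, greedy)  # [0, 2] [0, 1]
```

With `lambda_mult=1.0` the rule reduces to plain similarity ranking, so the near-duplicate is returned; at 0.6 the second slot goes to the less similar but non-redundant candidate.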

requirements.txt ADDED Viewed

	@@ -0,0 +1,31 @@

+# Data Engineering Knowledge Assistant — Dependencies
+# All open-source, all free tier compatible
+# ── Web framework ─────────────────────────────────────────────────────────────
+fastapi>=0.111.0
+uvicorn[standard]>=0.30.0
+python-multipart>=0.0.9
+# ── LLM (Groq — free tier, ultra-low latency) ────────────────────────────────
+groq>=0.9.0
+# ── RAG pipeline ──────────────────────────────────────────────────────────────
+langchain>=0.2.0
+langchain-community>=0.2.0
+pypdf>=4.0.0
+# ── Vector store ──────────────────────────────────────────────────────────────
+chromadb>=0.5.0
+# ── Embeddings (free, runs on CPU) ───────────────────────────────────────────
+sentence-transformers>=3.0.0
+torch>=2.0.0          # used on CPU only; HF Spaces free tier has no GPU
+transformers>=4.40.0
+# ── MLflow (Databricks compatibility) ────────────────────────────────────────
+mlflow>=2.12.0
+# ── Utilities ─────────────────────────────────────────────────────────────────
+python-dotenv>=1.0.0
+pydantic>=2.0.0
+httpx>=0.27.0         # async HTTP client for testing

setup.sh ADDED

	@@ -0,0 +1,56 @@

+#!/bin/bash
+# ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
+# DE Knowledge Assistant — One-command Local Setup
+# Usage: chmod +x setup.sh && ./setup.sh
+# ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
+set -e
+echo ""
+echo "🗄️  Data Engineering Knowledge Assistant Setup"
+echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━"
+echo ""
+# ── 1. Python check ───────────────────────────────────────
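+if ! command -v python3 >/dev/null 2>&1; then
+  echo "✗ python3 not found. Install Python 3 and re-run." >&2
+  exit 1
+fi
+PYTHON=$(python3 --version 2>&1 | awk '{print $2}')
+echo "✓ Python $PYTHON found"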
+# ── 2. Virtual environment ────────────────────────────────
+if [ ! -d ".venv" ]; then
+  echo "→ Creating virtual environment…"
+  python3 -m venv .venv
+fi
+source .venv/bin/activate
+echo "✓ Virtual environment activated"
+# ── 3. Install dependencies ───────────────────────────────
+echo "→ Installing dependencies (this takes ~2 min on first run)…"
+pip install -q --upgrade pip
+pip install -q -r requirements.txt
+echo "✓ Dependencies installed"
+# ── 4. Environment variables ──────────────────────────────
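+if [ ! -f ".env" ]; then
+  # .env.example may be absent; fall back to an empty template
+  if [ -f ".env.example" ]; then cp .env.example .env; else echo "GROQ_API_KEY=" > .env; fi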
+  echo ""
+  echo "⚠️  ACTION REQUIRED:"
+  echo "   Edit .env and add your free Groq API key."
+  echo "   Get one at: https://console.groq.com  (takes 30 seconds)"
+  echo ""
+  if command -v open &>/dev/null; then open https://console.groq.com; fi
+  read -p "   Press Enter after you've added your GROQ_API_KEY to .env…" -r
+fi
+echo "✓ Environment configured"
+# ── 5. Start server ───────────────────────────────────────
+echo ""
+echo "🚀 Starting DE Knowledge Assistant…"
+echo "   First run will download the embedding model (~90 MB) and index the PDF."
+echo "   This takes about 60 seconds. Later starts reuse the cached model and are much faster."
+echo ""
+echo "   Open http://localhost:8000 in your browser"
+echo "   On iPhone: open Safari → http://your-local-ip:8000 → Share → Add to Home Screen"
+echo ""
+set -a          # auto-export every variable assigned while sourcing .env
+. ./.env
+set +a
+export PORT=8000
+python app.py
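The script exports the variables from `.env` before launching `app.py`. Shell-side loading depends on word-splitting and quoting rules, so the same parsing is sometimes done in Python instead. The sketch below is a minimal stdlib-only parser. The `parse_env` helper is hypothetical and handles only `KEY=VALUE` lines, comments, and simple surrounding quotes. For real use, `python-dotenv` (already listed in `requirements.txt`) provides `load_dotenv()`.

```python
def parse_env(text: str) -> dict[str, str]:
    """Parse KEY=VALUE lines, skipping blanks, comments, and malformed lines."""
    env: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        key, value = key.strip(), value.strip()
        # Drop one pair of matching surrounding quotes, keeping inner spaces.
        if len(value) >= 2 and value[0] == value[-1] and value[0] in "\"'":
            value = value[1:-1]
        env[key] = value
    return env

if __name__ == "__main__":
    sample = '# local settings\nGROQ_API_KEY=gsk_example\nAPP_TITLE="DE Assistant"\n'
    print(parse_env(sample))
```

Values are kept as strings, and a key with nothing after `=` maps to the empty string, which lets the caller detect an unfilled `GROQ_API_KEY`.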

sw.js ADDED

	@@ -0,0 +1,52 @@

+/**
+ * Service Worker — DE Knowledge Assistant PWA
+ * Strategy: cache-first for static assets, network-only for API calls
+ * This enables "Add to Home Screen" on iOS Safari and offline shell loading
+ */
+const CACHE = 'de-assistant-v1';
+const STATIC_ASSETS = ['/', '/index.html', '/manifest.json'];
+// ── Install ──────────────────────────────────────────────────────────────────
+self.addEventListener('install', event => {
+  event.waitUntil(
+    caches.open(CACHE).then(cache => cache.addAll(STATIC_ASSETS))
+  );
+  self.skipWaiting();
+});
+// ── Activate ─────────────────────────────────────────────────────────────────
+self.addEventListener('activate', event => {
+  event.waitUntil(
+    caches.keys().then(keys =>
+      Promise.all(keys.filter(k => k !== CACHE).map(k => caches.delete(k)))
+    )
+  );
+  self.clients.claim();
+});
+// ── Fetch ─────────────────────────────────────────────────────────────────────
+self.addEventListener('fetch', event => {
+  const { request } = event;
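+  const url = new URL(request.url);
+  // Non-GET requests cannot be stored with cache.put(); leave them to the browser
+  if (request.method !== 'GET') return;
+  // API calls → always network (never cache LLM responses)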
+  if (url.pathname.startsWith('/api/')) {
+    event.respondWith(fetch(request));
+    return;
+  }
+  // Static assets → cache-first, fall back to network
+  event.respondWith(
+    caches.match(request).then(cached => {
+      if (cached) return cached;
+      return fetch(request).then(response => {
+        if (response.ok) {
+          const clone = response.clone();
+          caches.open(CACHE).then(cache => cache.put(request, clone));
+        }
+        return response;
+      });
+    }).catch(() => caches.match('/index.html'))
+  );
+});
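The fetch handler's routing reduces to a small decision table: API paths always go to the network, other paths are served from the cache when present, and a failed network fetch falls back to the cached shell. The sketch below models that logic as a plain Python function to make the branches explicit. It is an illustration only: `handle_fetch`, the dict-based cache, and the `(status, body)` tuples are assumptions, not the Service Worker API.

```python
from typing import Callable, Optional

Response = tuple[int, str]  # (status code, body): a simplified stand-in

def handle_fetch(
    path: str,
    cache: dict[str, Response],
    fetch: Callable[[str], Response],
) -> Optional[Response]:
    """Model of the routing: network-only for /api/, cache-first otherwise."""
    if path.startswith("/api/"):
        return fetch(path)  # never cached; network errors propagate
    if path in cache:
        return cache[path]  # cache hit: no network round trip
    try:
        response = fetch(path)
    except ConnectionError:
        return cache.get("/index.html")  # offline: fall back to the app shell
    if 200 <= response[0] < 300:
        cache[path] = response  # store successful responses for next time
    return response

if __name__ == "__main__":
    cache = {"/index.html": (200, "shell")}
    print(handle_fetch("/manifest.json", cache, lambda p: (200, "net:" + p)))
```

Error responses such as 404 are returned to the caller but never stored, mirroring the `response.ok` check in the handler.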