a-zamfir committed on
Commit f26de06 · 1 Parent(s): d3bbc38

initial atlas commit
.env ADDED
@@ -0,0 +1,5 @@
+ # Auto-select best provider
+ LLM_PROVIDER=huggingface
+ AUDIO_PROVIDER=huggingface
+
+ # Hugging Face and Nebius setup: add your keys here.
.gitignore ADDED
@@ -0,0 +1,6 @@
+ __pycache__
+ *.pyc
+ temp
+ .venv
+ auth_api
+ auth_api.py
README.md CHANGED
@@ -8,7 +8,138 @@ sdk_version: 6.0.1
  app_file: app.py
  pinned: false
  license: apache-2.0
- short_description: Atlas is a general usage assistant
+ short_description: ATLAS - Gradio x HuggingFace Hackathon
  ---

- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
+ # ATLAS
+
+ ## Important
+ 1. **Watch** ATLAS' video overview here: [YouTube](https://youtu.be/-nn9mkU5jqk)
+ 2. **ATLAS works entirely through mock MCP tools** - no external dependencies required. Just clone and run.
+
+ ## Overview
+
+ ATLAS is a multimodal AI work companion built for the Gradio x MCP Hackathon. It demonstrates how a voice-driven assistant can augment knowledge work by:
+
+ - **Listening** to your requests through voice (STT)
+ - **Speaking** responses and updates (TTS)
+ - **Seeing** your screen to understand context (vision)
+ - **Acting** on your behalf through MCP tool integrations
+
+ The goal is to showcase how modern LLMs can be integrated into daily workflows to handle context retrieval, document analysis, and environment automation, all through natural conversation.
+
+ ## Key Goals
+
+ 1. **Multimodal Work Companion**
+    - Voice: hands-free interaction during calls/meetings
+    - Vision: screen analysis for real-time context
+    - Text: conversational interface with persistent context
+
+ 2. **Practical Automation**
+    - Email context absorption
+    - Customer data retrieval
+    - Document lookup and analysis
+    - Environment automation (API permissions, integrations)
+
+ 3. **Proof-of-Concept (POC)**
+    - Simple RAG without database infrastructure
+    - Mock MCP tools for easy setup
+    - Adaptable to any office workflow
+
+ ## Functionalities & Offerings
+
+ ### 1. Audio Service
+ - **STT**: Converts voice input to text for hands-free operation
+ - **TTS**: Speaks AI responses for a natural conversation flow
+
+ ### 2. Text (LLM) Service
+ - Built on modern LLM APIs
+ - Handles multi-turn conversation with context retention
+ - Tool-calling orchestration for MCP integration
+ - Dynamic prompt engineering for context-aware responses
+
+ ### 3. Vision Service
+ - Screen capture analysis for understanding user context
+ - Document reading and interpretation
+ - Visual feedback integration into the conversation flow
+
+ ### 4. MCP Integration
+ - **Customer Data Tools**: Retrieve CRM information on demand
+ - **Document Retrieval**: Simple RAG implementation without a database
+ - **Environment Automation**: API permission management, integration testing
+ - **Email Processing**: Context absorption and response generation
+
+ ## Demo Scenario
+
+ The hackathon demo showcases a realistic CSM/sales rep workflow:
+
+ 1. **Email arrives** → ATLAS reads and absorbs context using vision
+ 2. **Customer data needed** → Retrieves it from the mock CRM
+ 3. **Documents requested** → Pulls relevant customer files
+ 4. **API call fails (401)** → User encounters an auth error in Postman
+ 5. **ATLAS fixes it** → Updates access permissions automatically
+ 6. **Verification** → The API call succeeds
+ 7. **Response draft** → Generates an email reply based on the full context
+
+ All through natural voice conversation.
+
+ ## Tech Stack
+
+ | Component        | Technology                               |
+ |------------------|------------------------------------------|
+ | UI Framework     | Gradio 6                                 |
+ | LLM              | HuggingFace/Nebius APIs                  |
+ | STT              | Speech-to-text model: Whisper            |
+ | TTS              | Text-to-speech model: Kokoro             |
+ | Vision           | Vision language model: Gemma             |
+ | Tool Integration | MCP (Model Context Protocol)             |
+ | RAG              | Simple document retrieval (no vector DB) |
+
+ ## Quickstart
+
+ 1. **Install dependencies**:
+    ```bash
+    pip install -r requirements.txt
+    ```
+
+ 2. **Configure** `.env` with your API keys.
+
+ 3. **Launch** the Gradio app:
+    ```bash
+    python app.py
+    ```
+
+ 4. **Interact** by voice or text:
+    - Click "Record" to begin voice interaction
+    - Ask ATLAS to retrieve customer data or pull documents
+    - Share your screen for visual context
+    - Request environment automations (API permissions, etc.)
+
+ ## Adaptability
+
+ While built for CSM/sales rep workflows, ATLAS adapts to any office role:
+
+ - **Support Engineers**: Ticket context + documentation retrieval + environment automation
+ - **Account Managers**: Client data + document analysis + meeting prep
+ - **Project Managers**: Task context + resource lookup + status updates
+ - **Developers**: API testing + documentation + environment management
+
+ Simply swap the MCP tools to match your workflow.
+
+ ## Architecture
+
+ ATLAS uses a simple but effective architecture:
+
+ 1. **Gradio UI** → User interaction layer (voice/text/vision)
+ 2. **LLM Core** → Reasoning and orchestration
+ 3. **MCP Tools** → Lightweight integrations (no heavy infra)
+ 4. **Simple RAG** → Document retrieval without vector databases
+
+ The focus is on clarity and practical value over architectural complexity.
+
+ ## Contact
+
+ <a.zamfir@hotmail.com>
+ LinkedIn: Andrei Zamfir <https://www.linkedin.com/in/andrei-d-zamfir/>
+
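For reference, step 2 of the Quickstart ("Configure `.env`") might look like the sketch below. `LLM_PROVIDER`/`AUDIO_PROVIDER` come from the committed `.env`, and `HF_TOKEN`/`NEBIUS_API_KEY` are the variables `app.py` reads; the values shown are placeholders, not real keys:

```
# .env - provider selection plus API keys (placeholder values)
LLM_PROVIDER=huggingface
AUDIO_PROVIDER=huggingface
HF_TOKEN=hf_xxxxxxxxxxxxxxxxxxxx
NEBIUS_API_KEY=your-nebius-api-key
```

Both keys can also be pasted into the app's Setup panel at runtime, which persists them back into `.env`.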
app.py ADDED
@@ -0,0 +1,667 @@
+ """
+ Atlas - Minimal VAD version based on Gradio's official pattern
+ """
+
+ import gradio as gr
+ import asyncio
+ import logging
+ import tempfile
+ import numpy as np
+ import wave
+ import io
+ import time
+ import re
+ import ast
+ import json
+ import os
+ import sys
+ import atexit
+ import subprocess
+
+ from dataclasses import dataclass, field
+ from pathlib import Path
+ from typing import Optional, List, Dict, Tuple
+
+ from services.mcp_client import MCPClient
+ from services.audio_service import AudioService
+ from services.llm_service import LLMService
+ from services.screen_service import get_screen_service
+ from config.settings import Settings
+ from config.prompts import get_generic_prompt
+
+ from openai import OpenAI
+
+ logging.basicConfig(level=logging.INFO, format="%(asctime)s - %(levelname)s - %(message)s")
+ logger = logging.getLogger(__name__)
+
+
+ # ============================================
+ # App State (like Gradio's official example)
+ # ============================================
+
+ @dataclass
+ class AppState:
+     stream: Optional[np.ndarray] = None
+     sampling_rate: int = 0
+     pause_detected: bool = False
+     started_talking: bool = False
+     stopped: bool = False
+     conversation: List[Dict] = field(default_factory=list)
+
+
+ # ============================================
+ # VAD Helper
+ # ============================================
+
+ def detect_pause(audio: np.ndarray, sr: int, state: AppState) -> bool:
+     """Simple energy-based pause detection."""
+     if audio is None or len(audio) < sr * 0.3:
+         return False
+
+     # Look at the last 0.5 seconds
+     window = int(sr * 0.5)
+     recent = audio[-window:] if len(audio) >= window else audio
+
+     # Energy (RMS), normalized for int16 input
+     recent_float = recent.astype(np.float32)
+     if recent.dtype == np.int16:
+         recent_float = recent_float / 32768.0
+     energy = float(np.sqrt(np.mean(recent_float ** 2)))
+
+     SILENCE_THRESHOLD = 0.01
+
+     # If the earlier audio was loud and the recent window is quiet, treat it as a pause
+     if len(audio) > window * 2:
+         earlier = audio[:-window]
+         earlier_float = earlier.astype(np.float32)
+         if earlier.dtype == np.int16:
+             earlier_float = earlier_float / 32768.0
+         earlier_energy = float(np.sqrt(np.mean(earlier_float ** 2)))
+
+         if earlier_energy > SILENCE_THRESHOLD * 2 and energy < SILENCE_THRESHOLD:
+             logger.info(f"Pause: earlier={earlier_energy:.4f}, now={energy:.4f}")
+             return True
+
+     return False
+
+
+ def audio_to_wav_file(audio: np.ndarray, sr: int) -> str:
+     """Save audio to a temp WAV file."""
+     audio_float = audio.astype(np.float32)
+     max_val = np.max(np.abs(audio_float))
+     if max_val > 0:
+         audio_float = audio_float / max_val
+     audio_int = (audio_float * 32767).astype(np.int16)
+
+     tmp = tempfile.NamedTemporaryFile(delete=False, suffix=".wav")
+     with wave.open(tmp.name, 'wb') as w:
+         w.setnchannels(1)
+         w.setsampwidth(2)
+         w.setframerate(sr)
+         w.writeframes(audio_int.tobytes())
+     return tmp.name
+
+ # ============================================
+ # MCP
+ # ============================================
+
+ def start_mcp_server():
+     """
+     Start the local CRM MCP server (crm_mcp_server.py) in a background process.
+
+     Controlled by Settings.mcp_auto_start (MCP_AUTO_START env var).
+     """
+     settings = Settings()
+     if not getattr(settings, "mcp_auto_start", True):
+         logger.info("MCP auto-start disabled via settings.")
+         return None
+
+     script_path = os.path.join(os.path.dirname(__file__), "crm_mcp_server.py")
+     cmd = [sys.executable, script_path]
+
+     try:
+         proc = subprocess.Popen(
+             cmd,
+             stdout=subprocess.PIPE,
+             stderr=subprocess.PIPE,
+         )
+         logger.info(f"Started CRM MCP server (PID={proc.pid}) using: {cmd}")
+     except Exception as e:
+         logger.error(f"Failed to start CRM MCP server: {e}")
+         return None
+
+     # Ensure the child process is cleaned up when the app exits
+     def _cleanup():
+         if proc.poll() is None:
+             logger.info("Stopping CRM MCP server...")
+             try:
+                 proc.terminate()
+             except Exception:
+                 pass
+
+     atexit.register(_cleanup)
+     return proc
+
+
+ # ============================================
+ # Chatbot
+ # ============================================
+
+
+ TOOL_CALL_RE = re.compile(
+     r'^\s*([a-zA-Z_][\w]*)\s*\((.*)\)\s*$', re.DOTALL
+ )
+
+
+ def parse_tool_call(text: str):
+     """
+     Extract tool_name and kwargs from something like:
+         tool_name(a=1, b="x")
+     Works even if surrounded by chatter or code fences.
+     """
+     # Remove code fences
+     cleaned = text.strip()
+     if "```" in cleaned:
+         parts = cleaned.split("```")
+         if len(parts) >= 2:
+             cleaned = parts[1]
+
+     # Find the last candidate line
+     pattern = re.compile(r'^([a-zA-Z_]\w*)\s*\((.*)\)\s*$')
+     for line in reversed(cleaned.splitlines()):
+         line = line.strip()
+         m = pattern.match(line)
+         if not m:
+             continue
+
+         logger.info(f"Tool call: {line}")
+
+         name, args_src = m.groups()
+         args_src = args_src.strip()
+
+         # No args
+         if not args_src:
+             return name, {}
+
+         try:
+             func_src = f"def _f({args_src}): pass"
+             module = ast.parse(func_src)
+             func_def = module.body[0]  # ast.FunctionDef
+             args = func_def.args
+
+             # Defaults align with the *last* len(defaults) positional args
+             offset = len(args.args) - len(args.defaults)
+             kwargs = {}
+             for arg, default in zip(args.args[offset:], args.defaults):
+                 kwargs[arg.arg] = ast.literal_eval(default)
+
+             return name, kwargs
+
+         except Exception as e:
+             logger.warning(f"Argument parse error: {e}")
+             return None
+
+     return None
+
+ class Chatbot:
+     def __init__(self):
+         self.settings = Settings()
+         self.audio_service = AudioService(
+             api_key=self.settings.hf_token,
+             stt_provider="fal-ai",
+             stt_model=self.settings.stt_model,
+             tts_model=self.settings.tts_model,
+         )
+         self.llm_service = LLMService(
+             api_key=self.settings.llm_api_key,
+             model_name=self.settings.effective_model_name,
+         )
+         self.vision_client = OpenAI(
+             base_url=self.settings.NEBIUS_BASE_URL,
+             api_key=self.settings.NEBIUS_API_KEY
+         )
+         self.vision_model = self.settings.NEBIUS_MODEL
+         self.screen_service = get_screen_service()
+         self.history: list[dict] = []
+
+         self.mcp = MCPClient()
+         try:
+             self.tools = self.mcp.list_tools()
+         except Exception as e:
+             # Fail gracefully; tools just won't be used
+             logging.exception("Failed to load tools from MCP server: %s", e)
+             self.tools = []
+
+         self.tools_description = self._build_tools_description()
+
+     def _build_tools_description(self) -> str:
+         """Build a human-readable list of tools for the system prompt."""
+         if not getattr(self, "tools", None):
+             return "No tools are currently available."
+
+         lines = []
+         for t in self.tools:
+             name = t.get("name", "unknown_tool")
+             desc = t.get("description", "")
+             props = t.get("inputSchema", {}).get("properties", {})
+             args = ", ".join(
+                 f'{k}: {v.get("type", "string")}'
+                 for k, v in props.items()
+             )
+             lines.append(f"- {name}({args}) — {desc}")
+         return "\n".join(lines)
+
+     async def process(self, text: str, tts_enabled: bool = True) -> Tuple[str, Optional[str]]:
+         if not text.strip():
+             return "", None
+
+         # ---------- Phase 1: ask the model what to do ----------
+         messages = self.llm_service.build_messages_with_tools(
+             system_prompt=get_generic_prompt(),
+             user_input=text,
+             tools_description=self.tools_description,
+             conversation_history=self.history,
+         )
+
+         first_reply = await self.llm_service.get_chat_completion(messages)
+
+         # Try to parse a tool call from the reply
+         tool_call = parse_tool_call(first_reply)
+         tool_result_str = None
+
+         if tool_call:
+             tool_name, tool_args = tool_call
+             try:
+                 result = self.mcp.call_tool(tool_name, tool_args)
+                 tool_result_str = (
+                     f"Tool {tool_name} succeeded with arguments {tool_args}.\n"
+                     f"Result (JSON):\n{json.dumps(result, indent=2)}"
+                 )
+             except Exception as e:
+                 tool_result_str = f"Tool {tool_name} failed: {e}"
+
+             # ---------- Phase 2: give the tool result back to the model ----------
+             messages = self.llm_service.build_messages_with_tools(
+                 system_prompt=get_generic_prompt(),
+                 user_input=text,
+                 tools_description=self.tools_description,
+                 conversation_history=self.history,
+                 tool_results=tool_result_str,
+             )
+             reply = await self.llm_service.get_chat_completion(messages)
+         else:
+             # No tool call – just treat the initial text as the final answer
+             reply = first_reply
+
+         # Save the final user + assistant messages in conversation history
+         self.history.append({"role": "user", "content": text})
+         self.history.append({"role": "assistant", "content": reply})
+
+         # ---------- Optional: TTS ----------
+         audio_path = None
+         if tts_enabled:
+             audio_bytes = await self.audio_service.text_to_speech(reply)
+             if audio_bytes:
+                 tmp = tempfile.NamedTemporaryFile(delete=False, suffix=".wav")
+                 tmp.write(audio_bytes)
+                 tmp.close()
+                 audio_path = tmp.name
+
+         return reply, audio_path
+
+
+     async def transcribe(self, audio_path: str) -> str:
+         return await self.audio_service.speech_to_text(audio_path)
+
+     async def capture_screen(self, state: AppState, tts_enabled: bool) -> Tuple[List[Dict], Optional[str], AppState, str]:
+         """Capture the screen and send it to the vision model."""
+         # Capture screenshot
+         capture = self.screen_service.capture()
+         if not capture:
+             return state.conversation, None, state, "❌ Capture failed"
+
+         logger.info(f"Screenshot captured: {capture.width}x{capture.height}")
+
+         # Send to vision model
+         try:
+             response = await asyncio.get_event_loop().run_in_executor(
+                 None,
+                 lambda: self.vision_client.chat.completions.create(
+                     model=self.vision_model,
+                     messages=[{
+                         "role": "user",
+                         "content": [
+                             {"type": "text", "text": "Ignore the ATLAS interface on the left-most side of the screen. Describe the content of the main window."},
+                             {"type": "image_url", "image_url": {"url": f"data:image/jpeg;base64,{capture.image_b64}"}}
+                         ]
+                     }]
+                 )
+             )
+             analysis = response.choices[0].message.content
+         except Exception as e:
+             logger.error(f"Vision error: {e}")
+             return state.conversation, None, state, f"❌ {str(e)}"
+
+         # Process through the main LLM
+         reply, audio_path = await self.process(f"[Screen analysis]: {analysis}", tts_enabled)
+
+         state.conversation.append({"role": "user", "content": "[Screen captured]"})
+         state.conversation.append({"role": "assistant", "content": f"Vision output: {reply}. Incorporate the data in your context. Do not call any tool yet; await further instructions."})
+
+         return state.conversation, audio_path, state, "📸 Done"
+
+ # Start the MCP server as soon as this module is imported
+ mcp_proc = start_mcp_server()
+
+ # Give it a brief moment to bind to the port
+ time.sleep(1.0)
+
+ bot = None  # type: ignore
+
+
+ def tokens_present() -> bool:
+     """Check whether the required env vars are already available."""
+     return bool(os.getenv("HF_TOKEN")) and bool(os.getenv("NEBIUS_API_KEY"))
+
+
+ ENV_PATH = Path(__file__).parent / ".env"
+
+ def upsert_env_var(key: str, value: str):
+     """
+     Update or append an env var in the .env file so it persists across runs.
+     Simple key=value per line, no fancy parsing.
+     """
+     if not value:
+         return
+
+     lines = []
+     if ENV_PATH.exists():
+         lines = ENV_PATH.read_text(encoding="utf-8").splitlines()
+
+     found = False
+     for i, line in enumerate(lines):
+         if line.startswith(f"{key}="):
+             lines[i] = f"{key}={value}"
+             found = True
+             break
+
+     if not found:
+         lines.append(f"{key}={value}")
+
+     ENV_PATH.write_text("\n".join(lines) + "\n", encoding="utf-8")
+
+
+ def ensure_bot_initialized() -> Optional[str]:
+     """
+     Initialize the global Chatbot if tokens are present.
+     Returns an error message if tokens are missing, otherwise None.
+     """
+     global bot
+
+     if bot is not None:
+         return None
+
+     hf_token = os.getenv("HF_TOKEN", "")
+     if not hf_token or len(hf_token) <= 10:
+         return "⚠️ HF_TOKEN missing or invalid. Please fill it in the Setup section."
+
+     # Optional debug: see what we are about to use
+     settings = Settings()
+     logger.info(
+         f"Initializing Chatbot with HF token prefix={settings.hf_token[:4]}..., len={len(settings.hf_token)}"
+     )
+
+     bot = Chatbot()
+     return None
+
+
+ def save_tokens(hf_token: str, nebius_api_key: str) -> str:
+     # Basic sanity check
+     if hf_token and not hf_token.strip().startswith("hf_"):
+         return "❌ HF_TOKEN does not look like a Hugging Face token (should start with 'hf_')."
+
+     if hf_token:
+         os.environ["HF_TOKEN"] = hf_token.strip()
+         upsert_env_var("HF_TOKEN", hf_token.strip())
+
+     if nebius_api_key:
+         os.environ["NEBIUS_API_KEY"] = nebius_api_key.strip()
+         upsert_env_var("NEBIUS_API_KEY", nebius_api_key.strip())
+
+     # NOW build the Chatbot + LLMService with the *current* env
+     err = ensure_bot_initialized()
+     if err:
+         return err
+     return "✅ Tokens saved and assistant initialized. You can now use Atlas."
+
+ def check_tokens_on_load():
+     if tokens_present():
+         # Env already has HF_TOKEN/NEBIUS_API_KEY: build the Chatbot immediately
+         err = ensure_bot_initialized()
+         msg = "✅ Tokens loaded from .env. Atlas is ready." if not err else err
+
+         return (
+             gr.update(visible=False),  # hf_token_box
+             gr.update(visible=False),  # nebius_key_box
+             msg,
+         )
+     else:
+         return (
+             gr.update(visible=True),
+             gr.update(visible=True),
+             "⚠️ Please paste your HF_TOKEN and NEBIUS_API_KEY to start.",
+         )
+
+
+ # ============================================
+ # Gradio Handlers
+ # ============================================
+
+ def process_audio(audio: tuple, state: AppState):
+     """Process an audio chunk. Return gr.Audio(recording=False) to stop."""
+     if audio is None:
+         return None, state
+
+     sr, data = audio
+
+     # Mono
+     if data.ndim > 1:
+         data = data.mean(axis=1)
+
+     # Accumulate
+     if state.stream is None:
+         state.stream = data
+         state.sampling_rate = sr
+     else:
+         state.stream = np.concatenate((state.stream, data))
+
+     # Energy check
+     data_float = data.astype(np.float32)
+     if data.dtype == np.int16:
+         data_float = data_float / 32768.0
+     energy = float(np.sqrt(np.mean(data_float ** 2)))
+
+     if energy > 0.015:
+         state.started_talking = True
+         logger.debug(f"Talking: energy={energy:.4f}")
+
+     # Pause check
+     state.pause_detected = detect_pause(state.stream, state.sampling_rate, state)
+
+     if state.pause_detected and state.started_talking:
+         logger.info("Pause detected - stopping recording")
+         return gr.Audio(recording=False), state
+
+     return None, state
+
+
+ async def respond(state: AppState, tts_enabled: bool):
+     """Transcribe and respond when recording stops."""
+     if bot is None:
+         msg = "⚠️ Configure HF_TOKEN and NEBIUS_API_KEY in the Setup section before using voice."
+         state.conversation.append({"role": "assistant", "content": msg})
+         return None, AppState(conversation=state.conversation), state.conversation
+
+     if state.stream is None or len(state.stream) < 1000:
+         logger.info("No audio")
+         return None, AppState(conversation=state.conversation), state.conversation
+
+     logger.info(f"Processing {len(state.stream)} samples...")
+
+     wav_path = audio_to_wav_file(state.stream, state.sampling_rate)
+     transcript = await bot.transcribe(wav_path)
+     logger.info(f"Transcript: {transcript}")
+
+     if not transcript.strip():
+         return None, AppState(conversation=state.conversation), state.conversation
+
+     reply, audio_path = await bot.process(transcript, tts_enabled)
+
+     state.conversation.append({"role": "user", "content": transcript})
+     state.conversation.append({"role": "assistant", "content": reply})
+
+     return audio_path, AppState(conversation=state.conversation), state.conversation
+
+
+ def start_recording(state: AppState):
+     """Restart recording."""
+     if not state.stopped:
+         return gr.Audio(recording=True)
+     return gr.Audio(recording=False)
+
+
+ async def send_text(text: str, state: AppState, tts_enabled: bool):
+     if not text.strip():
+         return state.conversation, None, state, ""
+
+     if bot is None:
+         msg = "⚠️ Configure HF_TOKEN and NEBIUS_API_KEY in the Setup section before chatting."
+         state.conversation.append({"role": "assistant", "content": msg})
+         return state.conversation, None, state, ""
+
+     reply, audio_path = await bot.process(text, tts_enabled)
+     state.conversation.append({"role": "user", "content": text})
+     state.conversation.append({"role": "assistant", "content": reply})
+
+     return state.conversation, audio_path, state, ""
+
+
+ async def capture_screen_handler(state: AppState, tts_enabled: bool):
+     if bot is None:
+         msg = "⚠️ Configure HF_TOKEN and NEBIUS_API_KEY in the Setup section before using screen capture."
+         return state.conversation, None, state, msg
+
+     return await bot.capture_screen(state, tts_enabled)
+
+
+ # ============================================
+ # UI
+ # ============================================
+
+ with gr.Blocks(title="ATLAS", theme=gr.themes.Default()) as demo:
+     gr.Markdown("### Atlas - CRM Voice Assistant")
+
+     state = gr.State(value=AppState())
+
+     with gr.Row():
+         with gr.Column(scale=2):
+             chatbot = gr.Chatbot(label="Conversation", height=400)
+
+             with gr.Row():
+                 txt = gr.Textbox(placeholder="Type your message here...", label="Input", scale=4)
+                 send_btn = gr.Button("Send", scale=1)
+
+         with gr.Column(scale=1):
+             # 🔐 Setup section
+             gr.Markdown("### Setup (API keys)")
+             hf_token_box = gr.Textbox(
+                 placeholder="Paste your HuggingFace token (HF_TOKEN)",
+                 label="HF_TOKEN",
+                 type="password"
+             )
+             nebius_key_box = gr.Textbox(
+                 placeholder="Paste your Nebius API key (NEBIUS_API_KEY)",
+                 label="NEBIUS_API_KEY",
+                 type="password"
+             )
+             save_keys_btn = gr.Button("Save keys & initialize Atlas")
+             setup_status = gr.Markdown("")
+
+             gr.Markdown("---")
+             gr.Markdown("### Speech module")
+
+             mic = gr.Audio(
+                 sources=["microphone"],
+                 type="numpy",
+                 label="Microphone",
+                 streaming=True,
+             )
+
+             audio_out = gr.Audio(label="Response", autoplay=True, streaming=True)
+             tts_toggle = gr.Checkbox(label="🔊 TTS Enabled", value=True)
+             stop_btn = gr.Button("🛑 Stop", variant="stop")
+
+             gr.Markdown("---")
+             gr.Markdown("### 🖥️ Screen")
+             capture_btn = gr.Button("📸 Capture Screen")
+             screen_status = gr.Textbox(label="Status", value="Ready", interactive=False)
+
+
+     # Stream -> detect pause -> stop
+     mic.stream(
+         process_audio,
+         inputs=[mic, state],
+         outputs=[mic, state],
+         stream_every=0.5,
+         time_limit=60,
+     )
+
+     # Stop -> transcribe -> respond -> restart
+     mic.stop_recording(
+         respond,
+         inputs=[state, tts_toggle],
+         outputs=[audio_out, state, chatbot],
+     ).then(
+         start_recording,
+         inputs=[state],
+         outputs=[mic],
+     )
+
+     stop_btn.click(
+         lambda: (AppState(stopped=True), gr.Audio(recording=False)),
+         outputs=[state, mic],
+     )
+
+     send_btn.click(send_text, inputs=[txt, state, tts_toggle], outputs=[chatbot, audio_out, state, txt])
+     txt.submit(send_text, inputs=[txt, state, tts_toggle], outputs=[chatbot, audio_out, state, txt])
+
+     # Screen capture
+     capture_btn.click(
+         capture_screen_handler,
+         inputs=[state, tts_toggle],
+         outputs=[chatbot, audio_out, state, screen_status]
+     )
+
+     # When the app loads, show/hide token inputs based on env
+     demo.load(
+         fn=check_tokens_on_load,
+         inputs=None,
+         outputs=[hf_token_box, nebius_key_box, setup_status],
+     )
+
+     # When the user clicks "Save keys"
+     save_keys_btn.click(
+         fn=save_tokens,
+         inputs=[hf_token_box, nebius_key_box],
+         outputs=[setup_status],
+     )
+
+
+ if __name__ == "__main__":
+     demo.launch(
+         server_name="0.0.0.0",
+         server_port=7860,
+     )
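To see the tool-call contract in isolation, here is a trimmed, self-contained sketch of the `parse_tool_call` approach used in `app.py` above (same one-line `name(kwargs)` format, same `ast` trick of parsing the arguments as defaults of a dummy function; the code-fence stripping and logging are omitted for brevity):

```python
import ast
import re


def parse_tool_call(text: str):
    """Extract (tool_name, kwargs) from a reply like: get_customer(customer_id="Walnut")."""
    pattern = re.compile(r'^([a-zA-Z_]\w*)\s*\((.*)\)\s*$')
    # Scan from the last line, since the model's call usually ends the reply
    for line in reversed(text.strip().splitlines()):
        m = pattern.match(line.strip())
        if not m:
            continue
        name, args_src = m.groups()
        if not args_src.strip():
            return name, {}
        # Wrap the "k=v" pairs as defaults of a dummy function so ast does the parsing
        module = ast.parse(f"def _f({args_src}): pass")
        args = module.body[0].args
        offset = len(args.args) - len(args.defaults)  # defaults align to the last args
        return name, {
            a.arg: ast.literal_eval(d)
            for a, d in zip(args.args[offset:], args.defaults)
        }
    return None


print(parse_tool_call('get_customer(customer_id="Walnut")'))
# → ('get_customer', {'customer_id': 'Walnut'})
```

Using `ast.literal_eval` on the defaults means only literal values (strings, numbers, lists, etc.) are accepted, so a reply can never smuggle executable code through the argument list.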
config/prompts.py ADDED
@@ -0,0 +1,198 @@
1
+ """System prompts for the ATLAS assistant."""
2
+
3
+ def get_generic_prompt() -> str:
4
+ """Return the main system prompt for the LLM."""
5
+ return """# ATLAS — Tool Use with High-Reliability Calling
6
+
7
+ You are **ATLAS**, an intelligent assistant with access to tools. Your priority is **accurate, schema-correct tool calls**. Treat tools as **verifiable data sources** that complement your reasoning.
8
+
9
+ ---
10
+
11
+ ## Capabilities (context)
12
+ 1) **CRM Access** — search/retrieve customers, deals, documents.
13
+ 2) **Screen Analysis** — reason over shared visuals. Utilize vision data to enhance your context, do not call tools based on the response.
14
+ 3) **Voice Interaction** — natural, concise speech.
15
+
16
+ > Use tools to fetch facts needed for the user's request.
17
+
18
+ ---
19
+
20
+ ## Tool Calling Protocol (STRICT)
21
+
22
+ ### A. Availability & Name Match
23
+ - Call **only** tools listed in `TOOLS` and **exactly** by their key (e.g., `get_customer`, **not** `getCustomer`).
24
+ - If a needed capability isn't in `TOOLS`, **do not invent** a call. Explain the limitation or ask for alternatives.
25
+
26
+ ### B. One Tool Per Message
27
+ - Each assistant turn either (1) calls **exactly one** tool, or (2) asks a **targeted question** if parameters are missing/ambiguous, or (3) answers from known information.
28
+ - After a tool call, wait for the tool's result before any further calls.
29
+
30
+ ### C. Function-Call Format (no extra text)
31
+ When you need to call a tool, reply **only** with:
32
+
33
+ name_of_the_tool(arg1="value1", arg2="value2")
34
+
35
+ - No prefix/suffix text appended to the tool i.e. tool_name=get_customer(customer_id="Walnut"). This should be get_customer(customer_id="Walnut")
36
+ - No Markdown fences.
37
+ - String values must be quoted. Numbers unquoted.
38
+ - Include **only** schema fields; no extras.
39
+
40
+ ### D. Parameter Sourcing & Validation (pre-call checklist)
41
+ Before **every** call, resolve and validate each parameter:
42
+
43
+ 1) **Source Map** each param:
44
+ - `user` (explicitly provided by the user)
45
+ - `context` (already mentioned in conversation)
46
+ - `prior_tool` (value returned by an earlier tool)
47
+ - `agent_generated` (only safe defaults allowed by schema; never guess IDs/names)
48
+
49
+ 2) **Schema Conformance**:
50
+ - Required fields present.
51
+ - Correct **types**, **casing**, and allowed values (e.g., `stage` matches provided enum).
52
+ - Respect defaults (e.g., `limit` defaults to 50; omit if not needed).
53
+
54
+ 3) **Disambiguation**:
55
+ - If a required parameter is **uncertain**, **ask a short clarifying question** instead of calling the tool.
56
+
57
+ ---
58
+
59
+ ## Response Guidelines (around tool calls)
60
+
61
+ - **Post-result:** cite **concrete fields** (names, IDs, amounts, timestamps). Avoid generic claims like “found an error”; quote the actual value or message.
62
+ - **Screen/Voice context:** reference what you see/hear but **do not** call tools unless needed to satisfy the request.
63
+ - **Uncertainty:** say “I'm not sure” and propose a data-gathering step (with a proper tool call) instead of guessing.
64
+
65
+ ---
66
+
67
+ ## Reliability Heuristics
68
+
69
+ 1) **Plan → Execute**
70
+ - First, clarify the goal and required parameters.
71
+ - Then choose the **single** best tool.
72
+ - Execute exactly one call.
73
+
74
+ 2) **Parameter Echo (mentally, not in the tool call)**
75
+ - Ensure each param is justified by a source map before calling.
76
+ - Example mental check: `customer_id ← prior_tool:get_customers[...]` or `user provided`.
77
+
78
+ 3) **No Hallucinated Entities**
79
+ - Do **not** invent customer IDs, deal IDs, document names, or fields. If you only have “Walnut” as a **name** and the schema requires `customer_id`, pass the name as the `customer_id` value.
80
+
81
+ 4) **Branching Discipline**
82
+ - If a result contradicts your assumption (e.g., access is disabled), choose the next **logical** tool (e.g., `get_access` → possibly `set_access` **only with consent**) or report options; don't chain speculative calls.
83
+
84
+ 5) **Minimal Surface**
85
+ - Prefer precise queries (filters like `status`, `industry`, `stage`, `limit`) to reduce noise.
86
+
87
+ ---
88
+
89
+ ## Tool Catalog Reminders (schema edges)
90
+
91
+ - `get_customer` requires **`customer_id`**.
92
+ - `get_deals` supports filters: `stage`, `customer_id`, `owner`, `min_value`, `limit`. Ensure `stage` matches the allowed set.
93
+ - `read_document` requires **`name`** (exact or partial). Do not make up document content; call the tool to retrieve it.
94
+ - `get_pipeline_summary` and `get_documents` take **no params** (empty object).
95
+
96
+ ---
97
+
98
+ ## Failure Handling & Recovery
99
+
100
+ - On failure, explain succinctly:
101
+ - **What** failed (`tool`, error text),
102
+ - **Why** (schema mismatch, missing param, backend error),
103
+ - **Next**: propose a compliant retry or an alternative tool.
104
+ - Do **not** retry automatically unless you fixed the cause (e.g., added required param).
105
+
106
+ ---
107
+
108
+ > When you are ready to call a tool, send **only** the function call in the required format. Otherwise, ask a clarifying question or summarize findings with cited fields.
109
+
110
+ """
111
+
112
+
113
+ def get_vision_prompt() -> str:
114
+ """Return the prompt for the vision/screen analysis model."""
115
+ return """You are a visual analysis assistant helping a user with their screen content.
116
+
117
+ ## Mission
118
+ Analyze the current screen and **prioritize the actionable email** in the foreground. Extract the problem, requests, blockers, and any embedded evidence (e.g., error snippets), then output a **structured to-do object** the user can act on immediately.
119
+
120
+ ## Guidelines
121
+ 1) **Be Specific:** Cite on-screen text (subjects, status codes, error messages, labels).
122
+ 2) **Be Relevant:** Focus on the active email/thread and its required actions.
123
+ 3) **Be Concise:** Short bullet summaries + a tight to-do list.
124
+ 4) **Note Changes (if follow-up):** Briefly mention what’s new versus prior view.
125
+ 5) **PII Caution:** Redact personal names/emails (use roles like “Technical Account Manager”). Company names/products are fine.
126
+ 6) **Evidence First:** Prefer exact on-screen snippets for errors/status (e.g., `401 Unauthorized`, `"Access not authorized for this company."`).
127
+
128
+ ## What to Extract (Email-Focused)
129
+ - Active app/window (e.g., “Outlook/Email”)
130
+ - Subject / thread topic
131
+ - Sender role (redact personal name), company (if shown)
132
+ - Key asks/requirements (bullets)
133
+ - Pain points/blockers (bullets)
134
+ - Evidence snippet(s) (codes/messages/log lines visible)
135
+ - Attachments or embedded artifacts (if any)
136
+ - Implied deadlines/urgency (if stated)
137
+ - Suggested immediate next steps
138
+
139
+ ## Output (JSON only)
140
+ Return **only** a JSON object with this shape:
141
+
142
+ {
143
+ "context": {
144
+ "active_app": "<string>",
145
+ "subject": "<string>",
146
+ "sender_role": "<string|null>",
147
+ "sender_company": "<string|null>",
148
+ "received_time": "<string|null>"
149
+ },
150
+ "summary": "<1-2 sentence plain-language recap>",
151
+ "requirements": ["<ask 1>", "<ask 2>", "..."],
152
+ "blockers": ["<blocker 1>", "..."],
153
+ "evidence": [
154
+ {"type": "status", "value": "<e.g., 401 Unauthorized>"},
155
+ {"type": "message", "value": "<exact error text>"}
156
+ ],
157
+ "todos": [
158
+ {"title": "<actionable task>", "owner": "<me|team>", "priority": "<high|med|low>", "due": "<ISO8601|null>"},
159
+ {"title": "...", "owner": "...", "priority": "...", "due": "..."}
160
+ ],
161
+ "suggested_reply": "<concise draft reply to sender without PII>"
162
+ }
163
+
164
+ If a field is unknown, use null. Keep values brief and factual.
165
+
166
+ ## Ignore
167
+ - System tray and background windows unless directly relevant.
168
+ - Personal info (names/emails) — redact or omit.
169
+
170
+ Respond with the JSON object only.
171
+
172
+ """
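The JSON contract above can be sanity-checked on the consumer side with a minimal validator sketch. The sample payload and the helper name below are illustrative assumptions, not part of the app:

```python
import json

# Hypothetical sample mimicking the vision model's expected reply.
sample = json.dumps({
    "context": {"active_app": "Outlook", "subject": "API access issue",
                "sender_role": "Technical Account Manager",
                "sender_company": "Walnut", "received_time": None},
    "summary": "Customer reports 401 errors when calling the API.",
    "requirements": ["Restore endpoint access"],
    "blockers": ["401 Unauthorized"],
    "evidence": [{"type": "status", "value": "401 Unauthorized"}],
    "todos": [{"title": "Check access flag", "owner": "me",
               "priority": "high", "due": None}],
    "suggested_reply": "Thanks for flagging this; we are investigating.",
})

# Top-level keys required by the prompt's output shape.
REQUIRED_KEYS = {"context", "summary", "requirements", "blockers",
                 "evidence", "todos", "suggested_reply"}

def validate_vision_output(raw: str) -> dict:
    """Parse the model reply and check the top-level contract."""
    obj = json.loads(raw)
    missing = REQUIRED_KEYS - obj.keys()
    if missing:
        raise ValueError(f"missing keys: {sorted(missing)}")
    return obj

print(validate_vision_output(sample)["context"]["active_app"])  # → Outlook
```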
173
+
174
+
175
+ def get_tool_execution_prompt() -> str:
176
+ """Return the prompt for tool execution context."""
177
+ return """Based on the tool execution results, provide a helpful response to the user.
178
+
179
+ If the tool succeeded:
180
+ - Summarize the key information returned
181
+ - Highlight what's most relevant to the user's query
182
+ - Suggest follow-up actions if appropriate
183
+
184
+ If the tool failed:
185
+ - Explain what went wrong in simple terms
186
+ - Suggest alternative approaches
187
+ - Offer to try a different tool if available
188
+ """
189
+
190
+
191
+ def get_vad_context_prompt() -> str:
192
+ """Return the prompt for voice activity detection context."""
193
+ return """The user is speaking to you via voice. Keep your response:
194
+ - Conversational and natural
195
+ - Concise (suitable for text-to-speech)
196
+ - Clear and easy to understand when heard
197
+ - Free of complex formatting (no bullet points, tables, etc.)
198
+ """
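The strict single-line call format defined in `get_system_prompt` (e.g. `get_customer(customer_id="Walnut")`) can be recovered from a model reply with a small regex sketch. The parser below is an assumption for illustration, not part of the app's actual dispatch code:

```python
import re

# Bare call: name(args), nothing before or after, per the protocol.
CALL_RE = re.compile(r'^(?P<name>[a-z_][a-z0-9_]*)\((?P<args>.*)\)$')
# key="string" or key=number, comma-separated.
ARG_RE = re.compile(r'(\w+)\s*=\s*("([^"]*)"|[-+]?\d+(?:\.\d+)?)')

def parse_tool_call(text: str):
    """Return (tool_name, kwargs) or None if text is not a bare call."""
    m = CALL_RE.match(text.strip())
    if not m:
        return None
    kwargs = {}
    for key, raw, quoted in ARG_RE.findall(m.group("args")):
        if raw.startswith('"'):
            kwargs[key] = quoted          # quoted values stay strings
        else:
            kwargs[key] = float(raw) if "." in raw else int(raw)
    return m.group("name"), kwargs

print(parse_tool_call('get_customer(customer_id="Walnut")'))
# → ('get_customer', {'customer_id': 'Walnut'})
```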
config/settings.py ADDED
@@ -0,0 +1,154 @@
1
+ """Application-wide configuration settings."""
2
+
3
+ import os
4
+ from dataclasses import dataclass
5
+ from typing import Optional
6
+ from pathlib import Path
7
+ from dotenv import load_dotenv
8
+
9
+ load_dotenv()
10
+
11
+
12
+ @dataclass
13
+ class Settings:
14
+ """Application-wide configuration settings."""
15
+
16
+ # ============================================
17
+ # LLM Provider Settings
18
+ # ============================================
19
+ llm_provider: str = os.getenv("LLM_PROVIDER", "auto")
20
+
21
+ # Hugging Face settings
22
+ hf_token: str = os.getenv("HF_TOKEN", "")
23
+ hf_chat_model: str = os.getenv("HF_CHAT_MODEL", "Qwen/Qwen2.5-7B-Instruct")
24
+ hf_temperature: float = float(os.getenv("HF_TEMPERATURE", "0.001"))
25
+ hf_max_new_tokens: int = int(os.getenv("HF_MAX_NEW_TOKENS", "512"))
26
+
27
+ # Model settings
28
+ model_name: str = os.getenv("MODEL_NAME", "Qwen/Qwen2.5-7B-Instruct")
29
+
30
+ # ============================================
31
+ # Audio Provider Settings
32
+ # ============================================
33
+ audio_provider: str = os.getenv("AUDIO_PROVIDER", "auto")
34
+ tts_model: str = os.getenv("TTS_MODEL", "hexgrad/Kokoro-82M")
35
+ stt_model: str = os.getenv("STT_MODEL", "openai/whisper-large-v3")
36
+
37
+ # ============================================
38
+ # VAD (Voice Activity Detection) Settings
39
+ # ============================================
40
+ vad_enabled: bool = os.getenv("VAD_ENABLED", "true").lower() == "true"
41
+ vad_sample_rate: int = int(os.getenv("VAD_SAMPLE_RATE", "16000"))
42
+ vad_frame_duration_ms: int = int(os.getenv("VAD_FRAME_DURATION_MS", "30"))
43
+ vad_aggressiveness: int = int(os.getenv("VAD_AGGRESSIVENESS", "2"))
44
+ vad_speech_threshold: float = float(os.getenv("VAD_SPEECH_THRESHOLD", "0.5"))
45
+ vad_silence_threshold: float = float(os.getenv("VAD_SILENCE_THRESHOLD", "0.3"))
46
+ vad_min_speech_ms: int = int(os.getenv("VAD_MIN_SPEECH_MS", "300"))
47
+ vad_max_speech_s: float = float(os.getenv("VAD_MAX_SPEECH_S", "30.0"))
48
+ vad_post_speech_silence_ms: int = int(os.getenv("VAD_POST_SPEECH_SILENCE_MS", "800"))
49
+
50
+ # ============================================
51
+ # Screen/Vision Settings
52
+ # ============================================
53
+ screen_capture_interval: float = float(os.getenv("SCREEN_CAPTURE_INTERVAL", "1.0"))
54
+ screen_compression_quality: int = int(os.getenv("SCREEN_COMPRESSION_QUALITY", "50"))
55
+ max_width: int = int(os.getenv("SCREEN_MAX_WIDTH", "3440"))
56
+ max_height: int = int(os.getenv("SCREEN_MAX_HEIGHT", "1440"))
57
+
58
+ # Vision model (Nebius)
59
+ NEBIUS_MODEL: str = os.getenv("NEBIUS_MODEL", "google/gemma-3-27b-it-fast")
60
+ NEBIUS_API_KEY: str = os.getenv("NEBIUS_API_KEY", "")
61
+ NEBIUS_BASE_URL: str = os.getenv("NEBIUS_BASE_URL", "https://api.studio.nebius.com/v1/")
62
+
63
+ # Auto-enable vision when screen context is needed
64
+ vision_auto_enabled: bool = os.getenv("VISION_AUTO_ENABLED", "true").lower() == "true"
65
+ vision_fps: float = float(os.getenv("VISION_FPS", "0.05")) # Frames per second
66
+
67
+ # ============================================
68
+ # MCP Server Settings
69
+ # ============================================
70
+ mcp_server_url: str = os.getenv("MCP_SERVER_URL", "http://localhost:8000")
71
+ mcp_auto_start: bool = os.getenv("MCP_AUTO_START", "true").lower() == "true"
72
+
73
+ # ============================================
74
+ # CRM Data Settings
75
+ # ============================================
76
+ crm_data_dir: str = os.getenv("CRM_DATA_DIR", "./data")
77
+
78
+ # ============================================
79
+ # Hyper-V Settings (Legacy)
80
+ # ============================================
81
+ hyperv_enabled: bool = os.getenv("HYPERV_ENABLED", "false").lower() == "true"
82
+ hyperv_host: str = os.getenv("HYPERV_HOST", "localhost")
83
+ hyperv_username: Optional[str] = os.getenv("HYPERV_USERNAME")
84
+ hyperv_password: Optional[str] = os.getenv("HYPERV_PASSWORD")
85
+
86
+ # ============================================
87
+ # Application Settings
88
+ # ============================================
89
+ max_conversation_history: int = int(os.getenv("MAX_CONVERSATION_HISTORY", "50"))
90
+ temp_dir: str = os.getenv("TEMP_DIR", "./temp")
91
+ log_level: str = os.getenv("LOG_LEVEL", "INFO")
92
+
93
+ # Feature flags
94
+ enable_screen_sharing_button: bool = os.getenv("ENABLE_SCREEN_SHARING_BUTTON", "true").lower() == "true"
95
+ enable_voice_input: bool = os.getenv("ENABLE_VOICE_INPUT", "true").lower() == "true"
96
+
97
+ def __post_init__(self):
98
+ """Initialize directories and validate settings."""
99
+ # Ensure necessary directories exist
100
+ Path(self.temp_dir).mkdir(exist_ok=True, parents=True)
101
+ Path("./config").mkdir(exist_ok=True, parents=True)
102
+ Path("./logs").mkdir(exist_ok=True, parents=True)
103
+ Path(self.crm_data_dir).mkdir(exist_ok=True, parents=True)
104
+
105
+ # 🔁 Refresh dynamic, env-backed values so they pick up changes done at runtime
106
+ self.hf_token = os.getenv("HF_TOKEN", self.hf_token)
107
+ self.NEBIUS_API_KEY = os.getenv("NEBIUS_API_KEY", self.NEBIUS_API_KEY)
108
+
109
+
110
+ def is_hf_token_valid(self) -> bool:
111
+ """Check if HuggingFace token is set and looks like a real HF token."""
112
+ token = os.getenv("HF_TOKEN", "") # always read the latest env
113
+ return bool(token and token.startswith("hf_") and len(token) > 20)
114
+
115
+ @property
116
+ def effective_llm_provider(self) -> str:
117
+ if self.llm_provider == "auto":
118
+ return "huggingface" if self.is_hf_token_valid() else "openai"
119
+ return self.llm_provider
120
+
121
+ @property
122
+ def effective_audio_provider(self) -> str:
123
+ if self.audio_provider == "auto":
124
+ return "huggingface" if self.is_hf_token_valid() else "openai"
125
+ return self.audio_provider
126
+
127
+ @property
128
+ def llm_endpoint(self) -> str:
129
+ if self.effective_llm_provider == "huggingface":
130
+ return f"https://api-inference.huggingface.co/models/{self.hf_chat_model}"
131
+ return getattr(self, 'openai_endpoint', '')
132
+
133
+ @property
134
+ def llm_api_key(self) -> str:
135
+ if self.effective_llm_provider == "huggingface":
136
+ return os.getenv("HF_TOKEN", "") # latest HF token
137
+ return getattr(self, "openai_api_key", "")
138
+
139
+ @property
140
+ def effective_model_name(self) -> str:
141
+ return self.hf_chat_model if self.effective_llm_provider == "huggingface" else self.model_name
142
+
143
+ def get_vad_config(self) -> dict:
144
+ """Get VAD configuration as a dictionary."""
145
+ return {
146
+ "sample_rate": self.vad_sample_rate,
147
+ "frame_duration_ms": self.vad_frame_duration_ms,
148
+ "aggressiveness": self.vad_aggressiveness,
149
+ "speech_threshold": self.vad_speech_threshold,
150
+ "silence_threshold": self.vad_silence_threshold,
151
+ "min_speech_duration_ms": self.vad_min_speech_ms,
152
+ "max_speech_duration_s": self.vad_max_speech_s,
153
+ "post_speech_silence_ms": self.vad_post_speech_silence_ms,
154
+ }
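`Settings` parses boolean feature flags with the `os.getenv(...).lower() == "true"` idiom, where only the literal string `true` (any casing) enables a flag. A minimal sketch of that pattern (the `env_flag` helper is an assumption, not part of the file):

```python
import os

def env_flag(name: str, default: str = "false") -> bool:
    """Parse a boolean flag the way Settings does: only the
    literal string 'true' (case-insensitive) enables it."""
    return os.getenv(name, default).lower() == "true"

os.environ["VAD_ENABLED"] = "True"
os.environ["HYPERV_ENABLED"] = "yes"   # anything but 'true' reads as False

print(env_flag("VAD_ENABLED"), env_flag("HYPERV_ENABLED"))  # → True False
```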
crm_mcp_server.py ADDED
@@ -0,0 +1,525 @@
1
+ """
2
+ CRM MCP Server - Local MCP server with mocked CRM data.
3
+ Provides tools for customer, deal, and document management.
4
+ """
5
+
6
+ import json
7
+ import os
8
+ import logging
9
+ from pathlib import Path
10
+ from typing import Optional, List, Dict, Any
11
+ from datetime import datetime
12
+
13
+ from fastapi import FastAPI, HTTPException
14
+ from fastapi.middleware.cors import CORSMiddleware
15
+ from pydantic import BaseModel
16
+
17
+ # Configure logging
18
+ logging.basicConfig(level=logging.INFO)
19
+ logger = logging.getLogger(__name__)
20
+
21
+ # Initialize FastAPI app
22
+ app = FastAPI(title="CRM MCP Server", version="1.0.0")
23
+
24
+ # Add CORS middleware
25
+ app.add_middleware(
26
+ CORSMiddleware,
27
+ allow_origins=["*"],
28
+ allow_credentials=True,
29
+ allow_methods=["*"],
30
+ allow_headers=["*"],
31
+ )
32
+
33
+ # Data directory path
34
+ DATA_DIR = Path(__file__).parent / "data"
35
+ ACCESS_FILE = DATA_DIR / "access.json"
36
+
37
+
38
+ class ToolCallRequest(BaseModel):
39
+ name: str
40
+ arguments: Dict[str, Any] = {}
41
+
42
+
43
+ class ToolResponse(BaseModel):
44
+ success: bool
45
+ result: Any = None
46
+ error: Optional[str] = None
47
+
48
+
49
+ # ============================================
50
+ # Data Loading Functions
51
+ # ============================================
52
+
53
+ def load_customers() -> Dict:
54
+ """Load customers from JSON file."""
55
+ customers_file = DATA_DIR / "customers.json"
56
+ if customers_file.exists():
57
+ with open(customers_file, "r") as f:
58
+ return json.load(f)
59
+ return {"customers": []}
60
+
61
+
62
+ def load_deals() -> Dict:
63
+ """Load deals from JSON file."""
64
+ deals_file = DATA_DIR / "deals.json"
65
+ if deals_file.exists():
66
+ with open(deals_file, "r") as f:
67
+ return json.load(f)
68
+ return {"deals": [], "pipeline_summary": {}}
69
+
70
+
71
+ def load_documents() -> List[Dict]:
72
+ """Load list of available documents."""
73
+ docs_dir = DATA_DIR / "documents"
74
+ documents = []
75
+ if docs_dir.exists():
76
+ for doc_path in docs_dir.glob("*"):
77
+ if doc_path.is_file():
78
+ stat = doc_path.stat()
79
+ documents.append({
80
+ "name": doc_path.name,
81
+ #"path": str(doc_path),
82
+ "size_bytes": stat.st_size,
83
+ "modified_at": datetime.fromtimestamp(stat.st_mtime).isoformat(),
84
+ "type": doc_path.suffix.lstrip(".")
85
+ })
86
+ return documents
87
+
88
+ def load_access_data() -> Dict[str, Any]:
89
+ if ACCESS_FILE.exists():
90
+ with open(ACCESS_FILE, "r", encoding="utf-8") as f:
91
+ return json.load(f)
92
+ return {"access": []}
93
+
94
+
95
+ def save_access_data(data: Dict[str, Any]) -> None:
96
+ ACCESS_FILE.parent.mkdir(parents=True, exist_ok=True)
97
+ with open(ACCESS_FILE, "w", encoding="utf-8") as f:
98
+ json.dump(data, f, indent=2)
99
+
100
+
101
+ # ============================================
102
+ # Tool Implementations
103
+ # ============================================
104
+
105
+ def get_customers(
106
+ status: Optional[str] = None,
107
+ industry: Optional[str] = None,
108
+ limit: int = 50
109
+ ) -> Dict:
110
+ """Get list of customers with optional filtering."""
111
+ data = load_customers()
112
+ customers = data.get("customers", [])
113
+
114
+ # Apply filters
115
+ if status:
116
+ customers = [c for c in customers if c.get("status", "").lower() == status.lower()]
117
+ if industry:
118
+ customers = [c for c in customers if industry.lower() in c.get("industry", "").lower()]
119
+
120
+ # Apply limit
121
+ customers = customers[:limit]
122
+
123
+ return {
124
+ "total": len(customers),
125
+ "customers": customers
126
+ }
127
+
128
+
129
+ def get_customer(customer_id: str) -> Dict:
130
+ """Get a specific customer by ID."""
131
+ data = load_customers()
132
+ for customer in data.get("customers", []):
133
+ if customer.get("id") == customer_id:
134
+ # Also get related deals
135
+ deals_data = load_deals()
136
+ related_deals = [
137
+ d for d in deals_data.get("deals", [])
138
+ if d.get("customer_id") == customer_id
139
+ ]
140
+ customer["related_deals"] = related_deals
141
+ customer.pop("tags")
142
+ return customer
143
+ return {"error": f"Customer {customer_id} not found"}
144
+
145
+
146
+ def get_deals(
147
+ stage: Optional[str] = None,
148
+ customer_id: Optional[str] = None,
149
+ owner: Optional[str] = None,
150
+ min_value: Optional[float] = None,
151
+ limit: int = 50
152
+ ) -> Dict:
153
+ """Get list of deals with optional filtering."""
154
+ data = load_deals()
155
+ deals = data.get("deals", [])
156
+
157
+ # Apply filters
158
+ if stage:
159
+ deals = [d for d in deals if d.get("stage", "").lower() == stage.lower()]
160
+ if customer_id:
161
+ deals = [d for d in deals if d.get("customer_id") == customer_id]
162
+ if owner:
163
+ deals = [d for d in deals if owner.lower() in d.get("owner", "").lower()]
164
+ if min_value is not None:
165
+ deals = [d for d in deals if d.get("value", 0) >= min_value]
166
+
167
+ # Apply limit
168
+ deals = deals[:limit]
169
+
170
+ return {
171
+ "total": len(deals),
172
+ "deals": deals,
173
+ "pipeline_summary": data.get("pipeline_summary", {})
174
+ }
175
+
176
+
177
+ def get_deal(deal_id: str) -> Dict:
178
+ """Get a specific deal by ID."""
179
+ data = load_deals()
180
+ for deal in data.get("deals", []):
181
+ if deal.get("id") == deal_id:
182
+ return deal
183
+ return {"error": f"Deal {deal_id} not found"}
184
+
185
+
186
+ def get_pipeline_summary() -> Dict:
187
+ """Get sales pipeline summary."""
188
+ data = load_deals()
189
+ deals = data.get("deals", [])
190
+
191
+ # Calculate fresh summary
192
+ open_deals = [d for d in deals if d.get("stage") not in ["closed_won", "closed_lost"]]
193
+
194
+ summary = {
195
+ "total_deals": len(deals),
196
+ "open_deals": len(open_deals),
197
+ "total_pipeline_value": sum(d.get("value", 0) for d in open_deals),
198
+ "weighted_value": sum(
199
+ d.get("value", 0) * d.get("probability", 0) / 100
200
+ for d in open_deals
201
+ ),
202
+ "by_stage": {},
203
+ "by_owner": {},
204
+ "expected_closes_this_month": []
205
+ }
206
+
207
+ # Group by stage
208
+ for deal in deals:
209
+ stage = deal.get("stage", "unknown")
210
+ if stage not in summary["by_stage"]:
211
+ summary["by_stage"][stage] = {"count": 0, "value": 0}
212
+ summary["by_stage"][stage]["count"] += 1
213
+ summary["by_stage"][stage]["value"] += deal.get("value", 0)
214
+
215
+ # Group by owner
216
+ for deal in open_deals:
217
+ owner = deal.get("owner", "Unassigned")
218
+ if owner not in summary["by_owner"]:
219
+ summary["by_owner"][owner] = {"count": 0, "value": 0}
220
+ summary["by_owner"][owner]["count"] += 1
221
+ summary["by_owner"][owner]["value"] += deal.get("value", 0)
222
+
223
+ # Deals expected to close this month
224
+ current_month = datetime.now().strftime("%Y-%m")
225
+ for deal in open_deals:
226
+ close_date = deal.get("expected_close", "")
227
+ if close_date.startswith(current_month):
228
+ summary["expected_closes_this_month"].append({
229
+ "id": deal.get("id"),
230
+ "title": deal.get("title"),
231
+ "value": deal.get("value"),
232
+ "probability": deal.get("probability")
233
+ })
234
+
235
+ return summary
236
+
237
+
238
+ def get_documents() -> Dict:
239
+ """Get list of available documents."""
240
+ documents = load_documents()
241
+ return {
242
+ "total": len(documents),
243
+ "documents": documents,
244
+ "instructions": "Use the exact file name as the 'name' parameter for the read_document tool"
245
+ }
246
+
247
+
248
+ def read_document(name: str) -> Dict:
249
+ """Read content of a specific document."""
250
+ docs_dir = DATA_DIR / "documents"
251
+ doc_path = docs_dir / name
252
+
253
+ if not doc_path.exists():
254
+ # Try to find partial match
255
+ for doc in docs_dir.glob("*"):
256
+ if name.lower() in doc.name.lower():
257
+ doc_path = doc
258
+ break
259
+
260
+ if not doc_path.exists():
261
+ return {"error": f"Document '{name}' not found"}
262
+
263
+ try:
264
+ with open(doc_path, "r", encoding="utf-8") as f:
265
+ content = f.read()
266
+ return {
267
+ "name": doc_path.name,
268
+ "content": content,
269
+ "size_bytes": len(content),
270
+ "type": doc_path.suffix.lstrip(".")
271
+ }
272
+ except Exception as e:
273
+ return {"error": f"Failed to read document: {str(e)}"}
274
+
275
+
276
+ def search_documents(query: str) -> Dict:
277
+ """Search documents by content."""
278
+ docs_dir = DATA_DIR / "documents"
279
+ query_lower = query.lower()
280
+
281
+ matches = []
282
+ if docs_dir.exists():
283
+ for doc_path in docs_dir.glob("*"):
284
+ if doc_path.is_file():
285
+ try:
286
+ with open(doc_path, "r", encoding="utf-8") as f:
287
+ content = f.read()
288
+
289
+ if query_lower in content.lower() or query_lower in doc_path.name.lower():
290
+ # Find relevant excerpts
291
+ lines = content.split("\n")
292
+ relevant_lines = [
293
+ line.strip() for line in lines
294
+ if query_lower in line.lower()
295
+ ][:3] # Max 3 relevant lines
296
+
297
+ matches.append({
298
+ "name": doc_path.name,
299
+ "type": doc_path.suffix.lstrip("."),
300
+ "relevant_excerpts": relevant_lines
301
+ })
302
+ except Exception:
303
+ continue
304
+
305
+ return {
306
+ "query": query,
307
+ "total_matches": len(matches),
308
+ "documents": matches
309
+ }
310
+
311
+
312
+ def get_access(customer_name: str) -> Dict[str, Any]:
313
+ """Look up whether access is enabled for the given customer."""
314
+ data = load_access_data()
315
+ name_lower = customer_name.lower()
316
+
317
+ for entry in data.get("access", []):
318
+ if entry.get("customer_name", "").lower() == name_lower:
319
+ return entry
320
+
321
+ return {"customer_name": customer_name, "enabled": False}
322
+
323
+
324
+ def set_access(customer_name: str) -> Dict[str, Any]:
325
+ """Set access enabled = true for the given customer."""
326
+ data = load_access_data()
327
+ name_lower = customer_name.lower()
328
+
329
+ found = None
330
+ for entry in data.get("access", []):
331
+ if entry.get("customer_name", "").lower() == name_lower:
332
+ entry["enabled"] = True
333
+ found = entry
334
+ break
335
+
336
+ if not found:
337
+ found = {"customer_name": customer_name, "enabled": True}
338
+ data.setdefault("access", []).append(found)
339
+
340
+ save_access_data(data)
341
+ return found
342
+
343
+
344
+
345
+ # ============================================
346
+ # Tool Registry
347
+ # ============================================
348
+
349
+ TOOLS = {
350
+ "get_customers": {
351
+ "description": "Get list of customers. Optionally filter by status (active/inactive/prospect) or industry.",
352
+ "function": get_customers,
353
+ "inputSchema": {
354
+ "type": "object",
355
+ "properties": {
356
+ "status": {"type": "string", "description": "Filter by status: active, inactive, or prospect"},
357
+ "industry": {"type": "string", "description": "Filter by industry"},
358
+ "limit": {"type": "integer", "description": "Maximum number of results", "default": 50}
359
+ }
360
+ }
361
+ },
362
+ "get_customer": {
363
+ "description": "Get detailed information about a specific customer by ID (the customer name serves as the ID).",
364
+ "function": get_customer,
365
+ "inputSchema": {
366
+ "type": "object",
367
+ "properties": {
368
+ "customer_id": {"type": "string", "description": "Customer ID (e.g., Walnut)"}
369
+ },
370
+ "required": ["customer_id"]
371
+ }
372
+ },
373
+ "get_deals": {
374
+ "description": "Get list of deals/opportunities. Optionally filter by stage, customer, owner, or minimum value.",
375
+ "function": get_deals,
376
+ "inputSchema": {
377
+ "type": "object",
378
+ "properties": {
379
+ "stage": {"type": "string", "description": "Filter by stage: qualification, demo, proposal, negotiation, closed_won, closed_lost"},
380
+ "customer_id": {"type": "string", "description": "Filter by customer ID"},
381
+ "owner": {"type": "string", "description": "Filter by deal owner name"},
382
+ "min_value": {"type": "number", "description": "Minimum deal value"},
383
+ "limit": {"type": "integer", "description": "Maximum number of results", "default": 50}
384
+ }
385
+ }
386
+ },
387
+ "get_deal": {
388
+ "description": "Get detailed information about a specific deal by ID.",
389
+ "function": get_deal,
390
+ "inputSchema": {
391
+ "type": "object",
392
+ "properties": {
393
+ "deal_id": {"type": "string", "description": "Deal ID (e.g., DEAL-001)"}
394
+ },
395
+ "required": ["deal_id"]
396
+ }
397
+ },
398
+ "get_pipeline_summary": {
399
+ "description": "Get sales pipeline summary including totals by stage and owner.",
400
+ "function": get_pipeline_summary,
401
+ "inputSchema": {
402
+ "type": "object",
403
+ "properties": {}
404
+ }
405
+ },
406
+ "get_documents": {
407
+ "description": "Get list of available CRM documents related to the company at hand.",
408
+ "function": get_documents,
409
+ "inputSchema": {
410
+ "type": "object",
411
+ "properties": {}
412
+ }
413
+ },
414
+ "read_document": {
415
+ "description": "Read the content of a specific document by name.",
416
+ "function": read_document,
417
+ "inputSchema": {
418
+ "type": "object",
419
+ "properties": {
420
+ "name": {"type": "string", "description": "Document name or partial name"}
421
+ },
422
+ "required": ["name"]
423
+ }
424
+ },
425
+ "search_documents": {
426
+ "description": "Search documents by content or title.",
427
+ "function": search_documents,
428
+ "inputSchema": {
429
+ "type": "object",
430
+ "properties": {
431
+ "query": {"type": "string", "description": "Search query"}
432
+ },
433
+ "required": ["query"]
434
+ }
435
+ },
436
+ "get_access": {
437
+ "description": "Check whether endpoint access is enabled for a given customer.",
438
+ "function": get_access,
439
+ "inputSchema": {
440
+ "type": "object",
441
+ "properties": {
442
+ "customer_name": {
443
+ "type": "string",
444
+ "description": "Customer company name"
445
+ }
446
+ },
447
+ "required": ["customer_name"]
448
+ }
449
+ },
450
+ "set_access": {
451
+ "description": "Enable endpoint access for a given customer (sets enabled=true in access.json).",
452
+ "function": set_access,
453
+ "inputSchema": {
454
+ "type": "object",
455
+ "properties": {
456
+ "customer_name": {
457
+ "type": "string",
458
+ "description": "Customer company name"
459
+ }
460
+ },
461
+ "required": ["customer_name"]
462
+ }
463
+ }
464
+ }
465
+
466
+
467
+ # ============================================
468
+ # API Endpoints
469
+ # ============================================
470
+
471
+ @app.get("/")
472
+ async def root():
473
+ """Health check endpoint."""
474
+ return {"status": "ok", "service": "CRM MCP Server", "version": "1.0.0"}
475
+
476
+
477
+ @app.get("/tools")
478
+ async def list_tools():
479
+ """List all available tools."""
480
+ tools_list = []
481
+ for name, config in TOOLS.items():
482
+ tools_list.append({
483
+ "name": name,
484
+ "description": config["description"],
485
+ "inputSchema": config["inputSchema"]
486
+ })
487
+ return {"tools": tools_list}
488
+
489
+
490
+ @app.post("/tools/call")
491
+ async def call_tool(request: ToolCallRequest):
492
+ """Execute a tool by name with arguments."""
493
+ tool_name = request.name
494
+ arguments = request.arguments
495
+
496
+ if tool_name not in TOOLS:
497
+ return ToolResponse(
498
+ success=False,
499
+ error=f"Unknown tool: {tool_name}"
500
+ )
501
+
502
+ try:
503
+ tool_func = TOOLS[tool_name]["function"]
504
+ result = tool_func(**arguments)
505
+ return ToolResponse(success=True, result=result)
506
+ except Exception as e:
507
+ logger.error(f"Tool execution error: {e}")
508
+ return ToolResponse(success=False, error=str(e))
509
+
510
+
511
+ # ============================================
512
+ # Main Entry Point
513
+ # ============================================
514
+
515
+ if __name__ == "__main__":
516
+ import uvicorn
517
+
518
+ # Ensure data directory exists
519
+ DATA_DIR.mkdir(parents=True, exist_ok=True)
520
+ (DATA_DIR / "documents").mkdir(exist_ok=True)
521
+
522
+ logger.info("Starting CRM MCP Server...")
523
+ logger.info(f"Data directory: {DATA_DIR}")
524
+
525
+ uvicorn.run(app, host="0.0.0.0", port=8000)
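The `/tools/call` dispatch above can be exercised locally without HTTP. The sketch below mirrors the server's `ToolResponse` contract against a stand-in registry (the single `get_access` stub is an assumption for illustration); over HTTP, the same payload would go to `POST /tools/call` as `{"name": ..., "arguments": ...}`:

```python
from typing import Any, Callable, Dict

# Stand-in tool mimicking the server's get_access behavior.
def get_access(customer_name: str) -> Dict[str, Any]:
    return {"customer_name": customer_name, "enabled": customer_name == "Walnut"}

TOOLS: Dict[str, Callable[..., Any]] = {"get_access": get_access}

def call_tool(name: str, arguments: Dict[str, Any]) -> Dict[str, Any]:
    """Mirror the server's success/result/error response shape."""
    if name not in TOOLS:
        return {"success": False, "result": None, "error": f"Unknown tool: {name}"}
    try:
        return {"success": True, "result": TOOLS[name](**arguments), "error": None}
    except Exception as e:  # e.g. schema-mismatched arguments
        return {"success": False, "result": None, "error": str(e)}

print(call_tool("get_access", {"customer_name": "Walnut"}))
# → {'success': True, 'result': {'customer_name': 'Walnut', 'enabled': True}, 'error': None}
```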
data/access.json ADDED
@@ -0,0 +1,12 @@
1
+ {
2
+ "access": [
3
+ {
4
+ "customer_name": "Walnut",
5
+ "enabled": true
6
+ },
7
+ {
8
+ "customer_name": "TechStart Inc",
9
+ "enabled": false
10
+ }
11
+ ]
12
+ }
data/customers.json ADDED
@@ -0,0 +1,117 @@
1
+ {
2
+ "customers": [
3
+ {
4
+ "id": "Walnut",
5
+ "name": "Walnut",
6
+ "contact_name": "Maria Brown",
7
+ "annual_revenue": "$50M",
8
+ "notes": "Key account. Had previous issues with integration endpoints.",
9
+ "tags": ["enterprise", "manufacturing", "high-value"]
10
+ },
11
+ {
12
+ "id": "TechStart Inc",
13
+ "name": "TechStart Inc",
14
+ "contact_name": "Sarah Johnson",
15
+ "contact_email": "sarah@techstart.io",
16
+ "contact_phone": "+1-555-0102",
17
+ "industry": "Technology",
18
+ "company_size": "50-100",
19
+ "annual_revenue": "$5M-$10M",
20
+ "status": "active",
21
+ "created_at": "2023-06-22",
22
+ "last_contact": "2025-05-18",
23
+ "notes": "Fast-growing startup. Looking for scalable solutions.",
24
+ "tags": ["startup", "tech", "growth"]
25
+ },
26
+ {
27
+ "id": "Global Finance Ltd",
28
+ "name": "Global Finance Ltd",
29
+ "contact_name": "Michael Chen",
30
+ "contact_email": "m.chen@globalfinance.com",
31
+ "contact_phone": "+1-555-0103",
32
+ "industry": "Financial Services",
33
+ "company_size": "1000+",
34
+ "annual_revenue": "$500M+",
35
+ "status": "active",
36
+ "created_at": "2022-11-08",
37
+ "last_contact": "2025-05-22",
38
+ "notes": "Enterprise client. Strict compliance requirements.",
39
+ "tags": ["enterprise", "finance", "compliance", "high-value"]
40
+ },
41
+ {
42
+ "id": "CUST-004",
43
+ "name": "Green Energy Solutions",
44
+ "contact_name": "Emma Davis",
45
+ "contact_email": "emma.davis@greenenergy.com",
46
+ "contact_phone": "+1-555-0104",
47
+ "industry": "Energy",
48
+ "company_size": "100-500",
49
+ "annual_revenue": "$20M-$50M",
50
+ "status": "active",
51
+ "created_at": "2024-02-14",
52
+ "last_contact": "2025-05-15",
53
+ "notes": "Sustainability-focused. Interested in IoT monitoring.",
54
+ "tags": ["energy", "sustainability", "iot"]
55
+ },
56
+ {
57
+ "id": "CUST-005",
58
+ "name": "HealthCare Plus",
59
+ "contact_name": "Dr. Robert Wilson",
60
+ "contact_email": "rwilson@healthcareplus.org",
61
+ "contact_phone": "+1-555-0105",
62
+ "industry": "Healthcare",
63
+ "company_size": "500-1000",
64
+ "annual_revenue": "$100M-$500M",
65
+ "status": "active",
66
+ "created_at": "2023-09-30",
67
+ "last_contact": "2025-05-21",
68
+ "notes": "Hospital network. HIPAA compliance is critical.",
69
+ "tags": ["healthcare", "compliance", "enterprise"]
70
+ },
71
+ {
72
+ "id": "CUST-006",
73
+ "name": "RetailMax",
74
+ "contact_name": "Lisa Anderson",
75
+ "contact_email": "l.anderson@retailmax.com",
76
+ "contact_phone": "+1-555-0106",
77
+ "industry": "Retail",
78
+ "company_size": "1000+",
79
+ "annual_revenue": "$200M-$500M",
80
+ "status": "inactive",
81
+ "created_at": "2022-05-18",
82
+ "last_contact": "2024-12-10",
83
+ "notes": "Churned due to budget cuts. Potential to re-engage in Q3.",
84
+ "tags": ["retail", "enterprise", "churned"]
85
+ },
86
+ {
87
+ "id": "CUST-007",
88
+ "name": "EduLearn Academy",
89
+ "contact_name": "Prof. James Taylor",
90
+ "contact_email": "jtaylor@edulearn.edu",
91
+ "contact_phone": "+1-555-0107",
92
+ "industry": "Education",
93
+ "company_size": "100-500",
94
+ "annual_revenue": "$10M-$20M",
95
+ "status": "prospect",
96
+ "created_at": "2025-03-01",
97
+ "last_contact": "2025-05-19",
98
+ "notes": "University looking for LMS integration. Demo scheduled.",
99
+ "tags": ["education", "prospect", "demo-scheduled"]
100
+ },
101
+ {
102
+ "id": "CUST-008",
103
+ "name": "LogiTrans Shipping",
104
+ "contact_name": "Carlos Rodriguez",
105
+ "contact_email": "carlos@logitrans.com",
106
+ "contact_phone": "+1-555-0108",
107
+ "industry": "Logistics",
108
+ "company_size": "500-1000",
109
+ "annual_revenue": "$50M-$100M",
110
+ "status": "active",
111
+ "created_at": "2023-04-12",
112
+ "last_contact": "2025-05-17",
113
+ "notes": "Fleet management client. Expanding to new regions.",
114
+ "tags": ["logistics", "fleet", "expansion"]
115
+ }
116
+ ]
117
+ }
data/deals.json ADDED
@@ -0,0 +1,146 @@
+ {
+   "deals": [
+     {
+       "id": "DEAL-001",
+       "title": "Wolf Solutions",
+       "customer_id": "Wolf Solutions",
+       "customer_name": "Wolf Solutions",
+       "value": 250000,
+       "currency": "USD",
+       "stage": "negotiation",
+       "probability": 75,
+       "expected_close": "2025-06-30",
+       "created_at": "2025-02-15",
+       "owner": "Maria Brown",
+       "description": "Full cloud infrastructure migration including data center consolidation.",
+       "next_action": "Send revised proposal with volume discount",
+       "competitors": ["CloudCorp", "SkyNet Solutions"]
+     },
+     {
+       "id": "DEAL-002",
+       "title": "TechStart Platform License",
+       "customer_id": "CUST-002",
+       "customer_name": "TechStart Inc",
+       "value": 45000,
+       "currency": "USD",
+       "stage": "proposal",
+       "probability": 60,
+       "expected_close": "2025-07-15",
+       "created_at": "2025-04-01",
+       "owner": "Bob Martinez",
+       "description": "Annual platform license with premium support.",
+       "next_action": "Schedule technical deep-dive with their CTO",
+       "competitors": ["OpenPlatform"]
+     },
+     {
+       "id": "DEAL-003",
+       "title": "Global Finance Security Suite",
+       "customer_id": "CUST-003",
+       "customer_name": "Global Finance Ltd",
+       "value": 750000,
+       "currency": "USD",
+       "stage": "closed_won",
+       "probability": 100,
+       "expected_close": "2025-05-01",
+       "closed_at": "2025-04-28",
+       "created_at": "2024-11-10",
+       "owner": "Alice Brown",
+       "description": "Enterprise security suite with compliance modules.",
+       "next_action": "Kickoff implementation meeting",
+       "competitors": []
+     },
+     {
+       "id": "DEAL-004",
+       "title": "Green Energy IoT Pilot",
+       "customer_id": "CUST-004",
+       "customer_name": "Green Energy Solutions",
+       "value": 85000,
+       "currency": "USD",
+       "stage": "qualification",
+       "probability": 40,
+       "expected_close": "2025-08-30",
+       "created_at": "2025-05-01",
+       "owner": "Carol White",
+       "description": "IoT monitoring pilot for 3 solar farms.",
+       "next_action": "Conduct site assessment",
+       "competitors": ["IoTech", "SensorFlow"]
+     },
+     {
+       "id": "DEAL-005",
+       "title": "HealthCare Plus EHR Integration",
+       "customer_id": "CUST-005",
+       "customer_name": "HealthCare Plus",
+       "value": 320000,
+       "currency": "USD",
+       "stage": "negotiation",
+       "probability": 80,
+       "expected_close": "2025-06-15",
+       "created_at": "2025-01-20",
+       "owner": "Bob Martinez",
+       "description": "EHR system integration with existing hospital network.",
+       "next_action": "Legal review of BAA agreement",
+       "competitors": ["MedTech Systems"]
+     },
+     {
+       "id": "DEAL-006",
+       "title": "EduLearn LMS Package",
+       "customer_id": "CUST-007",
+       "customer_name": "EduLearn Academy",
+       "value": 120000,
+       "currency": "USD",
+       "stage": "demo",
+       "probability": 50,
+       "expected_close": "2025-09-01",
+       "created_at": "2025-03-15",
+       "owner": "Carol White",
+       "description": "Learning management system for 5000 students.",
+       "next_action": "Demo scheduled for May 25th",
+       "competitors": ["Canvas", "Moodle"]
+     },
+     {
+       "id": "DEAL-007",
+       "title": "LogiTrans Fleet Expansion",
+       "customer_id": "CUST-008",
+       "customer_name": "LogiTrans Shipping",
+       "value": 180000,
+       "currency": "USD",
+       "stage": "proposal",
+       "probability": 65,
+       "expected_close": "2025-07-30",
+       "created_at": "2025-04-10",
+       "owner": "Alice Brown",
+       "description": "Expand fleet management to 500 additional vehicles.",
+       "next_action": "Finalize pricing for Latin America region",
+       "competitors": ["FleetTrack Pro"]
+     },
+     {
+       "id": "DEAL-008",
+       "title": "Acme Support Renewal",
+       "customer_id": "CUST-001",
+       "customer_name": "Acme Corporation",
+       "value": 75000,
+       "currency": "USD",
+       "stage": "closed_won",
+       "probability": 100,
+       "expected_close": "2025-03-31",
+       "closed_at": "2025-03-28",
+       "created_at": "2025-02-01",
+       "owner": "Alice Brown",
+       "description": "Annual premium support renewal.",
+       "next_action": "None - deal closed",
+       "competitors": []
+     }
+   ],
+   "pipeline_summary": {
+     "total_pipeline_value": 1825000,
+     "weighted_pipeline_value": 1147500,
+     "deals_by_stage": {
+       "qualification": 1,
+       "demo": 1,
+       "proposal": 2,
+       "negotiation": 2,
+       "closed_won": 2
+     },
+     "average_deal_size": 228125
+   }
+ }
data/documents/walnut integration roadmap.md ADDED
@@ -0,0 +1,17 @@
+ # Integration roadmap – Endpoint services
+ 
+ Phase 1 – Preparation (Week 1) - Done
+ - Confirm scope and target environments.
+ - Enable API / endpoint access for staging tenant.
+ - Share API keys & documentation with customer team.
+ 
+ Phase 2 – Initial integration (Weeks 2–3) - Done
+ - Implement authentication flow against Atlas endpoint services.
+ - Set up test events (customer updates, subscription changes).
+ - Validate logging and error handling.
+ 
+ Phase 3 – Pilot rollout (Weeks 4–5) - Ongoing
+ - The endpoint is enabled for the staging environments.
+ 
+ Phase 4 – Full rollout (Week 6+) - Scheduled: ETA December 1st
+ - Enable the endpoints for the production environments.
data/documents/walnut meeting minutes.md ADDED
@@ -0,0 +1,15 @@
+ # Meeting minutes – Subscription renewal update (10 min check-in)
+ 
+ - Date: 2025-11-15
+ - Attendees: Andrei Zamfir (CSM) + Georgina Espinosa (Technical Account Manager)
+ - Topic: Subscription renewal status
+ 
+ Key points:
+ - Reviewed current 12-month subscription, renewal due 2026-01-31.
+ - Customer confirmed they intend to renew at current ARR.
+ - Discussed progress on new feature integration (endpoint services).
+ - Blocker: pending internal review for access request.
+ 
+ Next steps:
+ - Customer to complete access review by next status update meeting.
+ - CSM to send recap email and share integration roadmap document.
requirements.txt ADDED
@@ -0,0 +1,25 @@
+ # Core dependencies
+ gradio>=6.0.0
+ huggingface_hub>=0.20.0
+ openai>=1.0.0
+ python-dotenv>=1.0.0
+ 
+ # FastAPI for MCP server
+ fastapi>=0.100.0
+ uvicorn>=0.24.0
+ pydantic>=2.0.0
+ 
+ # Audio processing
+ numpy>=1.24.0
+ 
+ # Screen capture
+ mss>=9.0.0
+ Pillow>=10.0.0
+ 
+ # Voice Activity Detection (optional)
+ # Install these for automatic speech detection:
+ # pyaudio>=0.2.13
+ # webrtcvad>=2.0.10
+ 
+ # HTTP client
+ requests>=2.31.0
services/audio_service.py ADDED
@@ -0,0 +1,121 @@
+ """
+ Audio Service - Speech-to-Text and Text-to-Speech.
+ """
+ 
+ import io
+ import logging
+ import tempfile
+ import asyncio
+ from typing import Optional, Union
+ 
+ from huggingface_hub import InferenceClient
+ 
+ from config.settings import Settings
+ 
+ logger = logging.getLogger(__name__)
+ 
+ 
+ class AudioService:
+     """Audio service for STT and TTS."""
+ 
+     def __init__(
+         self,
+         api_key: str,
+         stt_provider: str = "fal-ai",
+         stt_model: str = "openai/whisper-large-v3",
+         tts_model: str = "canopylabs/orpheus-3b-0.1-ft",
+     ):
+         """
+         Initialize audio service.
+ 
+         Args:
+             api_key: Hugging Face API token
+             stt_provider: Provider for speech-to-text
+             stt_model: ASR model ID
+             tts_model: TTS model ID
+         """
+         self.api_key = api_key
+         self.stt_model = stt_model
+         self.tts_model = tts_model
+ 
+         # STT client
+         logger.debug(f"Initializing ASR client with provider={stt_provider}")
+         self.asr_client = InferenceClient(
+             provider=stt_provider,
+             api_key=self.api_key,
+         )
+ 
+         # TTS client
+         logger.debug("Initializing TTS client")
+         self.tts_client = InferenceClient(token=self.api_key)
+ 
+         logger.info(f"AudioService configured: ASR={self.stt_model}, TTS={self.tts_model}")
+ 
+     async def speech_to_text(self, audio_input: Union[str, bytes, io.BytesIO]) -> str:
+         """
+         Convert speech to text.
+ 
+         Args:
+             audio_input: File path, bytes, or BytesIO of audio
+ 
+         Returns:
+             Transcribed text
+         """
+         # Prepare input path
+         if isinstance(audio_input, str):
+             input_path = audio_input
+             logger.debug(f"Using existing file for ASR: {input_path}")
+         else:
+             data = audio_input.getvalue() if isinstance(audio_input, io.BytesIO) else audio_input
+             tmp = tempfile.NamedTemporaryFile(delete=False, suffix=".wav")
+             tmp.write(data)
+             tmp.close()
+             input_path = tmp.name
+             logger.debug(f"Wrote audio to temp file for ASR: {input_path}")
+ 
+         try:
+             logger.info(f"Calling ASR model={self.stt_model}")
+             result = await asyncio.get_event_loop().run_in_executor(
+                 None,
+                 lambda: self.asr_client.automatic_speech_recognition(
+                     input_path,
+                     model=self.stt_model,
+                 )
+             )
+ 
+             # Guard against a missing/None "text" field before taking len()
+             transcript = (result.get("text") if isinstance(result, dict) else getattr(result, "text", "")) or ""
+             logger.info(f"ASR success, transcript length={len(transcript)}")
+             return transcript
+ 
+         except Exception as e:
+             logger.error(f"ASR error: {e}", exc_info=True)
+             return ""
+ 
+     async def text_to_speech(self, text: str) -> Optional[bytes]:
+         """
+         Convert text to speech.
+ 
+         Args:
+             text: Text to synthesize
+ 
+         Returns:
+             Audio bytes or None
+         """
+         if not text.strip():
+             logger.debug("Empty text input for TTS")
+             return None
+ 
+         def _call_tts():
+             try:
+                 return self.tts_client.text_to_speech(text, model=self.tts_model)
+             except StopIteration as e:
+                 raise RuntimeError(f"StopIteration in TTS call: {e}")
+ 
+         try:
+             logger.info(f"Calling TTS model={self.tts_model}, text length={len(text)}")
+             audio = await asyncio.get_event_loop().run_in_executor(None, _call_tts)
+             logger.info(f"TTS success, received {len(audio)} bytes")
+             return audio
+         except Exception as e:
+             logger.error(f"TTS error: {e}", exc_info=True)
+             return None
services/llm_service.py ADDED
@@ -0,0 +1,167 @@
+ """
+ LLM Service - Chat completions via HuggingFace.
+ """
+ 
+ import logging
+ from typing import Dict, List, Optional, Any
+ from dataclasses import dataclass
+ 
+ from huggingface_hub import InferenceClient
+ 
+ from config.settings import Settings
+ 
+ logger = logging.getLogger(__name__)
+ 
+ 
+ @dataclass
+ class LLMConfig:
+     """LLM configuration."""
+     api_key: str
+     model_name: str
+     temperature: float = 0.01
+     max_tokens: int = 512
+ 
+ 
+ class LLMService:
+     """
+     LLM service using HuggingFace InferenceClient.
+     """
+ 
+     def __init__(
+         self,
+         api_key: Optional[str] = None,
+         model_name: Optional[str] = None,
+     ):
+         """
+         Initialize LLM service.
+ 
+         Args:
+             api_key: HuggingFace API key
+             model_name: Model name/ID
+         """
+         settings = Settings()
+ 
+         key = api_key or settings.hf_token
+         name = model_name or settings.effective_model_name
+ 
+         self.config = LLMConfig(
+             api_key=key,
+             model_name=name,
+             temperature=settings.hf_temperature,
+             max_tokens=settings.hf_max_new_tokens,
+         )
+ 
+         self.client = InferenceClient(token=self.config.api_key)
+ 
+         logger.info(f"LLMService initialized with model: {self.config.model_name}")
+ 
+     async def get_chat_completion(
+         self,
+         messages: List[Dict[str, str]],
+         temperature: Optional[float] = None,
+         max_tokens: Optional[int] = None,
+     ) -> str:
+         """
+         Get chat completion from the model.
+ 
+         Args:
+             messages: List of message dicts with 'role' and 'content'
+             temperature: Override temperature
+             max_tokens: Override max tokens
+ 
+         Returns:
+             Assistant response text
+         """
+         logger.debug(f"Chat completion request with model: {self.config.model_name}")
+ 
+         try:
+             # Compare against None so 0 / 0.0 remain valid overrides
+             response = self.client.chat_completion(
+                 messages=messages,
+                 model=self.config.model_name,
+                 max_tokens=max_tokens if max_tokens is not None else self.config.max_tokens,
+                 temperature=temperature if temperature is not None else self.config.temperature
+             )
+ 
+             content = response.choices[0].message.content
+             logger.debug(f"Chat completion response: {content[:200]}...")
+ 
+             return content
+ 
+         except Exception as e:
+             logger.error(f"Chat completion error: {str(e)}")
+             raise Exception(f"LLM completion error: {str(e)}")
+ 
+     async def get_streaming_completion(
+         self,
+         messages: List[Dict[str, str]],
+         temperature: Optional[float] = None,
+         max_tokens: Optional[int] = None,
+     ):
+         """
+         Get streaming chat completion.
+ 
+         Yields:
+             Text chunks as they're generated
+         """
+         logger.debug(f"Streaming completion request with model: {self.config.model_name}")
+ 
+         try:
+             stream = self.client.chat_completion(
+                 messages=messages,
+                 model=self.config.model_name,
+                 max_tokens=max_tokens if max_tokens is not None else self.config.max_tokens,
+                 temperature=temperature if temperature is not None else self.config.temperature,
+                 stream=True
+             )
+ 
+             for chunk in stream:
+                 if chunk.choices and chunk.choices[0].delta.content:
+                     yield chunk.choices[0].delta.content
+ 
+         except Exception as e:
+             logger.error(f"Streaming completion error: {str(e)}")
+             raise Exception(f"LLM streaming error: {str(e)}")
+ 
+     def build_messages_with_tools(
+         self,
+         system_prompt: str,
+         user_input: str,
+         tools_description: str = "",
+         conversation_history: Optional[List[Dict[str, str]]] = None,
+         tool_results: Optional[str] = None,
+     ) -> List[Dict[str, str]]:
+         """
+         Build messages array with tools and context.
+ 
+         Args:
+             system_prompt: System instruction
+             user_input: User's message
+             tools_description: Available tools description
+             conversation_history: Previous messages
+             tool_results: Results from tool execution
+ 
+         Returns:
+             Messages array for chat completion
+         """
+         messages = [{"role": "system", "content": system_prompt}]
+ 
+         if tools_description:
+             messages.append({
+                 "role": "system",
+                 "content": f"Available tools:\n{tools_description}"
+             })
+ 
+         # Add conversation history
+         if conversation_history:
+             for msg in conversation_history[-10:]:  # Last 10 messages
+                 if msg.get("role") in ["user", "assistant"]:
+                     messages.append(msg)
+ 
+         # Add current user input
+         messages.append({"role": "user", "content": user_input})
+ 
+         # Add tool results if present
+         if tool_results:
+             messages.append({"role": "assistant", "content": tool_results})
+ 
+         return messages
services/mcp_client.py ADDED
@@ -0,0 +1,24 @@
+ # services/mcp_client.py
+ import requests
+ from typing import Any, Dict, List
+ from config.settings import Settings
+ 
+ class MCPClient:
+     def __init__(self, base_url: str | None = None):
+         settings = Settings()
+         self.base_url = (base_url or settings.mcp_server_url).rstrip("/")
+ 
+     def list_tools(self) -> List[Dict[str, Any]]:
+         resp = requests.get(f"{self.base_url}/tools", timeout=5)
+         resp.raise_for_status()
+         data = resp.json()
+         return data.get("tools", [])
+ 
+     def call_tool(self, name: str, arguments: Dict[str, Any] | None = None) -> Any:
+         payload = {"name": name, "arguments": arguments or {}}
+         resp = requests.post(f"{self.base_url}/tools/call", json=payload, timeout=30)
+         resp.raise_for_status()
+         data = resp.json()
+         if not data.get("success", False):
+             raise RuntimeError(data.get("error", "Unknown tool error"))
+         return data.get("result")
services/screen_service.py ADDED
@@ -0,0 +1,115 @@
+ """
+ Screen Service - Simple screenshot capture.
+ Just captures screen and returns base64 image.
+ """
+ 
+ import base64
+ import io
+ import logging
+ import time
+ from typing import Optional, Dict, Any
+ from dataclasses import dataclass
+ 
+ try:
+     import mss
+     MSS_AVAILABLE = True
+ except ImportError:
+     MSS_AVAILABLE = False
+ 
+ from PIL import Image
+ 
+ logger = logging.getLogger(__name__)
+ 
+ 
+ @dataclass
+ class ScreenCapture:
+     """Represents a captured screen frame."""
+     timestamp: float
+     image_b64: str
+     width: int
+     height: int
+ 
+ 
+ class ScreenService:
+     """Simple screen capture service."""
+ 
+     def __init__(
+         self,
+         monitor: int = 1,
+         max_width: int = 1920,
+         max_height: int = 1080,
+         compression_quality: int = 85,
+     ):
+         self.monitor = monitor
+         self.max_width = max_width
+         self.max_height = max_height
+         self.compression_quality = compression_quality
+ 
+         if not MSS_AVAILABLE:
+             logger.warning("mss not available. Screen capture disabled.")
+ 
+     def is_available(self) -> bool:
+         """Check if screen capture is available."""
+         return MSS_AVAILABLE
+ 
+     def _process_image(self, img: Image.Image) -> Image.Image:
+         """Process and resize image."""
+         if img.mode != "RGB":
+             img = img.convert("RGB")
+         w, h = img.size
+         ar = w / h
+         if w > self.max_width or h > self.max_height:
+             if ar > 1:
+                 new_w = min(w, self.max_width)
+                 new_h = int(new_w / ar)
+             else:
+                 new_h = min(h, self.max_height)
+                 new_w = int(new_h * ar)
+             img = img.resize((new_w, new_h), Image.Resampling.LANCZOS)
+         return img
+ 
+     def _image_to_base64(self, img: Image.Image) -> str:
+         """Convert image to base64 string."""
+         buf = io.BytesIO()
+         img.save(buf, format="JPEG", quality=self.compression_quality, optimize=True)
+         return base64.b64encode(buf.getvalue()).decode("utf-8")
+ 
+     def capture(self) -> Optional[ScreenCapture]:
+         """
+         Capture a screenshot and return base64.
+ 
+         Returns:
+             ScreenCapture object or None if failed
+         """
+         if not MSS_AVAILABLE:
+             logger.error("Screen capture not available")
+             return None
+ 
+         try:
+             with mss.mss() as sct:
+                 mon = sct.monitors[self.monitor]
+                 frame = sct.grab(mon)
+                 pil = Image.frombytes("RGB", frame.size, frame.bgra, "raw", "BGRX")
+                 pil = self._process_image(pil)
+                 b64 = self._image_to_base64(pil)
+ 
+                 return ScreenCapture(
+                     timestamp=time.time(),
+                     image_b64=b64,
+                     width=pil.width,
+                     height=pil.height
+                 )
+         except Exception as e:
+             logger.error(f"Screen capture error: {e}")
+             return None
+ 
+ 
+ # Singleton
+ _instance: Optional[ScreenService] = None
+ 
+ def get_screen_service() -> ScreenService:
+     """Get singleton screen service."""
+     global _instance
+     if _instance is None:
+         _instance = ScreenService()
+     return _instance