# MCP Architecture Documentation

## Overview

This document explains the Model Context Protocol (MCP) architecture used in the Competitive Analysis Agent system.

## What is the Model Context Protocol?

MCP is a standardized protocol designed to enable seamless integration of:

- **AI Models** (Claude, GPT-4, etc.) with
- **External Tools & Services** (web search, databases, APIs, etc.)
- **Custom Business Logic** (analysis, validation, report generation)

### Why MCP?

1. **Modularity**: Tools are isolated and reusable
2. **Scalability**: Add tools without modifying core agent code
3. **Standardization**: Common protocol across different AI systems
4. **Separation of Concerns**: Clear boundaries between reasoning and action
5. **Production Ready**: Built for enterprise-grade AI applications

---

## System Architecture

### Three-Tier Architecture

```
┌─────────────────────────────────────────────────────┐
│ PRESENTATION LAYER - Gradio UI                      │
│ • User input (company name, API key)                │
│ • Report display (formatted Markdown)               │
│ • Error handling and validation                     │
└──────────────────┬──────────────────────────────────┘
                   │ HTTP/REST
                   ▼
┌─────────────────────────────────────────────────────┐
│ APPLICATION LAYER - MCP Client                      │
│ • OpenAI Agent (GPT-4)                              │
│ • Strategic reasoning and planning                  │
│ • Tool orchestration and sequencing                 │
│ • Result synthesis                                  │
└──────────────────┬──────────────────────────────────┘
                   │ MCP Protocol
                   ▼
┌─────────────────────────────────────────────────────┐
│ SERVICE LAYER - MCP Server (FastMCP)                │
│ Tools:                                              │
│ • validate_company()                                │
│ • identify_sector()                                 │
│ • identify_competitors()                            │
│ • browse_page()                                     │
│ • generate_report()                                 │
│                                                     │
│ External Services:                                  │
│ • DuckDuckGo API                                    │
│ • HTTP/BeautifulSoup scraping                       │
│ • OpenAI API (GPT-4)                                │
└─────────────────────────────────────────────────────┘
```

---

## Component Details
### 1. Presentation Layer (`app.py`)

**Gradio Interface**

- User-friendly web UI
- Input validation
- Output formatting
- Error messaging

```
# Example flow
User Input: "Tesla"
    ↓ Validate inputs
    ↓ Call MCP Client.analyze_company()
    ↓ Display Markdown report
```

### 2. Application Layer (`mcp_client.py`)

**MCP Client with OpenAI Agent**

The client implements:

- **System Prompt**: Defines agent role and goals
- **Message History**: Maintains conversation context
- **Tool Calling**: Translates agent decisions to MCP calls
- **Response Synthesis**: Compiles results into reports

```python
system_prompt = """
You are a competitive analysis expert. Use available tools to:
1. Validate the company
2. Identify sector
3. Find competitors
4. Gather strategic data
5. Generate insights
"""

# Agent workflow:
messages = [
    {"role": "system", "content": system_prompt},
    {"role": "user", "content": "Analyze Sony"}
]

response = client.chat.completions.create(
    model="gpt-4",
    messages=messages
)
# OpenAI returns tool calls, which we execute
```

**Key Features**:

- Graceful fallback to simple analysis when MCP is unavailable
- Handles API errors and timeouts
- Synthesizes multiple tool results
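The step "OpenAI returns tool calls, which we execute" can be sketched as a small dispatch loop. This is an illustrative assumption, not the project's actual code: `TOOL_REGISTRY`, `execute_tool_call`, and the stub results are hypothetical stand-ins for the real MCP tool calls.

```python
import json

# Hypothetical local registry mapping tool names to callables.
# In the real client these would forward over MCP to the server's tools.
TOOL_REGISTRY = {
    "validate_company": lambda company_name: f"VALID: {company_name}",
    "identify_sector": lambda company_name: "Technology",
}

def execute_tool_call(name: str, arguments_json: str) -> str:
    """Dispatch one tool call returned by the model to a registered tool."""
    tool = TOOL_REGISTRY.get(name)
    if tool is None:
        return f"Error: unknown tool '{name}'"
    try:
        args = json.loads(arguments_json)  # OpenAI sends arguments as a JSON string
        return tool(**args)
    except Exception as e:                 # mirror the server's error convention
        return f"Error: {str(e)}"

# With the OpenAI Python SDK, each entry of
# response.choices[0].message.tool_calls carries .function.name and
# .function.arguments, so the execution loop is roughly:
#
#   for call in response.choices[0].message.tool_calls:
#       result = execute_tool_call(call.function.name, call.function.arguments)
#       messages.append({"role": "tool", "tool_call_id": call.id,
#                        "content": result})
```

Appending each result as a `"tool"` message lets the next completion request synthesize the gathered evidence into the final report.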
### 3. Service Layer (`mcp_server.py`)

**FastMCP Server with Tools**

#### Tools Overview

| Tool | Purpose | Returns |
|------|---------|---------|
| `validate_company(name)` | Check if company exists | Bool + evidence |
| `identify_sector(name)` | Find industry classification | Sector name |
| `identify_competitors(sector, company)` | Discover top 3 rivals | "Comp1, Comp2, Comp3" |
| `browse_page(url, instructions)` | Extract webpage content | Relevant text |
| `generate_report(company, context)` | Create analysis report | Markdown report |

#### Tool Implementation Pattern

```python
@mcp.tool()
def validate_company(company_name: str) -> str:
    """
    Docstring: Describes tool purpose and parameters
    """
    # Implementation
    try:
        results = web_search_tool(f"{company_name} company")
        evidence_count = analyze_search_results(results)
        return validation_result
    except Exception as e:
        return f"Error: {str(e)}"
```

#### Web Search Integration

```python
from duckduckgo_search import DDGS

def web_search_tool(query: str) -> str:
    """Unified search interface for all tools"""
    with DDGS() as ddgs:
        results = list(ddgs.text(query, max_results=5))
    return format_results(results)
```

---

## Message Flow

### Complete Analysis Request

```
1. USER INTERFACE (Gradio)
   │
   ├─ Company: "Apple"
   └─ OpenAI Key: "sk-..."

2. GRADIO → MCP CLIENT
   │
   ├─ analyze_competitor_landscape("Apple", api_key)
   └─ Creates CompetitiveAnalysisAgent instance

3. MCP CLIENT → OPENAI
   │
   ├─ System: "You are a competitive analyst..."
   ├─ User: "Analyze Apple's competitors"
   │
   ├─ OpenAI responds with:
   │  └─ "Call validate_company('Apple')"

4. MCP CLIENT → MCP SERVER
   │
   ├─ Calls: validate_company("Apple")
   ├─ Calls: identify_sector("Apple")
   ├─ Calls: identify_competitors("Technology", "Apple")
   └─ Receives results for each tool
5. MCP SERVER
   │
   ├─ validate_company()
   │  └─ Web search → DuckDuckGo API → Parse results
   │
   ├─ identify_sector()
   │  └─ Multi-stage search → Keyword analysis → Return sector
   │
   ├─ identify_competitors()
   │  └─ Industry search → Competitor extraction → Ranking
   │
   └─ generate_report()
      └─ Format results → Markdown template → Return report

6. MCP CLIENT SYNTHESIS
   │
   ├─ Compile all tool results
   ├─ Add OpenAI insights
   └─ Return complete report

7. GRADIO DISPLAY
   │
   └─ Render Markdown report to user
```

---

## Data Flow Diagram

```
USER INPUT
│
├─ company_name: "Company X"
└─ api_key: "sk-xxx"
      │
      ▼
┌──────────────────────┐
│ Input Validation     │
│ (Length, Format)     │
└──────────┬───────────┘
           │
           ▼
┌──────────────────────────────┐
│ OpenAI Agent Planning        │
│ (System + User Messages)     │
└──────────┬───────────────────┘
           │
           ├─────────────────────────────┬─────────────────────────┬──────────────┐
           │                             │                         │              │
           ▼                             ▼                         ▼              ▼
  ┌────────────────┐          ┌──────────────────┐        ┌──────────────┐   ┌─────────┐
  │ validate_      │          │ identify_        │        │ identify_    │   │ browse_ │
  │ company()      │          │ sector()         │        │ competitors()│   │ page()  │
  └────────┬───────┘          └────────┬─────────┘        └──────┬───────┘   └────┬────┘
           │                           │                         │                │
           ▼                           ▼                         ▼                ▼
  ┌──────────────┐            ┌─────────────┐            ┌────────────┐     ┌──────────┐
  │ Web Search   │            │ Web Search  │            │ Web Search │     │ HTTP Get │
  │ DuckDuckGo   │            │ Multi-stage │            │ Industry   │     │ Parse    │
  │ + Analysis   │            │             │            │ Leaders    │     │ HTML     │
  └──────┬───────┘            └──────┬──────┘            └─────┬──────┘     └────┬─────┘
         │                           │                         │                 │
         ▼                           ▼                         ▼                 ▼
    VALIDATION      →      SECTOR ID      →      COMPETITORS      →      ADDITIONAL DATA
         │                           │                         │                 │
         └───────────────────────────┴─────────────────────────┴─────────────────┘
                                     │
                                     ▼
                          ┌─────────────────────┐
                          │ generate_report()   │
                          │ (Compile results)   │
                          └────────┬────────────┘
                                   │
                                   ▼
                          ┌─────────────────────┐
                          │ OpenAI Final        │
                          │ Synthesis           │
                          └────────┬────────────┘
                                   │
                                   ▼
                             FINAL REPORT
                           (Markdown format)
```

---

## Tool Implementation Details

### Tool 1: `validate_company()`

```
# Multi-stage validation
search_results = web_search_tool("Tesla company business official site")

# Evidence signals:
✓ Official website found (.com/.io)
✓ "Official site" or "official website" mention
✓ Company + sector description
✓ Business terminology present
✓ Wikipedia/news mentions

# Result: Evidence count >= 2 → Valid company
```
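The evidence-counting rule for `validate_company()` can be sketched as a small pure function. This is a hedged illustration, not the project's actual implementation: the signal phrases, the helper names `count_evidence` and `validate_from_results`, and the exact wording of the returned strings are assumptions.

```python
# Hypothetical signal list; the real analyze_search_results() may use
# different patterns and weights.
EVIDENCE_SIGNALS = [
    "official site",
    "official website",
    ".com",
    "wikipedia",
    "company",
]

def count_evidence(search_text: str) -> int:
    """Count how many known evidence signals appear in raw search-result text."""
    text = search_text.lower()
    return sum(1 for signal in EVIDENCE_SIGNALS if signal in text)

def validate_from_results(company_name: str, search_text: str) -> str:
    """Apply the 'evidence count >= 2 → valid company' rule described above."""
    if count_evidence(search_text) >= 2:
        return f"VALID: {company_name} appears to be a real company"
    return f"NOT validated: insufficient evidence for {company_name}"
```

Keeping the rule in a pure function like this makes the threshold easy to unit-test without any network access.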
### Tool 2: `identify_sector()`

```
# Three search strategies:
1. "What does Tesla do?"  → Extract sector keywords
2. "Tesla industry type"  → Direct classification
3. "Tesla sector news"    → Financial/news sources

# Sector patterns:
{
    "Technology": ["software", "hardware", "cloud", "ai", ...],
    "Finance": ["banking", "fintech", "insurance", ...],
    "Manufacturing": ["automotive", "industrial", ...],
    ...
}

# Weighted voting to determine primary sector
```

### Tool 3: `identify_competitors()`

```
# Search strategy:
1. "Top technology companies"  → Market leaders
2. "Tesla competitors"         → Direct rivals
3. "EV industry leaders"       → Sector players

# Extraction methods:
- Pattern matching for company names
- List parsing (comma-separated, bulleted)
- Frequency analysis and ranking

# Returns: Top 3 ranked competitors
```

### Tool 4: `browse_page()`

```
# Content extraction workflow:
requests.get(url)
→ BeautifulSoup parsing
→ Remove scripts/styles/headers/footers
→ Extract main content divs/articles/paragraphs
→ Keyword matching against instructions
→ Return top N relevant sentences

# Safety: Timeout=10s, max_content=5000 chars
```

### Tool 5: `generate_report()`

```python
# Template-based report generation
report = f"""
# Competitive Analysis Report: {company_name}

## Executive Summary
[Synthesized findings]

## Competitor Comparison
| Competitor | Strategy | Pricing | Products | Market |
|------------|----------|---------|----------|--------|
| [extracted competitors] | - | - | - | - |

## Strategic Insights
[Recommendations]
"""
```

---

## Error Handling Strategy

### Layered Error Handling

```
Layer 1: Input Validation (Gradio)
└─ Check company name length
└─ Validate API key format
└─ Return user-friendly error

Layer 2: Tool Execution (MCP Server)
└─ Try/except on each tool
└─ Timeout protection (10s requests)
└─ Graceful degradation
└─ Log detailed errors

Layer 3: Agent Logic (MCP Client)
└─ API timeout handling
└─ Rate limit handling
└─ Fallback to simple analysis
└─ Return partial results

Layer 4: User Feedback (Gradio)
└─ Display error with context
└─ Suggest remediation
└─ Allow retry
```
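The Layer 2 convention, a try/except around every tool that degrades to an `"Error: ..."` string, can be factored into one decorator instead of being repeated in each tool body. This is a sketch of the pattern; `safe_tool` and `flaky_tool` are hypothetical names, not part of the project's code.

```python
import functools

def safe_tool(func):
    """Wrap a tool so any exception degrades to an 'Error: ...' string
    instead of crashing the MCP server (Layer 2 above)."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        except Exception as e:
            # In the real server this is also where detailed logging would go;
            # the agent only sees the short message.
            return f"Error: {str(e)}"
    return wrapper

@safe_tool
def flaky_tool(url: str) -> str:
    # Stand-in for a tool whose network call fails.
    raise TimeoutError(f"timed out fetching {url}")
```

Because the wrapper always returns a string, the client's Layer 3 logic can treat any result starting with `Error:` as a signal to fall back or return partial results.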
---

## Performance Optimization

### Caching Strategy

```python
# Web search results cached for 5 minutes
# Sector identification result reused across tools
# Competitor list reused in reports
```

### Parallel Tool Execution

```
# Future enhancement: run independent tools in parallel
validate_company()       (parallel)
identify_sector()        (parallel)
identify_competitors()   (sequential, depends on sector)
```

### Rate Limiting

```python
# DuckDuckGo: 2.0-second delays between searches
# OpenAI: batched requests, quota monitoring
# HTTP: 10-second timeout, connection pooling
```

---

## Security Considerations

### API Key Handling

```
# Keys accepted via:
✓ UI input field (temporary, in memory)
✗ NOT stored in files
✗ NOT logged in output
✗ NOT persisted in a database

# Environment variables optional:
#   Load from .env via python-dotenv
```

### Data Privacy

```python
# Web search results: temporary, discarded after analysis
# Company data: not cached or stored
# User queries: not logged or tracked
# Report generation: all local processing
```

### Web Scraping Safety

```python
# User-Agent provided (genuine browser identification)
# Robots.txt respected (DuckDuckGo + BeautifulSoup)
# Timeout protection (10 seconds)
# Error handling for blocked requests
```

---

## Extension Points

### Adding New Tools

```python
@mcp.tool()
def custom_tool(param1: str, param2: int) -> str:
    """
    Your custom tool description.

    Args:
        param1: Parameter 1 description
        param2: Parameter 2 description

    Returns:
        str: Result description
    """
    try:
        # Implementation
        result = some_operation(param1, param2)
        return result
    except Exception as e:
        return f"Error: {str(e)}"
```
### Modifying Agent Behavior

```python
# In mcp_client.py, edit system_prompt:
system_prompt = """
Updated instructions for agent behavior
"""

# Or add an initial user message:
messages.append({
    "role": "user",
    "content": "Additional analysis request..."
})
```

### Customizing Report Generation

```python
# In mcp_server.py, edit the generate_report() template:
report = f"""
# Custom Report Format

Your custom structure here...
"""
```

---

## Testing

### Manual Testing

```bash
# Test MCP Server
python mcp_server.py

# Test MCP Client functions
python -c "from mcp_client import analyze_competitor_landscape; print(analyze_competitor_landscape('Microsoft', 'sk-...'))"

# Test Gradio UI
python app.py
# Navigate to http://localhost:7860
```

### Validation Tests

```python
# Test validate_company()
assert "VALID" in validate_company("Google")
assert "NOT" in validate_company("FakeCompanyXYZ123")

# Test identify_sector()
assert "Technology" in identify_sector("Microsoft")
assert "Finance" in identify_sector("JPMorgan")

# Test competitor discovery (returns a comma-separated string)
competitors = identify_competitors("Technology", "Google")
assert len(competitors.split(",")) <= 3
```

---

## Future Enhancements

1. **Real-time Market Data**: Integrate financial APIs (Alpha Vantage, etc.)
2. **Sentiment Analysis**: Analyze news sentiment about companies
3. **Patent Analysis**: Include R&D insights from patents
4. **Social Media**: Monitor competitor social media activity
5. **Pricing Intelligence**: Track price changes over time
6. **SWOT Matrix**: Generate structured SWOT analysis
7. **Visualization**: Create charts and graphs
8. **PDF Export**: Generate PDF reports
9. **Multi-company Batch**: Analyze multiple companies
10. **Integration APIs**: Connect to Slack, Salesforce, etc.
---

## Conclusion

The MCP architecture provides:

- ✅ Modularity and extensibility
- ✅ Clear separation of concerns
- ✅ Robust error handling
- ✅ Scalability for future enhancements
- ✅ Production-ready design
- ✅ Easy tool management

This design enables rapid development, maintenance, and deployment of AI-powered competitive analysis systems.

---

**Document Version**: 1.0
**Last Updated**: March 2026