# 📖 API Reference

Complete API documentation for all 8 MissionControlMCP tools.

---

## 1. PDF Reader

### `read_pdf(file_path: str) -> Dict[str, Any]`

Extract text and metadata from PDF files.

**Parameters:**
- `file_path` (str): Absolute path to PDF file

**Returns:**
```python
{
    "text": str,           # Full text content from all pages
    "pages": int,          # Number of pages
    "metadata": {          # Document metadata
        "author": str,
        "creator": str,
        "producer": str,
        "subject": str,
        "title": str,
        "creation_date": str,
        "modification_date": str
    }
}
```

**Example:**
```python
from tools.pdf_reader import read_pdf

result = read_pdf("C:/docs/report.pdf")
print(f"Pages: {result['pages']}")
print(f"Author: {result['metadata']['author']}")
print(result['text'][:500])  # First 500 chars
```

**Errors:**
- `FileNotFoundError`: PDF file not found
- `ImportError`: PyPDF2 not installed
- `Exception`: Invalid or corrupted PDF

---

### `get_pdf_info(file_path: str) -> Dict[str, Any]`

Get basic PDF information without extracting text.

**Parameters:**
- `file_path` (str): Path to PDF file

**Returns:**
```python
{
    "page_count": int,
    "is_encrypted": bool,
    "file_size_bytes": int,
    "file_name": str
}
```
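
Because `get_pdf_info` avoids text extraction, it can serve as a cheap pre-check before calling `read_pdf`. The sketch below works on a dict shaped like the return value above; `should_extract_text` and its `max_bytes` limit are hypothetical names chosen for illustration, not part of the tool.

```python
from typing import Any, Dict


def should_extract_text(info: Dict[str, Any], max_bytes: int = 50_000_000) -> bool:
    """Decide whether a PDF looks worth passing to read_pdf.

    `info` is a dict shaped like the get_pdf_info return value.
    `max_bytes` is an arbitrary limit picked for this sketch.
    """
    if info["is_encrypted"]:
        return False  # encrypted files usually need a password first
    if info["page_count"] == 0:
        return False  # nothing to extract
    return info["file_size_bytes"] <= max_bytes


# Hand-written sample standing in for a real get_pdf_info() result
sample_info = {
    "page_count": 12,
    "is_encrypted": False,
    "file_size_bytes": 482_113,
    "file_name": "report.pdf",
}
print(should_extract_text(sample_info))
```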

---

## 2. Text Extractor

### `extract_text(text: str, operation: str, **kwargs) -> Dict[str, Any]`

Process and extract information from text.

**Parameters:**
- `text` (str): Input text to process
- `operation` (str): Operation type
  - `"clean"` - Remove extra whitespace
  - `"summarize"` - Create summary
  - `"chunk"` - Split into chunks
  - `"keywords"` - Extract keywords
- `**kwargs`: Operation-specific parameters

**Operation: clean**
```python
extract_text(text, operation="clean")
# Returns: {"result": str, "word_count": int}
```

**Operation: summarize**
```python
extract_text(text, operation="summarize", max_length=500)
# max_length: Maximum summary length (default: 500)
# Returns: {"result": str, "word_count": int, "original_length": int}
```

**Operation: chunk**
```python
extract_text(text, operation="chunk", chunk_size=100, overlap=20)
# chunk_size: Characters per chunk (default: 100)
# overlap: Overlapping characters (default: 20)
# Returns: {"chunks": List[str], "chunk_count": int}
```
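
To make the interaction between `chunk_size` and `overlap` concrete, here is a minimal sketch of character-based chunking with overlap. It illustrates the general technique only; `chunk_text` is a hypothetical helper, and the edge-case behavior of `extract_text` itself may differ.

```python
from typing import List


def chunk_text(text: str, chunk_size: int = 100, overlap: int = 20) -> List[str]:
    """Split text into chunks of up to chunk_size characters.

    Each chunk starts (chunk_size - overlap) characters after the previous
    one, so consecutive chunks share `overlap` characters.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")

    step = chunk_size - overlap
    chunks: List[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last chunk already reaches the end of the text
    return chunks


print(chunk_text("abcdefghij", chunk_size=4, overlap=1))
```

With `chunk_size=4` and `overlap=1`, each chunk begins three characters after the previous one, so the last character of one chunk is repeated as the first character of the next.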

**Operation: keywords**
```python
extract_text(text, operation="keywords", top_n=10)
# top_n: Number of keywords (default: 10)
# Returns: {"result": str, "keywords": List[str]}
```

**Example:**
```python
from tools.text_extractor import extract_text

# Get keywords
result = extract_text("Your text here...", operation="keywords")
print(result['result'])  # "keyword1, keyword2, keyword3"

# Summarize
summary = extract_text("Long text...", operation="summarize", max_length=200)
print(summary['result'])
```

---

## 3. Web Fetcher

### `fetch_web_content(url: str, timeout: int = 30) -> Dict[str, Any]`

Fetch and parse web page content.

**Parameters:**
- `url` (str): Website URL
- `timeout` (int): Request timeout in seconds (default: 30)

**Returns:**
```python
{
    "url": str,
    "title": str,
    "content": str,         # Clean text content
    "html": str,            # Raw HTML
    "links": List[str],     # All URLs found
    "status_code": int,     # HTTP status
    "timestamp": str
}
```

**Example:**
```python
from tools.web_fetcher import fetch_web_content

result = fetch_web_content("https://example.com")
print(f"Title: {result['title']}")
print(f"Content: {result['content'][:200]}")
print(f"Links found: {len(result['links'])}")
```

**Errors:**
- `requests.exceptions.Timeout`: Request timed out
- `requests.exceptions.RequestException`: Network error
- `Exception`: Invalid URL or parsing error
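
The `links` field in the return value is often post-processed, for example to keep only links on the same host as the fetched page. The sketch below uses only the standard library and assumes `links` contains absolute URLs, which this reference does not state; `internal_links` is a hypothetical helper.

```python
from typing import Any, Dict, List
from urllib.parse import urlparse


def internal_links(result: Dict[str, Any]) -> List[str]:
    """Return unique links that share the host of result['url'], in order."""
    base_host = urlparse(result["url"]).netloc
    seen = set()
    kept: List[str] = []
    for link in result["links"]:
        if urlparse(link).netloc == base_host and link not in seen:
            seen.add(link)
            kept.append(link)
    return kept


# Hand-written sample standing in for a real fetch_web_content() result
sample_result = {
    "url": "https://example.com",
    "links": [
        "https://example.com/about",
        "https://other.org/",
        "https://example.com/about",
        "https://example.com/contact",
    ],
}
print(internal_links(sample_result))
```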

---

## 4. RAG Search

### `search_documents(query: str, documents: List[str], top_k: int = 3) -> Dict[str, Any]`

Semantic search using vector embeddings and FAISS.

**Parameters:**
- `query` (str): Search query
- `documents` (List[str]): List of documents to search
- `top_k` (int): Number of results to return (default: 3)

**Returns:**
```python
{
    "query": str,
    "total_documents": int,
    "returned_results": int,
    "results": [
        {
            "rank": int,
            "document": str,
            "score": float,      # 0.0 to 1.0 (higher = more relevant)
            "distance": float    # L2 distance
        }
    ]
}
```

**Example:**
```python
from tools.rag_search import search_documents

docs = [
    "Machine learning is a subset of AI",
    "Python is a programming language",
    "Data science uses statistics"
]

result = search_documents("artificial intelligence", docs, top_k=2)

for item in result['results']:
    print(f"Score: {item['score']:.4f} - {item['document']}")
```

**Features:**
- Semantic matching (understands meaning, not just keywords)
- Uses sentence-transformers (all-MiniLM-L6-v2)
- FAISS for fast vector search

---

### `multi_query_search(queries: List[str], documents: List[str], top_k: int = 3) -> Dict[str, Any]`

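Search multiple queries at once.

**Parameters:**
- `queries` (List[str]): List of search queries
- `documents` (List[str]): List of documents to search
- `top_k` (int): Number of results to return per query (default: 3)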

**Returns:**
```python
{
    "queries": List[str],
    "results": {
        "query1": [results],
        "query2": [results]
    }
}
```
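
A typical follow-up is to pull the best hit for each query out of the nested `results` mapping. The sketch below assumes each entry in a query's list has the same fields as a `search_documents` result item (`document`, `score`), which this reference implies but does not spell out; `best_match_per_query` is a hypothetical helper.

```python
from typing import Any, Dict, Optional


def best_match_per_query(result: Dict[str, Any]) -> Dict[str, Optional[str]]:
    """Map each query to its highest-scoring document (None if no hits)."""
    best: Dict[str, Optional[str]] = {}
    for query, hits in result["results"].items():
        if hits:
            best[query] = max(hits, key=lambda hit: hit["score"])["document"]
        else:
            best[query] = None
    return best


# Hand-written sample standing in for a real multi_query_search() result
sample = {
    "queries": ["ai", "python"],
    "results": {
        "ai": [
            {"rank": 1, "document": "Machine learning is a subset of AI", "score": 0.71},
            {"rank": 2, "document": "Data science uses statistics", "score": 0.42},
        ],
        "python": [],
    },
}
print(best_match_per_query(sample))
```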

---

## 5. Data Visualizer

### `visualize_data(data: str, chart_type: str, x_column: str = None, y_column: str = None, title: str = "Data Visualization") -> Dict[str, Any]`

Create charts from CSV or JSON data.

**Parameters:**
- `data` (str): CSV or JSON string
- `chart_type` (str): Chart type
  - `"bar"` - Bar chart
  - `"line"` - Line chart
  - `"pie"` - Pie chart
  - `"scatter"` - Scatter plot
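- `x_column` (str, optional): X-axis column name (default: `None`)
- `y_column` (str, optional): Y-axis column name (default: `None`)
- `title` (str, optional): Chart title (default: `"Data Visualization"`)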

**Returns:**
```python
{
    "image_base64": str,     # Base64-encoded PNG image
    "dimensions": {
        "width": int,
        "height": int
    },
    "chart_type": str,
    "title": str,
    "columns_used": {
        "x": str,
        "y": str
    }
}
```

**Example:**
```python
from tools.data_visualizer import visualize_data
import base64

csv_data = """month,revenue
Jan,5000000
Feb,5200000
Mar,5400000"""

result = visualize_data(
    data=csv_data,
    chart_type="line",
    x_column="month",
    y_column="revenue",
    title="Revenue Trends"
)

# Save chart
with open("chart.png", "wb") as f:
    f.write(base64.b64decode(result['image_base64']))
```
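
When the data starts out as Python objects rather than a ready-made string, it has to be serialized before being passed as `data`. The sketch below builds both a CSV string and a JSON string from a list of records using only the standard library. `records_to_csv` is a hypothetical helper, and the list-of-records JSON layout is an assumption, since this reference does not specify the JSON shape `visualize_data` expects.

```python
import csv
import io
import json
from typing import Any, Dict, List


def records_to_csv(records: List[Dict[str, Any]]) -> str:
    """Serialize a list of dicts to CSV text, using the first record's keys as header."""
    if not records:
        raise ValueError("records must not be empty")
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=list(records[0].keys()), lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


records = [
    {"month": "Jan", "revenue": 5000000},
    {"month": "Feb", "revenue": 5200000},
    {"month": "Mar", "revenue": 5400000},
]

csv_data = records_to_csv(records)   # CSV string for the `data` parameter
json_data = json.dumps(records)      # JSON alternative (layout is an assumption)
print(csv_data)
```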

---

## 6. File Converter

### `convert_file(input_path: str, output_path: str, conversion_type: str) -> Dict[str, Any]`

Convert between PDF, TXT, and CSV formats.

**Parameters:**
- `input_path` (str): Input file path
- `output_path` (str): Output file path
- `conversion_type` (str): Conversion type
  - `"pdf_to_txt"` - PDF → Text
  - `"txt_to_pdf"` - Text → PDF
  - `"csv_to_txt"` - CSV → Text
  - `"txt_to_csv"` - Text → CSV

**Returns:**
```python
{
    "success": bool,
    "input_file": str,
    "output_file": str,
    "conversion_type": str,
    "file_size_bytes": int
}
```

**Example:**
```python
from tools.file_converter import convert_file

result = convert_file(
    input_path="document.pdf",
    output_path="document.txt",
    conversion_type="pdf_to_txt"
)

print(f"Converted: {result['success']}")
print(f"Output: {result['output_file']}")
```
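
Since each `conversion_type` value follows a `<source>_to_<target>` naming pattern, it can be derived from the file extensions. The sketch below does that for the four types listed above; `infer_conversion_type` is a hypothetical helper and not part of the tool.

```python
from pathlib import Path

# The four conversion types documented for convert_file
SUPPORTED_CONVERSIONS = {"pdf_to_txt", "txt_to_pdf", "csv_to_txt", "txt_to_csv"}


def infer_conversion_type(input_path: str, output_path: str) -> str:
    """Build a conversion_type string from the two file extensions."""
    source = Path(input_path).suffix.lower().lstrip(".")
    target = Path(output_path).suffix.lower().lstrip(".")
    candidate = f"{source}_to_{target}"
    if candidate not in SUPPORTED_CONVERSIONS:
        raise ValueError(f"Unsupported conversion: {candidate}")
    return candidate


print(infer_conversion_type("document.pdf", "document.txt"))
```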

---

## 7. Email Intent Classifier

### `classify_email_intent(email_text: str) -> Dict[str, Any]`

Classify email intent using NLP pattern matching.

**Parameters:**
- `email_text` (str): Email content (subject + body)

**Returns:**
```python
{
    "intent": str,          # Primary intent
    "confidence": float,    # 0.0 to 1.0
    "secondary_intents": [
        {
            "intent": str,
            "confidence": float
        }
    ],
    "explanation": str
}
```

**Intent Types:**
- `complaint` - Customer complaints
- `inquiry` - Information requests
- `request` - Action requests
- `feedback` - Suggestions/reviews
- `order` - Purchase-related
- `meeting` - Meeting scheduling
- `urgent` - High priority issues
- `application` - Job applications
- `sales` - Sales pitches
- `other` - Unclassified

**Example:**
```python
from tools.email_intent_classifier import classify_email_intent

email = """
Subject: Order Issue
My order #12345 hasn't arrived yet. Can you help?
"""

result = classify_email_intent(email)
print(f"Intent: {result['intent']}")          # e.g. "complaint"
print(f"Confidence: {result['confidence']}")  # e.g. 0.85
```
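
Because the result carries a confidence for the primary intent and for each secondary intent, callers may want to act only on intents above a threshold. The sketch below shows one way to do that on a dict shaped like the return value above; `intents_above` and the `0.6` threshold are hypothetical choices for illustration.

```python
from typing import Any, Dict, List


def intents_above(result: Dict[str, Any], min_confidence: float = 0.6) -> List[str]:
    """Return primary and secondary intents meeting the confidence threshold.

    Falls back to ["other"] when nothing is confident enough.
    """
    candidates = [{"intent": result["intent"], "confidence": result["confidence"]}]
    candidates.extend(result.get("secondary_intents", []))
    kept = [c["intent"] for c in candidates if c["confidence"] >= min_confidence]
    return kept or ["other"]


# Hand-written sample standing in for a real classify_email_intent() result
sample = {
    "intent": "complaint",
    "confidence": 0.85,
    "secondary_intents": [
        {"intent": "order", "confidence": 0.62},
        {"intent": "inquiry", "confidence": 0.30},
    ],
    "explanation": "sample only",
}
print(intents_above(sample))
```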

---

### `classify_batch(emails: List[str]) -> Dict[str, Any]`

Classify multiple emails at once.

**Parameters:**
- `emails` (List[str]): List of email texts (subject + body)

**Returns:**
```python
{
    "results": [
        {"email_index": int, "intent": str, "confidence": float},
        ...
    ],
    "total_processed": int
}
```
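
Batch results are usually aggregated before being acted on. The sketch below tallies intents and flags low-confidence items for manual review, working on a dict shaped like the return value above; `summarize_batch` and the `0.5` review threshold are hypothetical.

```python
from collections import Counter
from typing import Any, Dict


def summarize_batch(batch: Dict[str, Any], review_below: float = 0.5) -> Dict[str, Any]:
    """Count intents and collect indices of low-confidence classifications."""
    counts = Counter(item["intent"] for item in batch["results"])
    needs_review = [
        item["email_index"]
        for item in batch["results"]
        if item["confidence"] < review_below
    ]
    return {"counts": dict(counts), "needs_review": needs_review}


# Hand-written sample standing in for a real classify_batch() result
sample_batch = {
    "results": [
        {"email_index": 0, "intent": "complaint", "confidence": 0.85},
        {"email_index": 1, "intent": "inquiry", "confidence": 0.40},
        {"email_index": 2, "intent": "complaint", "confidence": 0.70},
    ],
    "total_processed": 3,
}
print(summarize_batch(sample_batch))
```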

---

## 8. KPI Generator

### `generate_kpis(data: str, metrics: List[str] = None) -> Dict[str, Any]`

Calculate business KPIs from financial data.

**Parameters:**
- `data` (str): JSON string with business data
- `metrics` (List[str]): Metric categories (optional)
  - `"revenue"` - Revenue-related KPIs
  - `"growth"` - Growth rates
  - `"efficiency"` - Efficiency metrics
  - `"customer"` - Customer metrics
  - `"operational"` - Operational metrics

**Input Data Format:**
```json
{
    "revenue": 5000000,
    "costs": 3000000,
    "customers": 2500,
    "current_revenue": 5000000,
    "previous_revenue": 4500000,
    "current_customers": 2500,
    "previous_customers": 2300,
    "employees": 50,
    "marketing_spend": 500000,
    "sales": 5000000,
    "cogs": 2000000
}
```

**Returns:**
```python
{
    "kpis": {
        "total_revenue": float,
        "profit": float,
        "profit_margin_percent": float,
        "revenue_growth": float,
        "revenue_per_customer": float,
        "revenue_per_employee": float,
        "customer_growth_rate": float,
        ...
    },
    "summary": str,              # Executive summary
    "trends": List[str],         # Identified trends
    "metrics_analyzed": List[str],
    "data_points": int
}
```

**Example:**
```python
from tools.kpi_generator import generate_kpis
import json

data = {
    "revenue": 5000000,
    "costs": 3000000,
    "customers": 2500,
    "employees": 50
}

result = generate_kpis(json.dumps(data), metrics=["revenue", "efficiency"])

print(f"Profit: ${result['kpis']['profit']:,.0f}")
print(f"Margin: {result['kpis']['profit_margin_percent']:.1f}%")
print(f"\nSummary: {result['summary']}")
```
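
For orientation, several of the KPI names above correspond to standard business formulas. The sketch below computes them from the sample input data using the conventional definitions. Whether `generate_kpis` uses exactly these formulas, or reports growth as a percentage, is an assumption; `basic_kpis` is a hypothetical helper.

```python
from typing import Any, Dict


def basic_kpis(d: Dict[str, Any]) -> Dict[str, float]:
    """Compute a few KPIs with conventional formulas (percent values for rates)."""
    profit = d["revenue"] - d["costs"]
    return {
        "profit": float(profit),
        "profit_margin_percent": profit * 100 / d["revenue"],
        "revenue_growth": (d["current_revenue"] - d["previous_revenue"]) * 100
        / d["previous_revenue"],
        "revenue_per_customer": d["revenue"] / d["customers"],
        "revenue_per_employee": d["revenue"] / d["employees"],
        "customer_growth_rate": (d["current_customers"] - d["previous_customers"]) * 100
        / d["previous_customers"],
    }


sample_data = {
    "revenue": 5000000,
    "costs": 3000000,
    "customers": 2500,
    "current_revenue": 5000000,
    "previous_revenue": 4500000,
    "current_customers": 2500,
    "previous_customers": 2300,
    "employees": 50,
}

for name, value in basic_kpis(sample_data).items():
    print(f"{name}: {value:,.2f}")
```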

---

## Error Handling

All tools follow consistent error handling:

```python
from tools.pdf_reader import read_pdf  # any tool call follows the same pattern

try:
    result = read_pdf("C:/docs/report.pdf")
except FileNotFoundError as e:
    print(f"File not found: {e}")
except ValueError as e:
    print(f"Invalid input: {e}")
except ImportError as e:
    print(f"Missing dependency: {e}")
except Exception as e:
    print(f"Unexpected error: {e}")
```
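
When many tool calls need the same handling, the pattern above can be wrapped once. The sketch below is one possible wrapper; `safe_call` and the `{"ok": ..., "result"/"error": ...}` envelope are hypothetical conventions, and `demo_tool` is a stand-in so the example runs without the real tools installed.

```python
from typing import Any, Callable, Dict


def safe_call(tool: Callable[..., Dict[str, Any]], *args: Any, **kwargs: Any) -> Dict[str, Any]:
    """Call a tool and convert the documented exceptions into an error envelope."""
    try:
        return {"ok": True, "result": tool(*args, **kwargs)}
    except FileNotFoundError as e:
        return {"ok": False, "error": f"File not found: {e}"}
    except ValueError as e:
        return {"ok": False, "error": f"Invalid input: {e}"}
    except ImportError as e:
        return {"ok": False, "error": f"Missing dependency: {e}"}
    except Exception as e:  # last resort, mirrors the pattern above
        return {"ok": False, "error": f"Unexpected error: {e}"}


def demo_tool(path: str) -> Dict[str, Any]:
    """Stand-in for a real tool function; not part of MissionControlMCP."""
    if not path:
        raise ValueError("path is empty")
    if path == "missing.pdf":
        raise FileNotFoundError(path)
    return {"text": "ok"}


print(safe_call(demo_tool, "report.pdf"))
print(safe_call(demo_tool, "missing.pdf"))
print(safe_call(demo_tool, ""))
```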

---

## Type Hints

All functions use Python type hints:

```python
from typing import Dict, Any, List

def function_name(param: str) -> Dict[str, Any]:
    ...
```

---

## Logging

All tools use Python logging:

```python
import logging
logger = logging.getLogger(__name__)

logger.info("Operation completed")
logger.warning("Warning message")
logger.error("Error occurred")
```
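
Module-level loggers created with `logging.getLogger(__name__)` stay silent below WARNING unless the application configures logging. Assuming the tools are imported as `tools.<module>` (as in the examples in this reference), their loggers are children of a `tools` logger, so one configuration call covers all of them. `enable_tool_logging` is a hypothetical helper.

```python
import logging


def enable_tool_logging(level: int = logging.INFO) -> logging.Logger:
    """Attach a console handler to the parent 'tools' logger (once) and set its level."""
    tools_logger = logging.getLogger("tools")
    tools_logger.setLevel(level)
    if not tools_logger.handlers:  # avoid duplicate handlers on repeated calls
        handler = logging.StreamHandler()
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(name)s %(levelname)s %(message)s")
        )
        tools_logger.addHandler(handler)
    return tools_logger


enable_tool_logging(logging.DEBUG)

# A child logger such as 'tools.pdf_reader' inherits the level and handler
logging.getLogger("tools.pdf_reader").debug("debug output is now visible")
```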

---

## Dependencies

See `requirements.txt` for all dependencies:

```txt
mcp>=1.0.0
pypdf2>=3.0.0
requests>=2.31.0
beautifulsoup4>=4.12.0
pandas>=2.0.0
numpy>=1.24.0
matplotlib>=3.7.0
seaborn>=0.12.0
scikit-learn>=1.3.0
nltk>=3.8.0
pydantic>=2.0.0
faiss-cpu>=1.7.4
sentence-transformers>=2.2.0
```

---

## MCP Integration

All tools are registered in `mcp_server.py`:

```python
server.register_tool(
    name="pdf_reader",
    description="Extract text and metadata from PDF files",
    input_schema={
        "type": "object",
        "properties": {
            "file_path": {"type": "string"}
        },
        "required": ["file_path"]
    }
)
```
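
The `input_schema` is a JSON Schema object describing the arguments a tool accepts. To show how such a schema constrains a call, the sketch below performs a deliberately simplified check of `required` and top-level `type` entries. It is not a full JSON Schema validator and not part of `mcp_server.py`; `check_arguments` is a hypothetical helper.

```python
from typing import Any, Dict, List

# Mapping from JSON Schema type names to Python types (simplified)
JSON_TYPES = {
    "string": str,
    "integer": int,
    "number": (int, float),
    "boolean": bool,
    "object": dict,
    "array": list,
}


def check_arguments(schema: Dict[str, Any], arguments: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the arguments look valid."""
    problems: List[str] = []
    for name in schema.get("required", []):
        if name not in arguments:
            problems.append(f"missing required argument: {name}")
    for name, spec in schema.get("properties", {}).items():
        expected = JSON_TYPES.get(spec.get("type"))
        if name in arguments and expected and not isinstance(arguments[name], expected):
            problems.append(f"argument {name} should be of type {spec['type']}")
    return problems


pdf_reader_schema = {
    "type": "object",
    "properties": {"file_path": {"type": "string"}},
    "required": ["file_path"],
}

print(check_arguments(pdf_reader_schema, {"file_path": "report.pdf"}))
print(check_arguments(pdf_reader_schema, {}))
```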

---

## Version Information

- **API Version:** 1.0.0
- **Python:** 3.8+
- **MCP Protocol:** 1.0.0

---

## Support

For issues or questions:
- GitHub: AlBaraa-1/CleanEye-Hackathon
- Documentation: README.md
- Examples: EXAMPLES.md
- Testing: TESTING.md

**Complete API reference for MissionControlMCP!** 🚀