# πŸ“– API Reference
Complete API documentation for all 8 MissionControlMCP tools.
---
## 1. PDF Reader
### `read_pdf(file_path: str) -> Dict[str, Any]`
Extract text and metadata from PDF files.
**Parameters:**
- `file_path` (str): Absolute path to PDF file
**Returns:**
```python
{
    "text": str,        # Full text content from all pages
    "pages": int,       # Number of pages
    "metadata": {       # Document metadata
        "author": str,
        "creator": str,
        "producer": str,
        "subject": str,
        "title": str,
        "creation_date": str,
        "modification_date": str
    }
}
```
**Example:**
```python
from tools.pdf_reader import read_pdf
result = read_pdf("C:/docs/report.pdf")
print(f"Pages: {result['pages']}")
print(f"Author: {result['metadata']['author']}")
print(result['text'][:500]) # First 500 chars
```
**Errors:**
- `FileNotFoundError`: PDF file not found
- `ImportError`: PyPDF2 not installed
- `Exception`: Invalid or corrupted PDF
---
### `get_pdf_info(file_path: str) -> Dict[str, Any]`
Get basic PDF information without extracting text.
**Parameters:**
- `file_path` (str): Path to PDF file
**Returns:**
```python
{
    "page_count": int,
    "is_encrypted": bool,
    "file_size_bytes": int,
    "file_name": str
}
```
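As an illustration, the file-level fields (`file_size_bytes`, `file_name`) can be derived with the standard library alone; `page_count` and `is_encrypted` would come from a PDF library such as PyPDF2. A minimal sketch, not the tool's actual implementation:

```python
from pathlib import Path

def file_level_info(file_path: str) -> dict:
    # Size and name need no PDF parsing; raise early if the file is
    # missing, matching the FileNotFoundError documented above.
    p = Path(file_path)
    if not p.exists():
        raise FileNotFoundError(file_path)
    return {"file_size_bytes": p.stat().st_size, "file_name": p.name}
```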
---
## 2. Text Extractor
### `extract_text(text: str, operation: str, **kwargs) -> Dict[str, Any]`
Process and extract information from text.
**Parameters:**
- `text` (str): Input text to process
- `operation` (str): Operation type
- `"clean"` - Remove extra whitespace
- `"summarize"` - Create summary
- `"chunk"` - Split into chunks
- `"keywords"` - Extract keywords
- `**kwargs`: Operation-specific parameters
**Operation: clean**
```python
extract_text(text, operation="clean")
# Returns: {"result": str, "word_count": int}
```
**Operation: summarize**
```python
extract_text(text, operation="summarize", max_length=500)
# max_length: Maximum summary length (default: 500)
# Returns: {"result": str, "word_count": int, "original_length": int}
```
**Operation: chunk**
```python
extract_text(text, operation="chunk", chunk_size=100, overlap=20)
# chunk_size: Characters per chunk (default: 100)
# overlap: Overlapping characters (default: 20)
# Returns: {"chunks": List[str], "chunk_count": int}
```
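The `chunk` operation's overlap semantics can be sketched as a sliding window (illustrative only; the tool's exact boundary handling may differ):

```python
def chunk_text(text: str, chunk_size: int = 100, overlap: int = 20) -> list:
    # Each window starts (chunk_size - overlap) characters after the
    # previous one, so consecutive chunks share `overlap` characters.
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]
```

For example, `chunk_text("abcdefghij", chunk_size=4, overlap=2)` yields `["abcd", "cdef", "efgh", "ghij", "ij"]` — each pair of neighbors shares two characters, and the final chunk may be shorter.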
**Operation: keywords**
```python
extract_text(text, operation="keywords", top_n=10)
# top_n: Number of keywords (default: 10)
# Returns: {"result": str, "keywords": List[str]}
```
**Example:**
```python
from tools.text_extractor import extract_text
# Get keywords
result = extract_text("Your text here...", operation="keywords")
print(result['result']) # "keyword1, keyword2, keyword3"
# Summarize
summary = extract_text("Long text...", operation="summarize", max_length=200)
print(summary['result'])
```
---
## 3. Web Fetcher
### `fetch_web_content(url: str, timeout: int = 30) -> Dict[str, Any]`
Fetch and parse web page content.
**Parameters:**
- `url` (str): Website URL
- `timeout` (int): Request timeout in seconds (default: 30)
**Returns:**
```python
{
    "url": str,
    "title": str,
    "content": str,       # Clean text content
    "html": str,          # Raw HTML
    "links": List[str],   # All URLs found
    "status_code": int,   # HTTP status
    "timestamp": str
}
```
**Example:**
```python
from tools.web_fetcher import fetch_web_content
result = fetch_web_content("https://example.com")
print(f"Title: {result['title']}")
print(f"Content: {result['content'][:200]}")
print(f"Links found: {len(result['links'])}")
```
**Errors:**
- `requests.exceptions.Timeout`: Request timed out
- `requests.exceptions.RequestException`: Network error
- `Exception`: Invalid URL or parsing error
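The `links` field is a flat list of every `href` found in the page. The project lists `beautifulsoup4` as a dependency; a stdlib-only sketch of the same idea using `html.parser` (hypothetical, not the tool's code):

```python
from html.parser import HTMLParser
from typing import List

class LinkCollector(HTMLParser):
    """Collect href values from <a> tags while parsing HTML."""
    def __init__(self):
        super().__init__()
        self.links: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

collector = LinkCollector()
collector.feed('<a href="https://example.com">x</a><a href="/about">y</a>')
# collector.links -> ["https://example.com", "/about"]
```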
---
## 4. RAG Search
### `search_documents(query: str, documents: List[str], top_k: int = 3) -> Dict[str, Any]`
Semantic search using vector embeddings and FAISS.
**Parameters:**
- `query` (str): Search query
- `documents` (List[str]): List of documents to search
- `top_k` (int): Number of results to return (default: 3)
**Returns:**
```python
{
    "query": str,
    "total_documents": int,
    "returned_results": int,
    "results": [
        {
            "rank": int,
            "document": str,
            "score": float,     # 0.0 to 1.0 (higher = more relevant)
            "distance": float   # L2 distance
        }
    ]
}
```
**Example:**
```python
from tools.rag_search import search_documents
docs = [
    "Machine learning is a subset of AI",
    "Python is a programming language",
    "Data science uses statistics"
]
result = search_documents("artificial intelligence", docs, top_k=2)
for item in result['results']:
    print(f"Score: {item['score']:.4f} - {item['document']}")
```
**Features:**
- Semantic matching (understands meaning, not just keywords)
- Uses sentence-transformers (all-MiniLM-L6-v2)
- FAISS for fast vector search
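The relationship between `distance` and `score` is not specified here. One common way to map a non-negative L2 distance onto a (0, 1] relevance score, shown purely as an assumption about how rag_search might do it:

```python
def distance_to_score(distance: float) -> float:
    # Distance 0 maps to a perfect score of 1.0; larger distances decay
    # toward 0. The actual rag_search mapping may differ.
    return 1.0 / (1.0 + distance)
```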
---
### `multi_query_search(queries: List[str], documents: List[str], top_k: int = 3) -> Dict[str, Any]`
Search multiple queries at once.
**Returns:**
```python
{
    "queries": List[str],
    "results": {
        "query1": [results],
        "query2": [results]
    }
}
```
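Behaviorally, this is equivalent to running the single-query search once per query and keying the results by query text. A sketch with an injected search function (hypothetical structure, matching the return shape above):

```python
from typing import Any, Callable, Dict, List

def multi_query_sketch(queries: List[str], documents: List[str],
                       search_fn: Callable[..., Dict[str, Any]],
                       top_k: int = 3) -> Dict[str, Any]:
    # One single-query search per query; results keyed by the query string.
    return {
        "queries": queries,
        "results": {q: search_fn(q, documents, top_k=top_k)["results"]
                    for q in queries},
    }

# Stand-in for search_documents, just to show the shape:
def fake_search(query, documents, top_k=3):
    return {"results": documents[:top_k]}

out = multi_query_sketch(["a", "b"], ["d1", "d2", "d3"], fake_search, top_k=2)
```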
---
## 5. Data Visualizer
### `visualize_data(data: str, chart_type: str, x_column: str = None, y_column: str = None, title: str = "Data Visualization") -> Dict[str, Any]`
Create charts from CSV or JSON data.
**Parameters:**
- `data` (str): CSV or JSON string
- `chart_type` (str): Chart type
- `"bar"` - Bar chart
- `"line"` - Line chart
- `"pie"` - Pie chart
- `"scatter"` - Scatter plot
- `x_column` (str): X-axis column name
- `y_column` (str): Y-axis column name
- `title` (str): Chart title
**Returns:**
```python
{
    "image_base64": str,   # Base64-encoded PNG image
    "dimensions": {
        "width": int,
        "height": int
    },
    "chart_type": str,
    "title": str,
    "columns_used": {
        "x": str,
        "y": str
    }
}
```
**Example:**
```python
from tools.data_visualizer import visualize_data
import base64
csv_data = """month,revenue
Jan,5000000
Feb,5200000
Mar,5400000"""
result = visualize_data(
    data=csv_data,
    chart_type="line",
    x_column="month",
    y_column="revenue",
    title="Revenue Trends"
)
# Save chart
with open("chart.png", "wb") as f:
    f.write(base64.b64decode(result['image_base64']))
```
---
## 6. File Converter
### `convert_file(input_path: str, output_path: str, conversion_type: str) -> Dict[str, Any]`
Convert between PDF, TXT, and CSV formats.
**Parameters:**
- `input_path` (str): Input file path
- `output_path` (str): Output file path
- `conversion_type` (str): Conversion type
- `"pdf_to_txt"` - PDF β†’ Text
- `"txt_to_pdf"` - Text β†’ PDF
- `"csv_to_txt"` - CSV β†’ Text
- `"txt_to_csv"` - Text β†’ CSV
**Returns:**
```python
{
    "success": bool,
    "input_file": str,
    "output_file": str,
    "conversion_type": str,
    "file_size_bytes": int
}
```
**Example:**
```python
from tools.file_converter import convert_file
result = convert_file(
    input_path="document.pdf",
    output_path="document.txt",
    conversion_type="pdf_to_txt"
)
print(f"Converted: {result['success']}")
print(f"Output: {result['output_file']}")
```
---
## 7. Email Intent Classifier
### `classify_email_intent(email_text: str) -> Dict[str, Any]`
Classify email intent using NLP pattern matching.
**Parameters:**
- `email_text` (str): Email content (subject + body)
**Returns:**
```python
{
    "intent": str,          # Primary intent
    "confidence": float,    # 0.0 to 1.0
    "secondary_intents": [
        {
            "intent": str,
            "confidence": float
        }
    ],
    "explanation": str
}
```
**Intent Types:**
- `complaint` - Customer complaints
- `inquiry` - Information requests
- `request` - Action requests
- `feedback` - Suggestions/reviews
- `order` - Purchase-related
- `meeting` - Meeting scheduling
- `urgent` - High priority issues
- `application` - Job applications
- `sales` - Sales pitches
- `other` - Unclassified
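A pattern-matching classifier of this kind typically counts keyword hits per intent and normalizes them into a confidence. A toy sketch with made-up keyword lists (the real classifier's patterns are not shown here):

```python
# Hypothetical keyword lists -- illustrative only.
PATTERNS = {
    "complaint": ["hasn't arrived", "broken", "refund", "issue"],
    "meeting": ["schedule", "calendar", "meet"],
    "inquiry": ["how do i", "what is", "could you tell"],
}

def score_intents(email_text: str) -> dict:
    text = email_text.lower()
    # Count keyword hits per intent, then normalize into a confidence.
    hits = {intent: sum(kw in text for kw in kws)
            for intent, kws in PATTERNS.items()}
    total = sum(hits.values())
    if total == 0:
        return {"intent": "other", "confidence": 0.0}
    best = max(hits, key=hits.get)
    return {"intent": best, "confidence": hits[best] / total}
```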
**Example:**
```python
from tools.email_intent_classifier import classify_email_intent
email = """
Subject: Order Issue
My order #12345 hasn't arrived yet. Can you help?
"""
result = classify_email_intent(email)
print(f"Intent: {result['intent']}") # "complaint"
print(f"Confidence: {result['confidence']}") # 0.85
```
---
### `classify_batch(emails: List[str]) -> Dict[str, Any]`
Classify multiple emails at once.
**Returns:**
```python
{
    "results": [
        {"email_index": int, "intent": str, "confidence": float},
        ...
    ],
    "total_processed": int
}
```
---
## 8. KPI Generator
### `generate_kpis(data: str, metrics: List[str] = None) -> Dict[str, Any]`
Calculate business KPIs from financial data.
**Parameters:**
- `data` (str): JSON string with business data
- `metrics` (List[str]): Metric categories (optional)
- `"revenue"` - Revenue-related KPIs
- `"growth"` - Growth rates
- `"efficiency"` - Efficiency metrics
- `"customer"` - Customer metrics
- `"operational"` - Operational metrics
**Input Data Format:**
```json
{
    "revenue": 5000000,
    "costs": 3000000,
    "customers": 2500,
    "current_revenue": 5000000,
    "previous_revenue": 4500000,
    "current_customers": 2500,
    "previous_customers": 2300,
    "employees": 50,
    "marketing_spend": 500000,
    "sales": 5000000,
    "cogs": 2000000
}
```
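The likely derivations of a few of the returned KPIs from these fields, written out as an assumption (the generator may use different formulas or guards):

```python
def basic_kpis(d: dict) -> dict:
    # Assumed formulas; division-by-zero guards omitted for brevity.
    profit = d["revenue"] - d["costs"]
    kpis = {
        "total_revenue": float(d["revenue"]),
        "profit": float(profit),
        "profit_margin_percent": 100.0 * profit / d["revenue"],
    }
    if "previous_revenue" in d:
        kpis["revenue_growth"] = (
            100.0 * (d["current_revenue"] - d["previous_revenue"])
            / d["previous_revenue"]
        )
    if "customers" in d:
        kpis["revenue_per_customer"] = d["revenue"] / d["customers"]
    return kpis
```

With the sample data above this would give a profit of 2,000,000, a 40.0% margin, and roughly 11.1% revenue growth.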
**Returns:**
```python
{
    "kpis": {
        "total_revenue": float,
        "profit": float,
        "profit_margin_percent": float,
        "revenue_growth": float,
        "revenue_per_customer": float,
        "revenue_per_employee": float,
        "customer_growth_rate": float,
        ...
    },
    "summary": str,         # Executive summary
    "trends": List[str],    # Identified trends
    "metrics_analyzed": List[str],
    "data_points": int
}
```
**Example:**
```python
from tools.kpi_generator import generate_kpis
import json
data = {
    "revenue": 5000000,
    "costs": 3000000,
    "customers": 2500,
    "employees": 50
}
result = generate_kpis(json.dumps(data), metrics=["revenue", "efficiency"])
print(f"Profit: ${result['kpis']['profit']:,.0f}")
print(f"Margin: {result['kpis']['profit_margin_percent']:.1f}%")
print(f"\nSummary: {result['summary']}")
```
---
## Error Handling
All tools follow consistent error handling:
```python
try:
    result = tool_function(params)
except FileNotFoundError as e:
    print(f"File not found: {e}")
except ValueError as e:
    print(f"Invalid input: {e}")
except ImportError as e:
    print(f"Missing dependency: {e}")
except Exception as e:
    print(f"Unexpected error: {e}")
```
---
## Type Hints
All functions use Python type hints:
```python
from typing import Dict, Any, List
def function_name(param: str) -> Dict[str, Any]:
    ...
```
---
## Logging
All tools use Python logging:
```python
import logging
logger = logging.getLogger(__name__)
logger.info("Operation completed")
logger.warning("Warning message")
logger.error("Error occurred")
```
---
## Dependencies
See `requirements.txt` for all dependencies:
```txt
mcp>=1.0.0
pypdf2>=3.0.0
requests>=2.31.0
beautifulsoup4>=4.12.0
pandas>=2.0.0
numpy>=1.24.0
matplotlib>=3.7.0
seaborn>=0.12.0
scikit-learn>=1.3.0
nltk>=3.8.0
pydantic>=2.0.0
faiss-cpu>=1.7.4
sentence-transformers>=2.2.0
```
---
## MCP Integration
All tools are registered in `mcp_server.py`:
```python
server.register_tool(
    name="pdf_reader",
    description="Extract text and metadata from PDF files",
    input_schema={
        "type": "object",
        "properties": {
            "file_path": {"type": "string"}
        },
        "required": ["file_path"]
    }
)
```
---
## Version Information
- **API Version:** 1.0.0
- **Python:** 3.8+
- **MCP Protocol:** 1.0.0
---
## Support
For issues or questions:
- GitHub: AlBaraa-1/CleanEye-Hackathon
- Documentation: README.md
- Examples: EXAMPLES.md
- Testing: TESTING.md
**Complete API reference for MissionControlMCP!** πŸš€