# API Documentation - HuggingFace Edition

## Overview

PolicyWise provides a RESTful API for corporate policy question-answering using HuggingFace free-tier services. All endpoints return JSON responses and support CORS for web integration.

## Base URL

- **Local Development**: `http://localhost:5000`
- **HuggingFace Spaces**: `https://your-username-policywise-rag.hf.space`

## Authentication

The public deployment requires no authentication. For production use, consider adding API key authentication.

## Core Endpoints

### Chat Endpoint (Primary Interface)

**POST /chat**

Ask questions about company policies and receive generated answers with source citations.

#### Request

```http
POST /chat
Content-Type: application/json

{
  "message": "What is the remote work policy for new employees?",
  "max_tokens": 500,
  "include_sources": true,
  "guardrails_level": "standard"
}
```

#### Parameters

| Parameter | Type | Required | Default | Description |
|-----------|------|----------|---------|-------------|
| `message` | string | Yes | - | User question about company policies |
| `max_tokens` | integer | No | 500 | Maximum response length in tokens (100-1000) |
| `include_sources` | boolean | No | true | Include source document details |
| `guardrails_level` | string | No | "standard" | Safety level: "strict", "standard", "relaxed" |
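
The ranges in the table can be checked on the client before a request is sent. The following is a minimal sketch, not part of the API: `build_chat_payload` is a hypothetical helper, and the 5000-character limit is taken from the `MESSAGE_TOO_LONG` example in this document. Whether the server enforces exactly these checks is an assumption.

```python
ALLOWED_GUARDRAILS = {"strict", "standard", "relaxed"}
MAX_MESSAGE_CHARS = 5000  # assumption: taken from the MESSAGE_TOO_LONG example


def build_chat_payload(message, max_tokens=500, include_sources=True,
                       guardrails_level="standard"):
    """Validate arguments against the documented ranges and return the JSON body."""
    if not isinstance(message, str) or not message.strip():
        raise ValueError("message must be a non-empty string")
    if len(message) > MAX_MESSAGE_CHARS:
        raise ValueError(f"message exceeds {MAX_MESSAGE_CHARS} characters")
    if not 100 <= max_tokens <= 1000:
        raise ValueError("max_tokens must be between 100 and 1000")
    if guardrails_level not in ALLOWED_GUARDRAILS:
        raise ValueError(
            f"guardrails_level must be one of {sorted(ALLOWED_GUARDRAILS)}"
        )
    return {
        "message": message,
        "max_tokens": max_tokens,
        "include_sources": bool(include_sources),
        "guardrails_level": guardrails_level,
    }
```

The returned dictionary can be passed as the `json=` argument of `requests.post`.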

#### Response

```json
{
  "status": "success",
  "message": "What is the remote work policy for new employees?",
  "response": "New employees are eligible for remote work after completing their initial 90-day onboarding period. During this period, they must work from the office to facilitate mentoring and team integration. After the probationary period, employees can work remotely up to 3 days per week, subject to manager approval and role requirements. [Source: remote_work_policy.md] [Source: employee_handbook.md]",
  "confidence": 0.91,
  "sources": [
    {
      "filename": "remote_work_policy.md",
      "chunk_id": "remote_work_policy_chunk_3",
      "relevance_score": 0.89,
      "content_preview": "New employees must complete a 90-day onboarding period..."
    },
    {
      "filename": "employee_handbook.md",
      "chunk_id": "employee_handbook_chunk_7",
      "relevance_score": 0.76,
      "content_preview": "Remote work eligibility requirements include..."
    }
  ],
  "response_time_ms": 2340,
  "guardrails": {
    "safety_score": 0.98,
    "quality_score": 0.91,
    "citation_count": 2
  },
  "services_used": {
    "embedding_model": "intfloat/multilingual-e5-large",
    "llm_model": "meta-llama/Meta-Llama-3-8B-Instruct",
    "vector_store": "huggingface_dataset"
  }
}
```
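
The example response carries citations twice: inline as `[Source: ...]` markers in `response`, and as structured entries in `sources`. The sketch below shows one way a client might reconcile the two. It assumes the inline marker format is exactly as in the example above; `extract_citations` and `uncited_sources` are hypothetical helper names.

```python
import re

# Assumption: inline citations always look like "[Source: filename]".
CITATION_PATTERN = re.compile(r"\[Source:\s*([^\]]+)\]")


def extract_citations(response_text):
    """Return cited filenames in order of first appearance, without duplicates."""
    seen = []
    for name in CITATION_PATTERN.findall(response_text):
        name = name.strip()
        if name not in seen:
            seen.append(name)
    return seen


def uncited_sources(chat_response):
    """Filenames listed in `sources` that never appear as an inline citation."""
    cited = set(extract_citations(chat_response.get("response", "")))
    return [
        s["filename"]
        for s in chat_response.get("sources", [])
        if s["filename"] not in cited
    ]
```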

#### Error Response

```json
{
  "status": "error",
  "error": "Request too long",
  "message": "Message exceeds maximum character limit of 5000",
  "error_code": "MESSAGE_TOO_LONG"
}
```

### Search Endpoint

**POST /search**

Perform semantic search across policy documents using HuggingFace embeddings.

#### Request

```http
POST /search
Content-Type: application/json

{
  "query": "What is the remote work policy?",
  "top_k": 5,
  "threshold": 0.3,
  "include_metadata": true
}
```

#### Parameters

| Parameter | Type | Required | Default | Description |
|-----------|------|----------|---------|-------------|
| `query` | string | Yes | - | Search query text |
| `top_k` | integer | No | 5 | Number of results to return (1-20) |
| `threshold` | float | No | 0.3 | Minimum similarity threshold (0.0-1.0) |
| `include_metadata` | boolean | No | true | Include document metadata |

#### Response

```json
{
  "status": "success",
  "query": "What is the remote work policy?",
  "results_count": 2,
  "embedding_model": "intfloat/multilingual-e5-large",
  "embedding_dimensions": 1024,
  "results": [
    {
      "chunk_id": "remote_work_policy_chunk_2",
      "content": "Employees may work remotely up to 3 days per week with manager approval. Remote work arrangements must be documented and reviewed quarterly.",
      "similarity_score": 0.87,
      "metadata": {
        "source_file": "remote_work_policy.md",
        "chunk_index": 2,
        "category": "HR",
        "word_count": 95,
        "created_at": "2025-10-25T10:30:00Z"
      }
    },
    {
      "chunk_id": "remote_work_policy_chunk_1",
      "content": "Remote work eligibility requires completion of probationary period and manager approval. New employees must work on-site for first 90 days.",
      "similarity_score": 0.82,
      "metadata": {
        "source_file": "remote_work_policy.md",
        "chunk_index": 1,
        "category": "HR",
        "word_count": 88,
        "created_at": "2025-10-25T10:30:00Z"
      }
    }
  ],
  "search_time_ms": 234,
  "vector_store_size": 98
}
```
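
Several chunks from the same file can appear in one result set, as in the example above. A client that only wants one hit per document could collapse them as sketched here. `best_chunk_per_file` is a hypothetical helper; it assumes `include_metadata` was `true`, and falls back to the label `"unknown"` when `metadata.source_file` is absent.

```python
def best_chunk_per_file(search_response, min_score=0.0):
    """Keep the highest-scoring chunk for each source file, best first."""
    best = {}
    for item in search_response.get("results", []):
        score = item["similarity_score"]
        if score < min_score:
            continue
        source = item.get("metadata", {}).get("source_file", "unknown")
        if source not in best or score > best[source]["similarity_score"]:
            best[source] = item
    return sorted(
        best.values(), key=lambda r: r["similarity_score"], reverse=True
    )
```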

### Document Processing

**POST /process-documents**

Process and embed policy documents using HuggingFace services. Processing also runs automatically on startup.

#### Request

```http
POST /process-documents
Content-Type: application/json

{
  "force_reprocess": false,
  "batch_size": 10
}
```

#### Parameters

| Parameter | Type | Required | Default | Description |
|-----------|------|----------|---------|-------------|
| `force_reprocess` | boolean | No | false | Force reprocessing even if documents exist |
| `batch_size` | integer | No | 10 | Number of documents to process per batch |

#### Response

```json
{
  "status": "success",
  "processing_details": {
    "files_processed": 22,
    "chunks_generated": 98,
    "embeddings_created": 98,
    "processing_time_seconds": 18.7
  },
  "embedding_service": {
    "model": "intfloat/multilingual-e5-large",
    "dimensions": 1024,
    "api_status": "operational"
  },
  "vector_store": {
    "type": "huggingface_dataset",
    "dataset_name": "policy-vectors",
    "total_embeddings": 98,
    "storage_size_mb": 2.4
  },
  "corpus_statistics": {
    "total_words": 10637,
    "average_chunk_size": 95,
    "documents_by_category": {
      "HR": 8,
      "Finance": 4,
      "Security": 3,
      "Operations": 4,
      "EHS": 3
    }
  },
  "quality_metrics": {
    "embedding_generation_success_rate": 1.0,
    "average_embedding_time_ms": 450,
    "metadata_completeness": 1.0
  }
}
```
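
The counts in this response should agree with each other: in the example, 98 chunks produce 98 embeddings, and the per-category document counts sum to 22 files. The sketch below checks those relationships. It assumes a full processing run (after a partial run the store total may legitimately differ), and `check_processing_report` is a hypothetical helper.

```python
def check_processing_report(report):
    """Return a list of human-readable inconsistencies; empty if none are found."""
    problems = []
    details = report.get("processing_details", {})
    store = report.get("vector_store", {})
    stats = report.get("corpus_statistics", {})

    if details.get("chunks_generated") != details.get("embeddings_created"):
        problems.append("not every chunk received an embedding")
    if store.get("total_embeddings") != details.get("embeddings_created"):
        problems.append("vector store size differs from embeddings created")

    by_category = stats.get("documents_by_category", {})
    if by_category and sum(by_category.values()) != details.get("files_processed"):
        problems.append("category counts do not add up to files processed")
    return problems
```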

### Health Check

**GET /health**

Returns overall system health, including the status of each HuggingFace service.

#### Request

```http
GET /health
```

#### Response

```json
{
  "status": "healthy",
  "timestamp": "2025-10-25T10:30:00Z",
  "services": {
    "hf_embedding_api": "operational",
    "hf_inference_api": "operational",
    "hf_dataset_store": "operational"
  },
  "service_details": {
    "embedding_api": {
      "model": "intfloat/multilingual-e5-large",
      "last_request_ms": 450,
      "requests_today": 247,
      "error_rate": 0.02
    },
    "inference_api": {
      "model": "meta-llama/Meta-Llama-3-8B-Instruct",
      "last_request_ms": 2340,
      "requests_today": 89,
      "error_rate": 0.01
    },
    "dataset_store": {
      "dataset_name": "policy-vectors",
      "total_embeddings": 98,
      "last_updated": "2025-10-25T09:15:00Z",
      "access_status": "operational"
    }
  },
  "configuration": {
    "use_openai_embedding": false,
    "hf_token_configured": true,
    "embedding_model": "intfloat/multilingual-e5-large",
    "embedding_dimensions": 1024,
    "deployment_platform": "huggingface_spaces"
  },
  "statistics": {
    "total_documents": 22,
    "total_queries_processed": 1247,
    "average_response_time_ms": 2140,
    "vector_store_size": 98,
    "uptime_hours": 72.5
  },
  "performance": {
    "memory_usage_mb": 156,
    "cpu_usage_percent": 12,
    "disk_usage_mb": 45,
    "cache_hit_rate": 0.78
  }
}
```
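
A monitoring script usually needs only a short list of what is wrong rather than the full payload. The following sketch reads the `services` and `service_details` sections shown above. The 5% error-rate threshold is an arbitrary assumption, and `degraded_services` is a hypothetical helper.

```python
def degraded_services(health, max_error_rate=0.05):
    """Return names of services that are not operational or exceed max_error_rate."""
    flagged = []
    # Top-level status strings, e.g. "hf_embedding_api": "operational".
    for name, state in health.get("services", {}).items():
        if state != "operational":
            flagged.append(name)
    # Per-service details; entries without an error_rate are treated as healthy.
    for name, detail in health.get("service_details", {}).items():
        if detail.get("error_rate", 0.0) > max_error_rate:
            flagged.append(name)
    return flagged
```

Note that the two sections use different keys for the same service (`hf_embedding_api` and `embedding_api`), so the returned names are reported as found.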

### System Information

**GET /**

Welcome page with system information and capabilities.

#### Response

```json
{
  "message": "Welcome to PolicyWise - HuggingFace Edition",
  "version": "2.0.0-hf",
  "description": "Corporate policy RAG system powered by HuggingFace free-tier services",
  "capabilities": [
    "Policy question answering with citations",
    "Semantic document search",
    "Automatic document processing",
    "Multilingual embedding support",
    "Real-time health monitoring"
  ],
  "services": {
    "embedding": "HuggingFace Inference API (intfloat/multilingual-e5-large)",
    "llm": "HuggingFace Inference API (meta-llama/Meta-Llama-3-8B-Instruct)",
    "vector_store": "HuggingFace Dataset",
    "deployment": "HuggingFace Spaces"
  },
  "api_endpoints": {
    "chat": "POST /chat",
    "search": "POST /search",
    "process": "POST /process-documents",
    "health": "GET /health"
  },
  "documentation": {
    "api_docs": "/docs/api",
    "technical_architecture": "/docs/architecture",
    "deployment_guide": "/docs/deployment"
  },
  "policy_corpus": {
    "total_documents": 22,
    "total_chunks": 98,
    "categories": ["HR", "Finance", "Security", "Operations", "EHS"],
    "last_updated": "2025-10-25T09:15:00Z"
  }
}
```

## Error Handling

### HTTP Status Codes

| Code | Status | Description |
|------|--------|-------------|
| 200 | OK | Request successful |
| 400 | Bad Request | Invalid request parameters |
| 413 | Payload Too Large | Request body too large |
| 429 | Too Many Requests | Rate limit exceeded |
| 500 | Internal Server Error | Server error |
| 503 | Service Unavailable | HuggingFace API unavailable |

### Error Response Format

```json
{
  "status": "error",
  "error": "Error type",
  "message": "Human-readable error description",
  "error_code": "MACHINE_READABLE_CODE",
  "timestamp": "2025-10-25T10:30:00Z",
  "request_id": "req_abc123",
  "suggestions": [
    "Check your request parameters",
    "Retry with smaller payload"
  ]
}
```
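
Clients can turn this envelope into a typed exception so that callers handle failures in one place. This is a minimal sketch: `PolicyWiseError` and `raise_for_error` are hypothetical names, and treating any status of 400 or above as an error is an assumption about how the server pairs status codes with bodies.

```python
class PolicyWiseError(Exception):
    """Wraps the documented error envelope."""

    def __init__(self, status_code, body):
        self.status_code = status_code
        self.error_code = body.get("error_code", "UNKNOWN")
        self.request_id = body.get("request_id")
        self.suggestions = body.get("suggestions", [])
        super().__init__(body.get("message", "Unknown error"))


def raise_for_error(status_code, body):
    """Return the body unchanged on success, raise PolicyWiseError otherwise."""
    if body.get("status") == "error" or status_code >= 400:
        raise PolicyWiseError(status_code, body)
    return body
```

With `requests`, this could be called as `raise_for_error(response.status_code, response.json())`.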

### Common Error Codes

| Error Code | Description | Solution |
|------------|-------------|----------|
| `MESSAGE_TOO_LONG` | Message exceeds character limit | Reduce message length |
| `INVALID_PARAMETERS` | Invalid request parameters | Check parameter types and ranges |
| `HF_API_UNAVAILABLE` | HuggingFace API temporarily unavailable | Retry after delay |
| `RATE_LIMIT_EXCEEDED` | Too many requests | Wait before retrying |
| `EMBEDDING_FAILED` | Embedding generation failed | Check input text format |
| `SEARCH_FAILED` | Vector search failed | Verify query parameters |
| `DATASET_UNAVAILABLE` | HuggingFace Dataset inaccessible | Check dataset permissions |
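
Only two of these codes have "retry" as their listed solution (`HF_API_UNAVAILABLE` and `RATE_LIMIT_EXCEEDED`). The others point to a problem with the request or configuration, where retrying unchanged is unlikely to help. A sketch of that decision with exponential backoff follows. The helper names, the base delay, and the 30-second cap are assumptions, not documented server behavior.

```python
RETRYABLE_STATUS = {429, 503}
RETRYABLE_CODES = {"HF_API_UNAVAILABLE", "RATE_LIMIT_EXCEEDED"}


def is_retryable(status_code, body):
    """True when the status code or error_code suggests a later retry may succeed."""
    return (
        status_code in RETRYABLE_STATUS
        or body.get("error_code") in RETRYABLE_CODES
    )


def backoff_delays(attempts, base=1.0, cap=30.0):
    """Exponential delays in seconds: base, 2*base, 4*base, ... capped at `cap`."""
    return [min(cap, base * (2 ** i)) for i in range(attempts)]
```

A caller would sleep for each delay in turn and stop as soon as a request succeeds or `is_retryable` returns `False`.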

## Rate Limiting

### HuggingFace Free Tier Limits

- **Inference API**: 1000 requests/hour per model
- **Dataset API**: 100 requests/hour
- **Embedding API**: 1000 requests/hour

### Application Rate Limiting

- **Chat API**: 60 requests/minute per IP
- **Search API**: 120 requests/minute per IP
- **Processing API**: 10 requests/hour per IP
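
A client can stay under these per-IP limits by throttling itself. The sketch below uses a sliding window. Whether the server counts requests in a sliding or a fixed window is not stated here, so this is an assumption, and `SlidingWindowLimiter` is a hypothetical class.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Allow at most `max_requests` within any `window_seconds` span."""

    def __init__(self, max_requests, window_seconds, clock=time.monotonic):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the limiter can be tested without sleeping
        self._sent = deque()

    def wait_time(self):
        """Seconds to wait before the next request is allowed (0.0 if allowed now)."""
        now = self.clock()
        # Drop timestamps that have left the window.
        while self._sent and now - self._sent[0] >= self.window_seconds:
            self._sent.popleft()
        if len(self._sent) < self.max_requests:
            return 0.0
        return self.window_seconds - (now - self._sent[0])

    def record(self):
        """Note that a request was sent just now."""
        self._sent.append(self.clock())
```

For the chat endpoint this would be `SlidingWindowLimiter(60, 60)`: call `time.sleep(limiter.wait_time())` before each request and `limiter.record()` after sending it.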

### Rate Limit Headers

```http
X-RateLimit-Limit: 60
X-RateLimit-Remaining: 45
X-RateLimit-Reset: 1640995200
X-RateLimit-Window: 60
```
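
These headers let a client decide how long to pause instead of guessing. The sketch below assumes `X-RateLimit-Reset` is a Unix timestamp in seconds, which the example value suggests but this document does not state. `seconds_until_reset` is a hypothetical helper.

```python
import time


def seconds_until_reset(headers, now=None):
    """Return how long to wait before the next request, or 0.0 if quota remains."""
    remaining = int(headers.get("X-RateLimit-Remaining", "1"))
    if remaining > 0:
        return 0.0
    now = time.time() if now is None else now
    # Assumption: the reset value is epoch seconds, not a relative delay.
    reset_at = float(headers.get("X-RateLimit-Reset", now))
    return max(0.0, reset_at - now)
```

With `requests`, the function can be given `response.headers` directly, since that mapping supports `.get`.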

## SDK and Integration Examples

### Python SDK Example

```python
import requests

class PolicyWiseClient:
    def __init__(self, base_url="http://localhost:5000"):
        self.base_url = base_url

    def ask_question(self, question, max_tokens=500):
        """Ask a policy question"""
        response = requests.post(
            f"{self.base_url}/chat",
            json={
                "message": question,
                "max_tokens": max_tokens,
                "include_sources": True
            }
        )
        return response.json()

    def search_policies(self, query, top_k=5):
        """Search policy documents"""
        response = requests.post(
            f"{self.base_url}/search",
            json={
                "query": query,
                "top_k": top_k,
                "threshold": 0.3
            }
        )
        return response.json()

    def check_health(self):
        """Check system health"""
        response = requests.get(f"{self.base_url}/health")
        return response.json()

# Usage
client = PolicyWiseClient("https://your-space.hf.space")

# Ask a question
result = client.ask_question("What is the PTO policy?")
print(f"Response: {result['response']}")
print(f"Sources: {[s['filename'] for s in result['sources']]}")

# Search documents (use a separate loop variable so `result` above is not overwritten)
search_results = client.search_policies("remote work")
for hit in search_results['results']:
    print(f"Found: {hit['content'][:100]}...")
```

### JavaScript/Node.js Example

```javascript
class PolicyWiseClient {
    constructor(baseUrl = 'http://localhost:5000') {
        this.baseUrl = baseUrl;
    }

    async askQuestion(question, maxTokens = 500) {
        const response = await fetch(`${this.baseUrl}/chat`, {
            method: 'POST',
            headers: {
                'Content-Type': 'application/json',
            },
            body: JSON.stringify({
                message: question,
                max_tokens: maxTokens,
                include_sources: true
            })
        });
        return await response.json();
    }

    async searchPolicies(query, topK = 5) {
        const response = await fetch(`${this.baseUrl}/search`, {
            method: 'POST',
            headers: {
                'Content-Type': 'application/json',
            },
            body: JSON.stringify({
                query: query,
                top_k: topK,
                threshold: 0.3
            })
        });
        return await response.json();
    }

    async checkHealth() {
        const response = await fetch(`${this.baseUrl}/health`);
        return await response.json();
    }
}

// Usage
const client = new PolicyWiseClient('https://your-space.hf.space');

// Ask a question
client.askQuestion('What are the expense policies?')
    .then(result => {
        console.log('Response:', result.response);
        console.log('Sources:', result.sources.map(s => s.filename));
    });
```

### cURL Examples

```bash
# Ask a policy question
curl -X POST https://your-space.hf.space/chat \
  -H "Content-Type: application/json" \
  -d '{
    "message": "What is the remote work policy?",
    "max_tokens": 500,
    "include_sources": true
  }'

# Search policy documents
curl -X POST https://your-space.hf.space/search \
  -H "Content-Type: application/json" \
  -d '{
    "query": "expense reimbursement",
    "top_k": 3,
    "threshold": 0.4
  }'

# Check system health
curl https://your-space.hf.space/health

# Process documents (admin operation)
curl -X POST https://your-space.hf.space/process-documents \
  -H "Content-Type: application/json" \
  -d '{
    "force_reprocess": false,
    "batch_size": 10
  }'
```

## Performance Guidelines

### Optimization Tips

1. **Batch Requests**: Group multiple questions for better throughput
2. **Cache Results**: Cache frequently asked questions
3. **Optimize Queries**: Use specific, focused questions for better results
4. **Monitor Usage**: Track API usage to stay within rate limits
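
The caching tip can be sketched as a thin wrapper around any function that sends a question to `/chat`. This is an illustration only: `make_cached_ask` is a hypothetical helper, the cache never expires entries by age (so answers can go stale after documents are reprocessed), and matching questions by lowercased, whitespace-collapsed text is an assumption about what counts as "the same question".

```python
def make_cached_ask(ask, max_entries=256):
    """Wrap `ask(question) -> dict` so repeated questions skip the network call."""
    cache = {}  # dicts keep insertion order, so the first key is the oldest

    def cached_ask(question):
        # The key is normalized; the original wording is still sent on a miss.
        key = " ".join(question.lower().split())
        if key not in cache:
            if len(cache) >= max_entries:
                cache.pop(next(iter(cache)))  # evict the oldest entry
            cache[key] = ask(question)
        return cache[key]

    return cached_ask
```

Cached responses are returned as the same dictionary object each time, so callers should treat them as read-only.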

### Expected Performance

| Operation | Average Time | Throughput |
|-----------|--------------|------------|
| Chat (with sources) | 2-3 seconds | 20-30 req/min |
| Search only | 200-500ms | 60-80 req/min |
| Health check | <100ms | 200+ req/min |
| Document processing | 15-20 seconds | 1 req/hour |

### Monitoring

Monitor these metrics for optimal performance:

- Response time percentiles (p50, p95, p99)
- Error rates by endpoint
- HuggingFace API response times
- Vector store query performance
- Memory and CPU usage
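
The response time percentiles can be computed from recorded latencies such as the `response_time_ms` values returned by `/chat`. The sketch below uses the nearest-rank definition. Other definitions interpolate between samples and give slightly different numbers, so the choice here is an assumption. `percentile` and `latency_summary` are hypothetical helpers.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile: smallest sample with at least pct% at or below it."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def latency_summary(samples_ms):
    """Return the p50, p95, and p99 latencies for a list of millisecond samples."""
    return {f"p{p}": percentile(samples_ms, p) for p in (50, 95, 99)}
```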

This document covers the endpoints, error formats, and rate limits needed to integrate with the PolicyWise HuggingFace-powered RAG system.