# API Documentation - HuggingFace Edition

## Overview

PolicyWise provides a RESTful API for corporate policy question-answering using HuggingFace free-tier services. All endpoints return JSON responses and support CORS for web integration.

## Base URL

- **Local Development**: `http://localhost:5000`
- **HuggingFace Spaces**: `https://your-username-policywise-rag.hf.space`

## Authentication

No authentication required for public deployment. For production use, consider implementing API key authentication.

## Core Endpoints

### Chat Endpoint (Primary Interface)

**POST /chat**

Ask questions about company policies and receive intelligent responses with automatic source citations.

#### Request

```http
POST /chat
Content-Type: application/json

{
  "message": "What is the remote work policy for new employees?",
  "max_tokens": 500,
  "include_sources": true,
  "guardrails_level": "standard"
}
```

#### Parameters

| Parameter | Type | Required | Default | Description |
|-----------|------|----------|---------|-------------|
| `message` | string | Yes | - | User question about company policies |
| `max_tokens` | integer | No | 500 | Maximum response length (100-1000) |
| `include_sources` | boolean | No | true | Include source document details |
| `guardrails_level` | string | No | "standard" | Safety level: "strict", "standard", "relaxed" |

#### Response

```json
{
  "status": "success",
  "message": "What is the remote work policy for new employees?",
  "response": "New employees are eligible for remote work after completing their initial 90-day onboarding period. During this period, they must work from the office to facilitate mentoring and team integration. After the probationary period, employees can work remotely up to 3 days per week, subject to manager approval and role requirements. [Source: remote_work_policy.md] [Source: employee_handbook.md]",
  "confidence": 0.91,
  "sources": [
    {
      "filename": "remote_work_policy.md",
      "chunk_id": "remote_work_policy_chunk_3",
      "relevance_score": 0.89,
      "content_preview": "New employees must complete a 90-day onboarding period..."
    },
    {
      "filename": "employee_handbook.md",
      "chunk_id": "employee_handbook_chunk_7",
      "relevance_score": 0.76,
      "content_preview": "Remote work eligibility requirements include..."
    }
  ],
  "response_time_ms": 2340,
  "guardrails": {
    "safety_score": 0.98,
    "quality_score": 0.91,
    "citation_count": 2
  },
  "services_used": {
    "embedding_model": "intfloat/multilingual-e5-large",
    "llm_model": "meta-llama/Meta-Llama-3-8B-Instruct",
    "vector_store": "huggingface_dataset"
  }
}
```

#### Error Response

```json
{
  "status": "error",
  "error": "Request too long",
  "message": "Message exceeds maximum character limit of 5000",
  "error_code": "MESSAGE_TOO_LONG"
}
```

### Search Endpoint

**POST /search**

Perform semantic search across policy documents using HuggingFace embeddings.

#### Request

```http
POST /search
Content-Type: application/json

{
  "query": "What is the remote work policy?",
  "top_k": 5,
  "threshold": 0.3,
  "include_metadata": true
}
```

#### Parameters

| Parameter | Type | Required | Default | Description |
|-----------|------|----------|---------|-------------|
| `query` | string | Yes | - | Search query text |
| `top_k` | integer | No | 5 | Number of results to return (1-20) |
| `threshold` | float | No | 0.3 | Minimum similarity threshold (0.0-1.0) |
| `include_metadata` | boolean | No | true | Include document metadata |

#### Response

```json
{
  "status": "success",
  "query": "What is the remote work policy?",
  "results_count": 3,
  "embedding_model": "intfloat/multilingual-e5-large",
  "embedding_dimensions": 1024,
  "results": [
    {
      "chunk_id": "remote_work_policy_chunk_2",
      "content": "Employees may work remotely up to 3 days per week with manager approval. Remote work arrangements must be documented and reviewed quarterly.",
      "similarity_score": 0.87,
      "metadata": {
        "source_file": "remote_work_policy.md",
        "chunk_index": 2,
        "category": "HR",
        "word_count": 95,
        "created_at": "2025-10-25T10:30:00Z"
      }
    },
    {
      "chunk_id": "remote_work_policy_chunk_1",
      "content": "Remote work eligibility requires completion of probationary period and manager approval. New employees must work on-site for first 90 days.",
      "similarity_score": 0.82,
      "metadata": {
        "source_file": "remote_work_policy.md",
        "chunk_index": 1,
        "category": "HR",
        "word_count": 88,
        "created_at": "2025-10-25T10:30:00Z"
      }
    }
  ],
  "search_time_ms": 234,
  "vector_store_size": 98
}
```

### Document Processing

**POST /process-documents**

Process and embed policy documents using HuggingFace services (automatically run on startup).

#### Request

```http
POST /process-documents
Content-Type: application/json

{
  "force_reprocess": false,
  "batch_size": 10
}
```

#### Parameters

| Parameter | Type | Required | Default | Description |
|-----------|------|----------|---------|-------------|
| `force_reprocess` | boolean | No | false | Force reprocessing even if documents exist |
| `batch_size` | integer | No | 10 | Number of documents to process per batch |

#### Response

```json
{
  "status": "success",
  "processing_details": {
    "files_processed": 22,
    "chunks_generated": 98,
    "embeddings_created": 98,
    "processing_time_seconds": 18.7
  },
  "embedding_service": {
    "model": "intfloat/multilingual-e5-large",
    "dimensions": 1024,
    "api_status": "operational"
  },
  "vector_store": {
    "type": "huggingface_dataset",
    "dataset_name": "policy-vectors",
    "total_embeddings": 98,
    "storage_size_mb": 2.4
  },
  "corpus_statistics": {
    "total_words": 10637,
    "average_chunk_size": 95,
    "documents_by_category": {
      "HR": 8,
      "Finance": 4,
      "Security": 3,
      "Operations": 4,
      "EHS": 3
    }
  },
  "quality_metrics": {
    "embedding_generation_success_rate": 1.0,
    "average_embedding_time_ms": 450,
    "metadata_completeness": 1.0
  }
}
```

### Health Check

**GET /health**

Comprehensive system health check including all HuggingFace services.

#### Request

```http
GET /health
```

#### Response

```json
{
  "status": "healthy",
  "timestamp": "2025-10-25T10:30:00Z",
  "services": {
    "hf_embedding_api": "operational",
    "hf_inference_api": "operational",
    "hf_dataset_store": "operational"
  },
  "service_details": {
    "embedding_api": {
      "model": "intfloat/multilingual-e5-large",
      "last_request_ms": 450,
      "requests_today": 247,
      "error_rate": 0.02
    },
    "inference_api": {
      "model": "meta-llama/Meta-Llama-3-8B-Instruct",
      "last_request_ms": 2340,
      "requests_today": 89,
      "error_rate": 0.01
    },
    "dataset_store": {
      "dataset_name": "policy-vectors",
      "total_embeddings": 98,
      "last_updated": "2025-10-25T09:15:00Z",
      "access_status": "operational"
    }
  },
  "configuration": {
    "use_openai_embedding": false,
    "hf_token_configured": true,
    "embedding_model": "intfloat/multilingual-e5-large",
    "embedding_dimensions": 1024,
    "deployment_platform": "huggingface_spaces"
  },
  "statistics": {
    "total_documents": 98,
    "total_queries_processed": 1247,
    "average_response_time_ms": 2140,
    "vector_store_size": 98,
    "uptime_hours": 72.5
  },
  "performance": {
    "memory_usage_mb": 156,
    "cpu_usage_percent": 12,
    "disk_usage_mb": 45,
    "cache_hit_rate": 0.78
  }
}
```

### System Information

**GET /**

Welcome page with system information and capabilities.

#### Response

```json
{
  "message": "Welcome to PolicyWise - HuggingFace Edition",
  "version": "2.0.0-hf",
  "description": "Corporate policy RAG system powered by HuggingFace free-tier services",
  "capabilities": [
    "Policy question answering with citations",
    "Semantic document search",
    "Automatic document processing",
    "Multilingual embedding support",
    "Real-time health monitoring"
  ],
  "services": {
    "embedding": "HuggingFace Inference API (intfloat/multilingual-e5-large)",
    "llm": "HuggingFace Inference API (meta-llama/Meta-Llama-3-8B-Instruct)",
    "vector_store": "HuggingFace Dataset",
    "deployment": "HuggingFace Spaces"
  },
  "api_endpoints": {
    "chat": "POST /chat",
    "search": "POST /search",
    "process": "POST /process-documents",
    "health": "GET /health"
  },
  "documentation": {
    "api_docs": "/docs/api",
    "technical_architecture": "/docs/architecture",
    "deployment_guide": "/docs/deployment"
  },
  "policy_corpus": {
    "total_documents": 22,
    "total_chunks": 98,
    "categories": ["HR", "Finance", "Security", "Operations", "EHS"],
    "last_updated": "2025-10-25T09:15:00Z"
  }
}
```

## Error Handling

### HTTP Status Codes

| Code | Status | Description |
|------|--------|-------------|
| 200 | OK | Request successful |
| 400 | Bad Request | Invalid request parameters |
| 413 | Payload Too Large | Request body too large |
| 429 | Too Many Requests | Rate limit exceeded |
| 500 | Internal Server Error | Server error |
| 503 | Service Unavailable | HuggingFace API unavailable |

### Error Response Format

```json
{
  "status": "error",
  "error": "Error type",
  "message": "Human-readable error description",
  "error_code": "MACHINE_READABLE_CODE",
  "timestamp": "2025-10-25T10:30:00Z",
  "request_id": "req_abc123",
  "suggestions": [
    "Check your request parameters",
    "Retry with smaller payload"
  ]
}
```

### Common Error Codes

| Error Code | Description | Solution |
|------------|-------------|----------|
| `MESSAGE_TOO_LONG` | Message exceeds character limit | Reduce message length |
| `INVALID_PARAMETERS` | Invalid request parameters | Check parameter types and ranges |
| `HF_API_UNAVAILABLE` | HuggingFace API temporarily unavailable | Retry after delay |
| `RATE_LIMIT_EXCEEDED` | Too many requests | Wait before retrying |
| `EMBEDDING_FAILED` | Embedding generation failed | Check input text format |
| `SEARCH_FAILED` | Vector search failed | Verify query parameters |
| `DATASET_UNAVAILABLE` | HuggingFace Dataset inaccessible | Check dataset permissions |

## Rate Limiting

### HuggingFace Free Tier Limits

- **Inference API**: 1000 requests/hour per model
- **Dataset API**: 100 requests/hour
- **Embedding API**: 1000 requests/hour

### Application Rate Limiting

- **Chat API**: 60 requests/minute per IP
- **Search API**: 120 requests/minute per IP
- **Processing API**: 10 requests/hour per IP

### Rate Limit Headers

```http
X-RateLimit-Limit: 60
X-RateLimit-Remaining: 45
X-RateLimit-Reset: 1640995200
X-RateLimit-Window: 60
```

## SDK and Integration Examples

### Python SDK Example

```python
import requests
import json

class PolicyWiseClient:
    def __init__(self, base_url="http://localhost:5000"):
        self.base_url = base_url

    def ask_question(self, question, max_tokens=500):
        """Ask a policy question"""
        response = requests.post(
            f"{self.base_url}/chat",
            json={
                "message": question,
                "max_tokens": max_tokens,
                "include_sources": True
            }
        )
        return response.json()

    def search_policies(self, query, top_k=5):
        """Search policy documents"""
        response = requests.post(
            f"{self.base_url}/search",
            json={
                "query": query,
                "top_k": top_k,
                "threshold": 0.3
            }
        )
        return response.json()

    def check_health(self):
        """Check system health"""
        response = requests.get(f"{self.base_url}/health")
        return response.json()

# Usage
client = PolicyWiseClient("https://your-space.hf.space")

# Ask a question
result = client.ask_question("What is the PTO policy?")
print(f"Response: {result['response']}")
print(f"Sources: {[s['filename'] for s in result['sources']]}")

# Search documents
search_results = client.search_policies("remote work")
for result in search_results['results']:
    print(f"Found: {result['content'][:100]}...")
```

### JavaScript/Node.js Example

```javascript
class PolicyWiseClient {
  constructor(baseUrl = 'http://localhost:5000') {
    this.baseUrl = baseUrl;
  }

  async askQuestion(question, maxTokens = 500) {
    const response = await fetch(`${this.baseUrl}/chat`, {
      method: 'POST',
      headers: {
        'Content-Type': 'application/json',
      },
      body: JSON.stringify({
        message: question,
        max_tokens: maxTokens,
        include_sources: true
      })
    });
    return await response.json();
  }

  async searchPolicies(query, topK = 5) {
    const response = await fetch(`${this.baseUrl}/search`, {
      method: 'POST',
      headers: {
        'Content-Type': 'application/json',
      },
      body: JSON.stringify({
        query: query,
        top_k: topK,
        threshold: 0.3
      })
    });
    return await response.json();
  }

  async checkHealth() {
    const response = await fetch(`${this.baseUrl}/health`);
    return await response.json();
  }
}

// Usage
const client = new PolicyWiseClient('https://your-space.hf.space');

// Ask a question
client.askQuestion('What are the expense policies?')
  .then(result => {
    console.log('Response:', result.response);
    console.log('Sources:', result.sources.map(s => s.filename));
  });
```

### cURL Examples

```bash
# Ask a policy question
curl -X POST https://your-space.hf.space/chat \
  -H "Content-Type: application/json" \
  -d '{
    "message": "What is the remote work policy?",
    "max_tokens": 500,
    "include_sources": true
  }'

# Search policy documents
curl -X POST https://your-space.hf.space/search \
  -H "Content-Type: application/json" \
  -d '{
    "query": "expense reimbursement",
    "top_k": 3,
    "threshold": 0.4
  }'

# Check system health
curl https://your-space.hf.space/health

# Process documents (admin operation)
curl -X POST https://your-space.hf.space/process-documents \
  -H "Content-Type: application/json" \
  -d '{
    "force_reprocess": false,
    "batch_size": 10
  }'
```

## Performance Guidelines

### Optimization Tips

1. **Batch Requests**: Group multiple questions for better throughput
2. **Cache Results**: Cache frequently asked questions
3. **Optimize Queries**: Use specific, focused questions for better results
4. **Monitor Usage**: Track API usage to stay within rate limits

### Expected Performance

| Operation | Average Time | Throughput |
|-----------|--------------|------------|
| Chat (with sources) | 2-3 seconds | 20-30 req/min |
| Search only | 200-500ms | 60-80 req/min |
| Health check | <100ms | 200+ req/min |
| Document processing | 15-20 seconds | 1 req/hour |

### Monitoring

Monitor these metrics for optimal performance:

- Response time percentiles (p50, p95, p99)
- Error rates by endpoint
- HuggingFace API response times
- Vector store query performance
- Memory and CPU usage

This API documentation provides everything needed to integrate with the PolicyWise HuggingFace-powered RAG system!
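
The first two monitoring metrics (response time percentiles and error rates by endpoint) can be tracked on the client side from the JSON bodies the API already returns. The following is a minimal sketch, not part of PolicyWise: `EndpointMetrics` and `percentile` are hypothetical helpers, the nearest-rank method is only one of several percentile definitions, and the sketch assumes responses carry `status` plus `response_time_ms` (chat) or `search_time_ms` (search) as in the examples in this document.

```python
import math
from collections import defaultdict


def percentile(samples, pct):
    """Nearest-rank percentile of a non-empty list of numbers."""
    if not samples:
        raise ValueError("no samples recorded")
    ordered = sorted(samples)
    # Nearest-rank: the smallest value with at least pct% of samples at or below it.
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


class EndpointMetrics:
    """Hypothetical in-memory collector keyed by endpoint path."""

    def __init__(self):
        self.latencies_ms = defaultdict(list)
        self.errors = defaultdict(int)

    def record(self, endpoint, payload):
        """Record one parsed JSON response body for an endpoint."""
        if payload.get("status") == "error":
            self.errors[endpoint] += 1
            return
        # Assumption: /chat reports response_time_ms, /search reports search_time_ms.
        elapsed = payload.get("response_time_ms", payload.get("search_time_ms"))
        if elapsed is not None:
            self.latencies_ms[endpoint].append(elapsed)

    def summary(self, endpoint):
        """Return p50/p95/p99 latency and the error rate for an endpoint."""
        samples = self.latencies_ms[endpoint]
        total = len(samples) + self.errors[endpoint]
        result = {"error_rate": self.errors[endpoint] / total if total else 0.0}
        for pct in (50, 95, 99):
            result[f"p{pct}"] = percentile(samples, pct) if samples else None
        return result


# Usage with hand-written payloads shaped like the documented responses
metrics = EndpointMetrics()
metrics.record("/chat", {"status": "success", "response_time_ms": 2340})
metrics.record("/chat", {"status": "error", "error_code": "RATE_LIMIT_EXCEEDED"})
print(metrics.summary("/chat"))
```

Recording parsed response bodies rather than wall-clock timings keeps the sketch independent of any HTTP library; measuring elapsed time around the request itself would additionally capture network latency.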