| # API Documentation - HuggingFace Edition | |
| ## Overview | |
| PolicyWise provides a RESTful API for corporate policy question-answering using HuggingFace free-tier services. All endpoints return JSON responses and support CORS for web integration. | |
| ## Base URL | |
| - **Local Development**: `http://localhost:5000` | |
| - **HuggingFace Spaces**: `https://your-username-policywise-rag.hf.space` | |
| ## Authentication | |
| No authentication is required for the public deployment. For production use, consider adding API key authentication. | |
| ## Core Endpoints | |
| ### Chat Endpoint (Primary Interface) | |
| **POST /chat** | |
| Ask questions about company policies and receive intelligent responses with automatic source citations. | |
| #### Request | |
| ```http | |
| POST /chat | |
| Content-Type: application/json | |
| { | |
| "message": "What is the remote work policy for new employees?", | |
| "max_tokens": 500, | |
| "include_sources": true, | |
| "guardrails_level": "standard" | |
| } | |
| ``` | |
| #### Parameters | |
| | Parameter | Type | Required | Default | Description | | |
| |-----------|------|----------|---------|-------------| | |
| | `message` | string | Yes | - | User question about company policies | | |
| | `max_tokens` | integer | No | 500 | Maximum response length (100-1000) | | |
| | `include_sources` | boolean | No | true | Include source document details | | |
| | `guardrails_level` | string | No | "standard" | Safety level: "strict", "standard", "relaxed" | | |
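The ranges in the table above can be checked on the client before a request is sent. The helper below is a minimal sketch: the name `build_chat_payload` is hypothetical and not part of PolicyWise, and the server remains the authority on what it accepts.

```python
ALLOWED_GUARDRAILS = {"strict", "standard", "relaxed"}


def build_chat_payload(message, max_tokens=500, include_sources=True,
                       guardrails_level="standard"):
    """Build a /chat request body, enforcing the documented types and ranges."""
    if not isinstance(message, str) or not message.strip():
        raise ValueError("message must be a non-empty string")
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(max_tokens, bool) or not isinstance(max_tokens, int):
        raise ValueError("max_tokens must be an integer")
    if not 100 <= max_tokens <= 1000:
        raise ValueError("max_tokens must be between 100 and 1000")
    if guardrails_level not in ALLOWED_GUARDRAILS:
        raise ValueError(
            f"guardrails_level must be one of {sorted(ALLOWED_GUARDRAILS)}"
        )
    return {
        "message": message,
        "max_tokens": max_tokens,
        "include_sources": bool(include_sources),
        "guardrails_level": guardrails_level,
    }
```

The returned dictionary can be passed as the JSON body of `POST /chat`.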
| #### Response | |
| ```json | |
| { | |
| "status": "success", | |
| "message": "What is the remote work policy for new employees?", | |
| "response": "New employees are eligible for remote work after completing their initial 90-day onboarding period. During this period, they must work from the office to facilitate mentoring and team integration. After the probationary period, employees can work remotely up to 3 days per week, subject to manager approval and role requirements. [Source: remote_work_policy.md] [Source: employee_handbook.md]", | |
| "confidence": 0.91, | |
| "sources": [ | |
| { | |
| "filename": "remote_work_policy.md", | |
| "chunk_id": "remote_work_policy_chunk_3", | |
| "relevance_score": 0.89, | |
| "content_preview": "New employees must complete a 90-day onboarding period..." | |
| }, | |
| { | |
| "filename": "employee_handbook.md", | |
| "chunk_id": "employee_handbook_chunk_7", | |
| "relevance_score": 0.76, | |
| "content_preview": "Remote work eligibility requirements include..." | |
| } | |
| ], | |
| "response_time_ms": 2340, | |
| "guardrails": { | |
| "safety_score": 0.98, | |
| "quality_score": 0.91, | |
| "citation_count": 2 | |
| }, | |
| "services_used": { | |
| "embedding_model": "intfloat/multilingual-e5-large", | |
| "llm_model": "meta-llama/Meta-Llama-3-8B-Instruct", | |
| "vector_store": "huggingface_dataset" | |
| } | |
| } | |
| ``` | |
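The sample response embeds citations in the text as `[Source: filename]` markers and also lists the documents under `sources`. The sketch below cross-checks the two. It assumes the marker format is exactly as shown in the sample; the function names are illustrative and not part of the API.

```python
import re

# Matches markers such as "[Source: remote_work_policy.md]".
CITATION_PATTERN = re.compile(r"\[Source:\s*([^\]]+?)\s*\]")


def extract_citations(response_text):
    """Return cited filenames in order of first appearance, without duplicates."""
    seen = []
    for name in CITATION_PATTERN.findall(response_text):
        if name not in seen:
            seen.append(name)
    return seen


def uncited_sources(chat_result):
    """List filenames in `sources` that the response text never cites."""
    cited = set(extract_citations(chat_result.get("response", "")))
    return [
        source["filename"]
        for source in chat_result.get("sources", [])
        if source["filename"] not in cited
    ]
```

An empty list from `uncited_sources` means every returned source is referenced in the answer text.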
| #### Error Response | |
| ```json | |
| { | |
| "status": "error", | |
| "error": "Request too long", | |
| "message": "Message exceeds maximum character limit of 5000", | |
| "error_code": "MESSAGE_TOO_LONG" | |
| } | |
| ``` | |
| ### Search Endpoint | |
| **POST /search** | |
| Perform semantic search across policy documents using HuggingFace embeddings. | |
| #### Request | |
| ```http | |
| POST /search | |
| Content-Type: application/json | |
| { | |
| "query": "What is the remote work policy?", | |
| "top_k": 5, | |
| "threshold": 0.3, | |
| "include_metadata": true | |
| } | |
| ``` | |
| #### Parameters | |
| | Parameter | Type | Required | Default | Description | | |
| |-----------|------|----------|---------|-------------| | |
| | `query` | string | Yes | - | Search query text | | |
| | `top_k` | integer | No | 5 | Number of results to return (1-20) | | |
| | `threshold` | float | No | 0.3 | Minimum similarity threshold (0.0-1.0) | | |
| | `include_metadata` | boolean | No | true | Include document metadata | | |
| #### Response | |
| ```json | |
| { | |
| "status": "success", | |
| "query": "What is the remote work policy?", | |
| "results_count": 2, | |
| "embedding_model": "intfloat/multilingual-e5-large", | |
| "embedding_dimensions": 1024, | |
| "results": [ | |
| { | |
| "chunk_id": "remote_work_policy_chunk_2", | |
| "content": "Employees may work remotely up to 3 days per week with manager approval. Remote work arrangements must be documented and reviewed quarterly.", | |
| "similarity_score": 0.87, | |
| "metadata": { | |
| "source_file": "remote_work_policy.md", | |
| "chunk_index": 2, | |
| "category": "HR", | |
| "word_count": 95, | |
| "created_at": "2025-10-25T10:30:00Z" | |
| } | |
| }, | |
| { | |
| "chunk_id": "remote_work_policy_chunk_1", | |
| "content": "Remote work eligibility requires completion of probationary period and manager approval. New employees must work on-site for first 90 days.", | |
| "similarity_score": 0.82, | |
| "metadata": { | |
| "source_file": "remote_work_policy.md", | |
| "chunk_index": 1, | |
| "category": "HR", | |
| "word_count": 88, | |
| "created_at": "2025-10-25T10:30:00Z" | |
| } | |
| } | |
| ], | |
| "search_time_ms": 234, | |
| "vector_store_size": 98 | |
| } | |
| ``` | |
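Several chunks from the same file often appear in one result set, as in the sample above. A client may want to keep only the strongest hit per document. The following is a minimal sketch that relies on the `similarity_score` and `metadata.source_file` fields shown in the sample; the function name is hypothetical.

```python
def best_hit_per_file(search_result, min_score=0.0):
    """Map each source file to its highest-scoring chunk at or above min_score."""
    best = {}
    for hit in search_result.get("results", []):
        score = hit["similarity_score"]
        if score < min_score:
            continue
        # Metadata is absent when include_metadata is false.
        source = hit.get("metadata", {}).get("source_file", "unknown")
        if source not in best or score > best[source]["similarity_score"]:
            best[source] = hit
    return best
```

Note that when `include_metadata` is `false`, all hits fall under the `"unknown"` key, so this grouping is only useful with metadata enabled.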
| ### Document Processing | |
| **POST /process-documents** | |
| Process and embed policy documents using HuggingFace services. This step also runs automatically on startup. | |
| #### Request | |
| ```http | |
| POST /process-documents | |
| Content-Type: application/json | |
| { | |
| "force_reprocess": false, | |
| "batch_size": 10 | |
| } | |
| ``` | |
| #### Parameters | |
| | Parameter | Type | Required | Default | Description | | |
| |-----------|------|----------|---------|-------------| | |
| | `force_reprocess` | boolean | No | false | Force reprocessing even if documents exist | | |
| | `batch_size` | integer | No | 10 | Number of documents to process per batch | | |
| #### Response | |
| ```json | |
| { | |
| "status": "success", | |
| "processing_details": { | |
| "files_processed": 22, | |
| "chunks_generated": 98, | |
| "embeddings_created": 98, | |
| "processing_time_seconds": 18.7 | |
| }, | |
| "embedding_service": { | |
| "model": "intfloat/multilingual-e5-large", | |
| "dimensions": 1024, | |
| "api_status": "operational" | |
| }, | |
| "vector_store": { | |
| "type": "huggingface_dataset", | |
| "dataset_name": "policy-vectors", | |
| "total_embeddings": 98, | |
| "storage_size_mb": 2.4 | |
| }, | |
| "corpus_statistics": { | |
| "total_words": 10637, | |
| "average_chunk_size": 95, | |
| "documents_by_category": { | |
| "HR": 8, | |
| "Finance": 4, | |
| "Security": 3, | |
| "Operations": 4, | |
| "EHS": 3 | |
| } | |
| }, | |
| "quality_metrics": { | |
| "embedding_generation_success_rate": 1.0, | |
| "average_embedding_time_ms": 450, | |
| "metadata_completeness": 1.0 | |
| } | |
| } | |
| ``` | |
| ### Health Check | |
| **GET /health** | |
| Comprehensive system health check including all HuggingFace services. | |
| #### Request | |
| ```http | |
| GET /health | |
| ``` | |
| #### Response | |
| ```json | |
| { | |
| "status": "healthy", | |
| "timestamp": "2025-10-25T10:30:00Z", | |
| "services": { | |
| "hf_embedding_api": "operational", | |
| "hf_inference_api": "operational", | |
| "hf_dataset_store": "operational" | |
| }, | |
| "service_details": { | |
| "embedding_api": { | |
| "model": "intfloat/multilingual-e5-large", | |
| "last_request_ms": 450, | |
| "requests_today": 247, | |
| "error_rate": 0.02 | |
| }, | |
| "inference_api": { | |
| "model": "meta-llama/Meta-Llama-3-8B-Instruct", | |
| "last_request_ms": 2340, | |
| "requests_today": 89, | |
| "error_rate": 0.01 | |
| }, | |
| "dataset_store": { | |
| "dataset_name": "policy-vectors", | |
| "total_embeddings": 98, | |
| "last_updated": "2025-10-25T09:15:00Z", | |
| "access_status": "operational" | |
| } | |
| }, | |
| "configuration": { | |
| "use_openai_embedding": false, | |
| "hf_token_configured": true, | |
| "embedding_model": "intfloat/multilingual-e5-large", | |
| "embedding_dimensions": 1024, | |
| "deployment_platform": "huggingface_spaces" | |
| }, | |
| "statistics": { | |
| "total_documents": 22, | |
| "total_queries_processed": 1247, | |
| "average_response_time_ms": 2140, | |
| "vector_store_size": 98, | |
| "uptime_hours": 72.5 | |
| }, | |
| "performance": { | |
| "memory_usage_mb": 156, | |
| "cpu_usage_percent": 12, | |
| "disk_usage_mb": 45, | |
| "cache_hit_rate": 0.78 | |
| } | |
| } | |
| ``` | |
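A deployment script or monitor can reduce the health payload to a single ready or not-ready decision. The sketch below assumes the field names from the sample response. The 5% error-rate threshold and both function names are illustrative choices, not values defined by PolicyWise.

```python
def degraded_services(health):
    """Return the names of services whose state is not 'operational'."""
    return sorted(
        name
        for name, state in health.get("services", {}).items()
        if state != "operational"
    )


def is_ready(health, max_error_rate=0.05):
    """True when status is healthy, all services are up, and error rates are low."""
    if health.get("status") != "healthy" or degraded_services(health):
        return False
    details = health.get("service_details", {})
    # Entries without an error_rate (e.g. dataset_store) count as 0.0.
    return all(
        detail.get("error_rate", 0.0) <= max_error_rate
        for detail in details.values()
    )
```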
| ### System Information | |
| **GET /** | |
| Welcome page with system information and capabilities. | |
| #### Response | |
| ```json | |
| { | |
| "message": "Welcome to PolicyWise - HuggingFace Edition", | |
| "version": "2.0.0-hf", | |
| "description": "Corporate policy RAG system powered by HuggingFace free-tier services", | |
| "capabilities": [ | |
| "Policy question answering with citations", | |
| "Semantic document search", | |
| "Automatic document processing", | |
| "Multilingual embedding support", | |
| "Real-time health monitoring" | |
| ], | |
| "services": { | |
| "embedding": "HuggingFace Inference API (intfloat/multilingual-e5-large)", | |
| "llm": "HuggingFace Inference API (meta-llama/Meta-Llama-3-8B-Instruct)", | |
| "vector_store": "HuggingFace Dataset", | |
| "deployment": "HuggingFace Spaces" | |
| }, | |
| "api_endpoints": { | |
| "chat": "POST /chat", | |
| "search": "POST /search", | |
| "process": "POST /process-documents", | |
| "health": "GET /health" | |
| }, | |
| "documentation": { | |
| "api_docs": "/docs/api", | |
| "technical_architecture": "/docs/architecture", | |
| "deployment_guide": "/docs/deployment" | |
| }, | |
| "policy_corpus": { | |
| "total_documents": 22, | |
| "total_chunks": 98, | |
| "categories": ["HR", "Finance", "Security", "Operations", "EHS"], | |
| "last_updated": "2025-10-25T09:15:00Z" | |
| } | |
| } | |
| ``` | |
| ## Error Handling | |
| ### HTTP Status Codes | |
| | Code | Status | Description | | |
| |------|--------|-------------| | |
| | 200 | OK | Request successful | | |
| | 400 | Bad Request | Invalid request parameters | | |
| | 413 | Payload Too Large | Request body too large | | |
| | 429 | Too Many Requests | Rate limit exceeded | | |
| | 500 | Internal Server Error | Server error | | |
| | 503 | Service Unavailable | HuggingFace API unavailable | | |
| ### Error Response Format | |
| ```json | |
| { | |
| "status": "error", | |
| "error": "Error type", | |
| "message": "Human-readable error description", | |
| "error_code": "MACHINE_READABLE_CODE", | |
| "timestamp": "2025-10-25T10:30:00Z", | |
| "request_id": "req_abc123", | |
| "suggestions": [ | |
| "Check your request parameters", | |
| "Retry with smaller payload" | |
| ] | |
| } | |
| ``` | |
| ### Common Error Codes | |
| | Error Code | Description | Solution | | |
| |------------|-------------|----------| | |
| | `MESSAGE_TOO_LONG` | Message exceeds character limit | Reduce message length | | |
| | `INVALID_PARAMETERS` | Invalid request parameters | Check parameter types and ranges | | |
| | `HF_API_UNAVAILABLE` | HuggingFace API temporarily unavailable | Retry after delay | | |
| | `RATE_LIMIT_EXCEEDED` | Too many requests | Wait before retrying | | |
| | `EMBEDDING_FAILED` | Embedding generation failed | Check input text format | | |
| | `SEARCH_FAILED` | Vector search failed | Verify query parameters | | |
| | `DATASET_UNAVAILABLE` | HuggingFace Dataset inaccessible | Check dataset permissions | | |
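The table above suggests retrying only for `HF_API_UNAVAILABLE` and `RATE_LIMIT_EXCEEDED`; the other codes point to a problem with the request or the deployment. One way to encode that on the client is sketched below. `PolicyWiseError`, `call_with_retry`, and the exponential backoff schedule are assumptions made for illustration, not part of the API.

```python
import time

# Codes whose documented solution is to wait and retry.
RETRYABLE_CODES = {"HF_API_UNAVAILABLE", "RATE_LIMIT_EXCEEDED"}


class PolicyWiseError(Exception):
    """Wraps an error body in the documented error response format."""

    def __init__(self, body):
        super().__init__(body.get("message", "unknown error"))
        self.error_code = body.get("error_code")
        self.retryable = self.error_code in RETRYABLE_CODES


def call_with_retry(send, max_attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call send() until it succeeds or a non-retryable error occurs.

    `send` is any zero-argument callable that returns the parsed JSON body.
    """
    for attempt in range(1, max_attempts + 1):
        body = send()
        if body.get("status") != "error":
            return body
        error = PolicyWiseError(body)
        if not error.retryable or attempt == max_attempts:
            raise error
        # Exponential backoff: base_delay, 2x, 4x, ...
        sleep(base_delay * 2 ** (attempt - 1))
```

For example, `call_with_retry(lambda: client.ask_question("What is the PTO policy?"))` would wrap a client method that returns the parsed response.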
| ## Rate Limiting | |
| ### HuggingFace Free Tier Limits | |
| - **Inference API**: 1000 requests/hour per model | |
| - **Dataset API**: 100 requests/hour | |
| - **Embedding API**: 1000 requests/hour | |
| ### Application Rate Limiting | |
| - **Chat API**: 60 requests/minute per IP | |
| - **Search API**: 120 requests/minute per IP | |
| - **Processing API**: 10 requests/hour per IP | |
| ### Rate Limit Headers | |
| ```http | |
| X-RateLimit-Limit: 60 | |
| X-RateLimit-Remaining: 45 | |
| X-RateLimit-Reset: 1640995200 | |
| X-RateLimit-Window: 60 | |
| ``` | |
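A client can use these headers to decide how long to pause before its next request. The sketch below assumes `X-RateLimit-Reset` is a Unix timestamp in seconds, which the sample value suggests but the documentation does not state. The function name is hypothetical.

```python
import time


def seconds_until_reset(headers, now=None):
    """Return how long to wait before the next request, in seconds.

    Returns 0.0 while requests remain in the current window.
    """
    remaining = int(headers.get("X-RateLimit-Remaining", 1))
    if remaining > 0:
        return 0.0
    now = time.time() if now is None else now
    # Assumption: the reset value is a Unix timestamp in seconds.
    reset_at = float(headers.get("X-RateLimit-Reset", now))
    return max(0.0, reset_at - now)
```

The function accepts any mapping with a `get` method, so `response.headers` from the `requests` library can be passed directly.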
| ## SDK and Integration Examples | |
| ### Python SDK Example | |
```python
import requests


class PolicyWiseClient:
    """Minimal client for the PolicyWise API."""

    def __init__(self, base_url="http://localhost:5000", timeout=30):
        self.base_url = base_url.rstrip("/")
        # requests has no default timeout, so set one explicitly (seconds).
        self.timeout = timeout

    def ask_question(self, question, max_tokens=500):
        """Ask a policy question."""
        response = requests.post(
            f"{self.base_url}/chat",
            json={
                "message": question,
                "max_tokens": max_tokens,
                "include_sources": True,
            },
            timeout=self.timeout,
        )
        return response.json()

    def search_policies(self, query, top_k=5):
        """Search policy documents."""
        response = requests.post(
            f"{self.base_url}/search",
            json={
                "query": query,
                "top_k": top_k,
                "threshold": 0.3,
            },
            timeout=self.timeout,
        )
        return response.json()

    def check_health(self):
        """Check system health."""
        response = requests.get(f"{self.base_url}/health", timeout=self.timeout)
        return response.json()


# Usage
client = PolicyWiseClient("https://your-space.hf.space")

# Ask a question
result = client.ask_question("What is the PTO policy?")
print(f"Response: {result['response']}")
print(f"Sources: {[s['filename'] for s in result['sources']]}")

# Search documents
search_results = client.search_policies("remote work")
for hit in search_results["results"]:
    print(f"Found: {hit['content'][:100]}...")
```
| ### JavaScript/Node.js Example | |
```javascript
class PolicyWiseClient {
  constructor(baseUrl = 'http://localhost:5000') {
    this.baseUrl = baseUrl;
  }

  async askQuestion(question, maxTokens = 500) {
    const response = await fetch(`${this.baseUrl}/chat`, {
      method: 'POST',
      headers: {
        'Content-Type': 'application/json',
      },
      body: JSON.stringify({
        message: question,
        max_tokens: maxTokens,
        include_sources: true
      })
    });
    return await response.json();
  }

  async searchPolicies(query, topK = 5) {
    const response = await fetch(`${this.baseUrl}/search`, {
      method: 'POST',
      headers: {
        'Content-Type': 'application/json',
      },
      body: JSON.stringify({
        query: query,
        top_k: topK,
        threshold: 0.3
      })
    });
    return await response.json();
  }

  async checkHealth() {
    const response = await fetch(`${this.baseUrl}/health`);
    return await response.json();
  }
}

// Usage
const client = new PolicyWiseClient('https://your-space.hf.space');

// Ask a question
client.askQuestion('What are the expense policies?')
  .then(result => {
    console.log('Response:', result.response);
    console.log('Sources:', result.sources.map(s => s.filename));
  });
```
| ### cURL Examples | |
| ```bash | |
| # Ask a policy question | |
| curl -X POST https://your-space.hf.space/chat \ | |
| -H "Content-Type: application/json" \ | |
| -d '{ | |
| "message": "What is the remote work policy?", | |
| "max_tokens": 500, | |
| "include_sources": true | |
| }' | |
| # Search policy documents | |
| curl -X POST https://your-space.hf.space/search \ | |
| -H "Content-Type: application/json" \ | |
| -d '{ | |
| "query": "expense reimbursement", | |
| "top_k": 3, | |
| "threshold": 0.4 | |
| }' | |
| # Check system health | |
| curl https://your-space.hf.space/health | |
| # Process documents (admin operation) | |
| curl -X POST https://your-space.hf.space/process-documents \ | |
| -H "Content-Type: application/json" \ | |
| -d '{ | |
| "force_reprocess": false, | |
| "batch_size": 10 | |
| }' | |
| ``` | |
| ## Performance Guidelines | |
| ### Optimization Tips | |
| 1. **Batch Requests**: Group multiple questions for better throughput | |
| 2. **Cache Results**: Cache frequently asked questions | |
| 3. **Optimize Queries**: Use specific, focused questions for better results | |
| 4. **Monitor Usage**: Track API usage to stay within rate limits | |
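The caching tip can be applied entirely on the client. The class below is a minimal in-memory sketch: `AnswerCache`, its five-minute default lifetime, and the whitespace and case normalization of questions are all assumptions chosen for illustration.

```python
import time


class AnswerCache:
    """Small in-memory cache keyed on the normalized question text."""

    def __init__(self, ttl_seconds=300, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self._entries = {}  # key -> (stored_at, answer)

    @staticmethod
    def _key(question):
        # Treat questions that differ only in case or spacing as identical.
        return " ".join(question.lower().split())

    def get_or_fetch(self, question, fetch):
        """Return a cached answer, or call fetch(question) and store the result."""
        key = self._key(question)
        entry = self._entries.get(key)
        if entry is not None and self.clock() - entry[0] < self.ttl:
            return entry[1]
        answer = fetch(question)
        self._entries[key] = (self.clock(), answer)
        return answer
```

Passing a client method as `fetch`, for example `cache.get_or_fetch(question, client.ask_question)`, keeps repeated questions from counting against the rate limits. Cached answers can go stale after documents are reprocessed, so a short lifetime is the safer choice.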
| ### Expected Performance | |
| | Operation | Average Time | Throughput | | |
| |-----------|--------------|------------| | |
| | Chat (with sources) | 2-3 seconds | 20-30 req/min | | |
| | Search only | 200-500ms | 60-80 req/min | | |
| | Health check | <100ms | 200+ req/min | | |
| | Document processing | 15-20 seconds | 1 req/hour | | |
| ### Monitoring | |
| Monitor these metrics for optimal performance: | |
| - Response time percentiles (p50, p95, p99) | |
| - Error rates by endpoint | |
| - HuggingFace API response times | |
| - Vector store query performance | |
| - Memory and CPU usage | |
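Response time percentiles can be computed from collected timings with the standard library. The sketch below assumes the samples are client-side measurements or the `response_time_ms` values from chat responses; the function name is hypothetical.

```python
import statistics


def latency_percentiles(samples_ms):
    """Return p50, p95 and p99 for a list of response times in milliseconds."""
    if len(samples_ms) < 2:
        raise ValueError("need at least two samples")
    # n=100 yields 99 cut points; index k-1 is the k-th percentile.
    cuts = statistics.quantiles(samples_ms, n=100)
    return {"p50": cuts[49], "p95": cuts[94], "p99": cuts[98]}
```

With only a handful of samples the upper percentiles are interpolated and not very meaningful, so it helps to collect a larger window before acting on p95 or p99.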
| This API documentation provides everything needed to integrate with the PolicyWise HuggingFace-powered RAG system! | |