| # Getting Started with Unified AI Services | |
| This guide will walk you through setting up and running the complete Unified AI Services system. | |
| ## Quick Overview | |
| The Unified AI Services system consists of: | |
| - **NER Service** (Port 8500): Named Entity Recognition with relationship extraction | |
| - **OCR Service** (Port 8400): Optical Character Recognition with document processing | |
| - **RAG Service** (Port 8401): Retrieval-Augmented Generation with vector search | |
| - **Unified App** (Port 8000): Main application coordinating all services | |
| ## Quick Start (Recommended) | |
| ### Step 1: Automated Setup | |
| ```bash | |
| # Run the automated setup wizard | |
| python setup.py | |
| ``` | |
| This will: | |
| - Check your Python environment | |
| - Create necessary directories | |
| - Help configure your .env file | |
| - Install dependencies | |
| - Validate configuration | |
| - Create startup scripts | |
| ### Step 2: Start the System | |
| ```bash | |
| # Start all services automatically | |
| python app.py | |
| ``` | |
| Or use the generated scripts: | |
| - **Windows**: Double-click `start_services.bat` | |
| - **Linux/Mac**: Run `./start_services.sh` | |
| ### Step 3: Test the System | |
| ```bash | |
| # Run comprehensive tests | |
| python test_unified.py | |
| ``` | |
| Or use the generated scripts: | |
| - **Windows**: Double-click `run_tests.bat` | |
| - **Linux/Mac**: Run `./run_tests.sh` | |
| ### Step 4: Try the Demo | |
| ```bash | |
| # Run interactive demo | |
| python demo.py | |
| ``` | |
| ## File Structure | |
| After setup, your directory should look like this: | |
| ``` | |
| unified-ai-services/ | |
| ├── app.py # Main unified application | |
| ├── configs.py # Configuration management | |
| ├── setup.py # Automated setup script | |
| ├── manage_services.py # Service management tool | |
| ├── test_unified.py # Comprehensive test suite | |
| ├── demo.py # Interactive demo | |
| ├── requirements.txt # Python dependencies | |
| ├── .env # Environment configuration | |
| ├── README.md # Documentation | |
| ├── GETTING_STARTED.md # This file | |
| ├── services/ # Service implementations | |
| │   ├── ner_service.py # Named Entity Recognition | |
| │   ├── ocr_service.py # Optical Character Recognition | |
| │   └── rag_service.py # Retrieval-Augmented Generation | |
| ├── exports/ # Generated export files | |
| ├── logs/ # Application logs | |
| └── temp/ # Temporary files | |
| ``` | |
| ## Manual Setup (Alternative) | |
| If you prefer manual setup: | |
| ### Prerequisites | |
| - Python 3.8 or higher | |
| - PostgreSQL with vector extension | |
| - Azure OpenAI account | |
| - Azure Document Intelligence account | |
| - DeepSeek API account | |
| ### 1. Install Dependencies | |
| ```bash | |
| pip install -r requirements.txt | |
| ``` | |
| ### 2. Configure Environment | |
| Create a `.env` file with your configuration: | |
| ```bash | |
| # Server Configuration | |
| HOST=0.0.0.0 | |
| MAIN_PORT=8000 | |
| NER_PORT=8500 | |
| OCR_PORT=8400 | |
| RAG_PORT=8401 | |
| # PostgreSQL Configuration | |
| POSTGRES_HOST=your-postgres-server.com | |
| POSTGRES_PORT=5432 | |
| POSTGRES_USER=your-username | |
| POSTGRES_PASSWORD=your-password | |
| POSTGRES_DATABASE=postgres | |
| # Azure OpenAI Configuration | |
| AZURE_OPENAI_ENDPOINT=https://your-openai.openai.azure.com/ | |
| AZURE_OPENAI_API_KEY=your-api-key | |
| EMBEDDING_MODEL=text-embedding-3-large | |
| # DeepSeek Configuration (for advanced NER) | |
| DEEPSEEK_ENDPOINT=https://your-deepseek-endpoint/ | |
| DEEPSEEK_API_KEY=your-deepseek-key | |
| DEEPSEEK_MODEL=DeepSeek-R1-0528 | |
| # Azure Document Intelligence Configuration | |
| AZURE_DOCUMENT_INTELLIGENCE_ENDPOINT=https://your-di.cognitiveservices.azure.com/ | |
| AZURE_DOCUMENT_INTELLIGENCE_KEY=your-di-key | |
| # Azure Storage Configuration | |
| AZURE_STORAGE_ACCOUNT_URL=https://yourstorage.blob.core.windows.net/ | |
| AZURE_BLOB_SAS_TOKEN=your-sas-token | |
| BLOB_CONTAINER=historylog | |
| ``` | |
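Before starting the services, it can help to confirm that the `.env` file actually contains the values the sample above lists. The following is a minimal sketch, not the project's `configs.py`: the `parse_env` and `missing_keys` helpers and the choice of which keys count as required are assumptions made for illustration, and the placeholder check (any value containing `your-`) is only a heuristic based on the sample values.

```python
from pathlib import Path
from typing import Dict, List

# Key names copied from the sample .env above; treating these as
# "required" is an assumption, not something configs.py is known to enforce.
REQUIRED_KEYS = [
    "POSTGRES_HOST",
    "POSTGRES_USER",
    "POSTGRES_PASSWORD",
    "AZURE_OPENAI_ENDPOINT",
    "AZURE_OPENAI_API_KEY",
    "AZURE_DOCUMENT_INTELLIGENCE_ENDPOINT",
    "AZURE_DOCUMENT_INTELLIGENCE_KEY",
]


def parse_env(text: str) -> Dict[str, str]:
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def missing_keys(values: Dict[str, str]) -> List[str]:
    """Return required keys that are absent, empty, or still placeholders."""
    return [
        key
        for key in REQUIRED_KEYS
        if not values.get(key) or "your-" in values[key]
    ]


if __name__ == "__main__":
    env_path = Path(".env")
    env = parse_env(env_path.read_text()) if env_path.exists() else {}
    problems = missing_keys(env)
    print("OK" if not problems else f"Missing or placeholder values: {problems}")
```

Run it from the project root next to `.env`. An empty list means every listed key has a non-placeholder value; it does not prove the credentials are valid.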
| ### 3. Create Directory Structure | |
| ```bash | |
| mkdir -p services exports logs temp tests data | |
| ``` | |
| ### 4. Place Service Files | |
| Ensure your service files are in the correct locations: | |
| - `services/ner_service.py` | |
| - `services/ocr_service.py` | |
| - `services/rag_service.py` | |
| ## Service Management | |
| ### Using the Service Manager | |
| The `manage_services.py` script provides easy service management: | |
| ```bash | |
| # Start individual services | |
| python manage_services.py start ner | |
| python manage_services.py start ocr | |
| python manage_services.py start rag | |
| python manage_services.py start unified | |
| # Start all services | |
| python manage_services.py start all | |
| # Check status | |
| python manage_services.py status | |
| # Test services | |
| python manage_services.py test ner | |
| python manage_services.py test all | |
| # Stop services | |
| python manage_services.py stop all | |
| # Restart services | |
| python manage_services.py restart all | |
| # List available services | |
| python manage_services.py list | |
| ``` | |
| ### Direct Service Management | |
| Start services individually for development: | |
| ```bash | |
| # Terminal 1: Start OCR service | |
| cd services && python ocr_service.py | |
| # Terminal 2: Start RAG service | |
| cd services && python rag_service.py | |
| # Terminal 3: Start NER service | |
| cd services && python ner_service.py | |
| # Terminal 4: Start unified application | |
| python app.py | |
| ``` | |
| ## Testing and Validation | |
| ### Comprehensive System Tests | |
| ```bash | |
| # Run all tests | |
| python test_unified.py | |
| # Test output will show: | |
| # Unified App Health Check | |
| # Individual Service Health | |
| # Unified Analysis (Text) | |
| # Unified Analysis (URL) | |
| # Combined Search | |
| # Service Proxies | |
| # File Upload (Unified) | |
| # Service Discovery | |
| # System Performance | |
| # Error Handling | |
| ``` | |
| ### Individual Service Tests | |
| ```bash | |
| # Test NER service specifically | |
| python test_ner.py | |
| # Test RAG service specifically | |
| python test_rag.py | |
| ``` | |
| ### Quick Health Checks | |
| ```bash | |
| # Check unified system | |
| curl http://localhost:8000/health | |
| # Check individual services | |
| curl http://localhost:8500/health # NER | |
| curl http://localhost:8400/health # OCR | |
| curl http://localhost:8401/health # RAG | |
| ``` | |
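The same checks can be scripted so that all four services are polled in one go. This is a sketch using only the standard library; the `SERVICES` mapping, `health_url`, `is_healthy`, and `check_all` are illustrative names, and the ports are the defaults from the Quick Overview, so adjust them if your `.env` overrides them. It only tests for an HTTP 200 on `/health` and makes no assumption about the response body.

```python
import urllib.error
import urllib.request
from typing import Dict

# Default ports from the Quick Overview (assumed unchanged in .env).
SERVICES = {"unified": 8000, "ner": 8500, "ocr": 8400, "rag": 8401}


def health_url(port: int, host: str = "localhost") -> str:
    """Build the /health URL for a service listening on the given port."""
    return f"http://{host}:{port}/health"


def is_healthy(url: str, timeout: float = 3.0) -> bool:
    """True if the endpoint answers with HTTP 200 within the timeout."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, or a non-2xx status.
        return False


def check_all() -> Dict[str, bool]:
    """Poll every service once and map its name to up/down."""
    return {name: is_healthy(health_url(port)) for name, port in SERVICES.items()}


if __name__ == "__main__":
    for name, ok in check_all().items():
        print(f"{name:8s} {'up' if ok else 'DOWN'}")
```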
| ## Interactive Demo | |
| The demo script showcases all system capabilities: | |
| ```bash | |
| python demo.py | |
| ``` | |
| The demo includes: | |
| - Multi-language text analysis (Thai + English) | |
| - Entity and relationship extraction | |
| - RAG document indexing | |
| - Combined search functionality | |
| - Service proxy testing | |
| - Real-time performance monitoring | |
| ## API Usage | |
| ### API Documentation | |
| Once running, access interactive documentation: | |
| - **Unified API**: http://localhost:8000/docs | |
| - **NER Service**: http://localhost:8500/docs | |
| - **OCR Service**: http://localhost:8400/docs | |
| - **RAG Service**: http://localhost:8401/docs | |
| ### Key Endpoints | |
| #### Unified Analysis | |
| ```text | |
| # Analyze text with automatic RAG indexing | |
| POST /analyze/unified | |
| { | |
| "text": "Your text here...", | |
| "extract_relationships": true, | |
| "enable_rag_indexing": true, | |
| "rag_title": "Document Title" | |
| } | |
| ``` | |
| #### Combined Search | |
| ```text | |
| # Search with automatic NER enhancement | |
| POST /search/combined | |
| { | |
| "query": "search terms", | |
| "include_ner_analysis": true, | |
| "limit": 10 | |
| } | |
| ``` | |
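The two request bodies above can be sent from Python with the standard library alone. The sketch below assumes the Unified App is reachable at `http://localhost:8000` (the port from the Quick Overview); `build_request` and `post_json` are hypothetical helper names, and because the response schema is not described in this guide, the script simply prints whatever JSON comes back.

```python
import json
import urllib.error
import urllib.request
from typing import Any, Dict

BASE_URL = "http://localhost:8000"  # Unified App; assumed default port

# Payloads mirror the request bodies shown in this section.
ANALYZE_PAYLOAD = {
    "text": "Your text here...",
    "extract_relationships": True,
    "enable_rag_indexing": True,
    "rag_title": "Document Title",
}
SEARCH_PAYLOAD = {"query": "search terms", "include_ner_analysis": True, "limit": 10}


def build_request(path: str, payload: Dict[str, Any]) -> urllib.request.Request:
    """Build a JSON POST request for the unified API without sending it."""
    return urllib.request.Request(
        url=f"{BASE_URL}{path}",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def post_json(path: str, payload: Dict[str, Any], timeout: float = 30.0) -> Any:
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(build_request(path, payload), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


if __name__ == "__main__":
    calls = (("/analyze/unified", ANALYZE_PAYLOAD), ("/search/combined", SEARCH_PAYLOAD))
    for path, payload in calls:
        try:
            print(path, "->", post_json(path, payload))
        except (urllib.error.URLError, OSError, ValueError) as exc:
            # Service not running, HTTP error status, or a non-JSON body.
            print(f"{path} failed: {exc}")
```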
| #### Service Proxies | |
| ```text | |
| # Direct access to individual services | |
| POST /ner/analyze/text # NER analysis | |
| POST /ocr/upload # OCR processing | |
| POST /rag/search # RAG search | |
| GET /rag/documents # List documents | |
| ``` | |
| ## Health Monitoring | |
| ### System Status | |
| ```bash | |
| # Get overall system health (Unified App, default port 8000) | |
| curl http://localhost:8000/health | |
| # Get detailed status | |
| curl http://localhost:8000/status | |
| # Discover available services | |
| curl http://localhost:8000/services | |
| ``` | |
| ### Service Monitoring | |
| Each service provides health information: | |
| - Response times | |
| - Uptime | |
| - Resource usage | |
| - Configuration status | |
| - Error rates | |
| ## Troubleshooting | |
| ### Common Issues | |
| #### 1. Services Won't Start | |
| **Check ports:** | |
| ```bash | |
| netstat -an | grep :8000 | |
| netstat -an | grep :8500 | |
| netstat -an | grep :8400 | |
| netstat -an | grep :8401 | |
| ``` | |
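The `netstat | grep` pipeline above is specific to Unix-like shells. As a cross-platform alternative, the following sketch tries a TCP connection to each default port; `port_in_use` and the `PORTS` mapping are illustrative names, not part of `manage_services.py`. A port reported as in use before you start the services suggests a conflict with another process; after startup it simply means the service is listening.

```python
import socket

# Default ports from the Quick Overview (assumed unchanged in .env).
PORTS = {"unified": 8000, "ner": 8500, "ocr": 8400, "rag": 8401}


def port_in_use(port: int, host: str = "127.0.0.1", timeout: float = 0.5) -> bool:
    """True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success and an errno value otherwise.
        return sock.connect_ex((host, port)) == 0


if __name__ == "__main__":
    for name, port in PORTS.items():
        state = "in use" if port_in_use(port) else "free"
        print(f"{name:8s} port {port}: {state}")
```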
| **Verify configuration:** | |
| ```bash | |
| python configs.py | |
| ``` | |
| **Check dependencies:** | |
| ```bash | |
| pip list | grep fastapi | |
| pip list | grep asyncpg | |
| ``` | |
| #### 2. Database Connection Issues | |
| **Test connection:** | |
| ```bash | |
| # Use your actual connection details | |
| python -c " | |
| import asyncio | |
| import asyncpg | |
| async def test(): | |
|     conn = await asyncpg.connect('postgresql://user:pass@host:5432/db') | |
|     print('Connected successfully') | |
|     await conn.close() | |
| asyncio.run(test()) | |
| " | |
| ``` | |
| **Common fixes:** | |
| - Verify PostgreSQL is running | |
| - Check firewall rules | |
| - Confirm SSL requirements | |
| - Validate credentials | |
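One common cause of credential failures is a password containing characters such as `@` or `/`, which break a hand-written connection URL. The sketch below builds the URL from the `POSTGRES_*` variables in the sample `.env` and percent-encodes the user name and password. `build_dsn` is a hypothetical helper, and the fallbacks for port and database are assumptions taken from the sample values.

```python
import os
from typing import Mapping
from urllib.parse import quote


def build_dsn(env: Mapping[str, str] = os.environ) -> str:
    """Assemble a postgresql:// URL from POSTGRES_* variables."""
    # Percent-encode credentials so characters like '@', ':' and '/'
    # cannot be mistaken for URL delimiters.
    user = quote(env["POSTGRES_USER"], safe="")
    password = quote(env["POSTGRES_PASSWORD"], safe="")
    host = env["POSTGRES_HOST"]
    port = env.get("POSTGRES_PORT", "5432")          # assumed default
    database = env.get("POSTGRES_DATABASE", "postgres")  # assumed default
    return f"postgresql://{user}:{password}@{host}:{port}/{database}"
```

The resulting string can be passed to `asyncpg.connect(...)` in place of the hard-coded URL in the connection test. Avoid printing it in logs, since it contains the password.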
| #### 3. Azure Service Issues | |
| **Check API keys:** | |
| ```bash | |
| # Test Azure OpenAI | |
| curl -X POST -H "api-key: YOUR_KEY" -H "Content-Type: application/json" -d '{"input": "ping"}' "YOUR_ENDPOINT/openai/deployments/YOUR_MODEL/embeddings?api-version=2024-02-01" | |
| # Test Document Intelligence | |
| curl -H "Ocp-Apim-Subscription-Key: YOUR_KEY" "YOUR_ENDPOINT/formrecognizer/info?api-version=2023-07-31" | |
| ``` | |
| **Common fixes:** | |
| - Verify API keys are correct | |
| - Check service regions | |
| - Confirm quota limits | |
| - Validate endpoint URLs | |
| #### 4. Performance Issues | |
| **Monitor resources:** | |
| ```bash | |
| # Check system resources | |
| top | |
| htop | |
| python manage_services.py status | |
| ``` | |
| **Common solutions:** | |
| - Increase system memory | |
| - Optimize database queries | |
| - Reduce concurrent requests | |
| - Check network latency | |
| ### Getting Help | |
| 1. **Check logs**: Services log to console | |
| 2. **Run health checks**: Use `/health` endpoints | |
| 3. **Validate configuration**: Run `python configs.py` | |
| 4. **Test individual services**: Use service manager | |
| 5. **Check database connectivity**: Test connection strings | |
| 6. **Verify Azure services**: Check API endpoints | |
| ### Debug Mode | |
| Enable debug mode for detailed logging: | |
| ```bash | |
| # In .env file | |
| DEBUG=True | |
| # Or set environment variable | |
| export DEBUG=true | |
| python app.py | |
| ``` | |
| ## Production Deployment | |
| ### Security Considerations | |
| 1. **Environment Variables**: Use secure secret management | |
| 2. **HTTPS**: Enable SSL/TLS in production | |
| 3. **Authentication**: Implement API authentication | |
| 4. **Rate Limiting**: Add request rate limiting | |
| 5. **Input Validation**: Validate all input data | |
| ### Performance Optimization | |
| 1. **Caching**: Implement Redis caching | |
| 2. **Load Balancing**: Use reverse proxy (nginx) | |
| 3. **Database**: Optimize PostgreSQL configuration | |
| 4. **Monitoring**: Set up application monitoring | |
| 5. **Scaling**: Consider horizontal scaling | |
| ### Deployment Options | |
| 1. **Docker**: Containerize services | |
| 2. **Cloud**: Deploy to Azure/AWS/GCP | |
| 3. **Kubernetes**: Orchestrate with k8s | |
| 4. **CI/CD**: Automate deployments | |
| ## Next Steps | |
| After successful setup: | |
| 1. **Explore the API**: Use the interactive documentation | |
| 2. **Try the demo**: Run `python demo.py` | |
| 3. **Run tests**: Execute `python test_unified.py` | |
| 4. **Monitor system**: Check health endpoints | |
| 5. **Customize**: Modify services for your needs | |
| 6. **Scale**: Consider production deployment | |
| ## Success Indicators | |
| You know the system is working when: | |
| - All health checks pass | |
| - Tests complete successfully | |
| - Demo runs without errors | |
| - API documentation is accessible | |
| - Services respond to requests | |
| - Database connections work | |
| - Azure integrations function | |
| - File uploads process correctly | |
| - Search returns results | |
| - Export files generate properly | |
| **Congratulations! Your Unified AI Services system is ready to use!** | |