Spaces:
Build error
A newer version of the Gradio SDK is available: 6.13.0
Multi-Agent Platform API Deployment Guide
Overview
This guide provides comprehensive instructions for deploying the Multi-Agent Platform API server using Docker, Docker Compose, and Kubernetes. The platform includes real-time monitoring, load balancing, and production-ready configurations.
Prerequisites
- Docker and Docker Compose installed
- Kubernetes cluster (for K8s deployment)
- Redis server (included in Docker Compose)
- SSL certificates (for production HTTPS)
Quick Start with Docker Compose
1. Clone and Setup
git clone <repository-url>
cd AI-Agent
2. Environment Configuration
Create a .env file:
REDIS_URL=redis://redis:6379
LOG_LEVEL=INFO
API_TOKEN=your_secure_token_here
3. Build and Run
# Build the platform
docker-compose build
# Start all services
docker-compose up -d
# Check status
docker-compose ps
4. Verify Deployment
# Check API health
curl http://localhost:8080/health
# Access dashboard
open http://localhost:8080/dashboard
# Check Prometheus metrics
curl http://localhost:9090
# Access Grafana (admin/admin)
open http://localhost:3000
Kubernetes Deployment
1. Build and Push Docker Image
# Build image
docker build -t multiagent-platform:latest .
# Tag for registry
docker tag multiagent-platform:latest your-registry/multiagent-platform:latest
# Push to registry
docker push your-registry/multiagent-platform:latest
2. Deploy to Kubernetes
# Create namespace
kubectl create namespace multi-agent-platform
# Apply configuration
kubectl apply -f multiagent-platform-k8s.yaml
# Check deployment status
kubectl get pods -n multi-agent-platform
kubectl get services -n multi-agent-platform
3. Configure Ingress
Update the ingress configuration with your domain:
# In multiagent-platform-k8s.yaml
spec:
rules:
- host: your-domain.com # Update this
http:
paths:
- path: /
pathType: Prefix
backend:
service:
name: platform-api-service
port:
number: 80
4. SSL Certificate Setup
For production, configure SSL certificates:
# Install cert-manager
kubectl apply -f https://github.com/cert-manager/cert-manager/releases/download/v1.12.0/cert-manager.yaml
# Create ClusterIssuer for Let's Encrypt
kubectl apply -f - <<EOF
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
name: letsencrypt-prod
spec:
acme:
server: https://acme-v02.api.letsencrypt.org/directory
email: your-email@example.com
privateKeySecretRef:
name: letsencrypt-prod
solvers:
- http01:
ingress:
class: nginx
EOF
API Usage Examples
1. Register an Agent
curl -X POST "http://localhost:8080/api/v2/agents/register" \
-H "Authorization: Bearer valid_token" \
-H "Content-Type: application/json" \
-d '{
"name": "MyAgent",
"version": "1.0.0",
"capabilities": ["REASONING", "COLLABORATION"],
"tags": ["custom", "production"],
"resources": {"cpu_cores": 1.0, "memory_mb": 512},
"description": "A custom reasoning agent"
}'
2. Submit a Task
curl -X POST "http://localhost:8080/api/v2/tasks/submit" \
-H "Authorization: Bearer valid_token" \
-H "Content-Type: application/json" \
-d '{
"task_type": "analysis",
"priority": 5,
"payload": {"data": "sample_data"},
"required_capabilities": ["REASONING"],
"deadline": "2024-01-01T12:00:00Z"
}'
3. List Agents
curl -X GET "http://localhost:8080/api/v2/agents" \
-H "Authorization: Bearer valid_token"
4. WebSocket Connection
const ws = new WebSocket('ws://localhost:8080/ws/dashboard');
ws.onopen = () => {
console.log('Connected to dashboard');
};
ws.onmessage = (event) => {
const data = JSON.parse(event.data);
console.log('Received update:', data);
};
Monitoring and Observability
1. Prometheus Metrics
The platform exposes metrics at /metrics:
curl http://localhost:8080/metrics
Key metrics include:
agent_registrations_totaltasks_submitted_totaltask_execution_duration_secondsresource_utilization_percent
2. Grafana Dashboards
Access Grafana at http://localhost:3000 (admin/admin) and import dashboards for:
- Agent performance metrics
- Task execution statistics
- Resource utilization
- Error rates and latency
3. Alerting
Configure alerts in Prometheus for:
- High error rates
- Low agent availability
- Resource utilization thresholds
- Task latency issues
Performance Testing
1. Run Load Tests
# Install dependencies
pip install aiohttp
# Run performance tests
python performance_test.py
2. Custom Load Testing
from performance_test import PerformanceTester
import asyncio
async def custom_test():
tester = PerformanceTester("http://localhost:8080", "valid_token")
# Test with custom parameters
results = await tester.run_load_test(
agent_count=50,
task_count=500,
concurrent_tasks=25
)
print(f"Results: {results}")
asyncio.run(custom_test())
Security Considerations
1. Authentication
- Replace the simple token authentication with proper JWT or OAuth2
- Implement rate limiting per user/IP
- Use HTTPS in production
2. Network Security
- Configure firewall rules
- Use private networks for internal communication
- Implement proper CORS policies
3. Data Protection
- Encrypt sensitive data at rest
- Use secure communication channels
- Implement proper logging and audit trails
Troubleshooting
Common Issues
Redis Connection Failed
# Check Redis status docker-compose logs redis # Restart Redis docker-compose restart redisAPI Not Responding
# Check API logs docker-compose logs platform-api # Check health endpoint curl http://localhost:8080/healthWebSocket Connection Issues
# Check nginx configuration docker-compose logs nginx # Verify WebSocket proxy settingsHigh Resource Usage
# Check resource utilization docker stats # Scale up if needed docker-compose up -d --scale platform-api=5
Log Analysis
# View all logs
docker-compose logs -f
# View specific service logs
docker-compose logs -f platform-api
# Search for errors
docker-compose logs | grep ERROR
Scaling
Horizontal Scaling
# Scale API instances
docker-compose up -d --scale platform-api=5
# Or in Kubernetes
kubectl scale deployment platform-api --replicas=5 -n multi-agent-platform
Vertical Scaling
Update resource limits in docker-compose.yml:
resources:
limits:
cpus: '4.0'
memory: 4G
reservations:
cpus: '2.0'
memory: 2G
Backup and Recovery
1. Database Backup
# Backup Redis data
docker exec redis redis-cli BGSAVE
# Copy backup file
docker cp redis:/data/dump.rdb ./backup/
2. Configuration Backup
# Backup configuration files
tar -czf config-backup.tar.gz \
docker-compose.yml \
multiagent-platform-k8s.yaml \
prometheus.yml \
alerts.yml \
nginx.conf
Production Checklist
- SSL certificates configured
- Proper authentication implemented
- Monitoring and alerting set up
- Backup strategy in place
- Load balancing configured
- Rate limiting enabled
- Security headers configured
- Log aggregation set up
- Performance testing completed
- Disaster recovery plan ready
Support
For issues and questions:
- Check the troubleshooting section
- Review logs for error messages
- Consult the API documentation
- Open an issue in the repository
License
This deployment guide is part of the Multi-Agent Platform project. Please refer to the main project license for usage terms.