Deployment Guide
Prerequisites
- Docker 20.10+ and Docker Compose 2.0+
- Python 3.9+ (for local deployment)
- 4GB RAM minimum (8GB recommended)
- 10GB disk space for models and cache
Quick Deploy with Docker
1. Prepare Environment
# Clone repository
git clone https://github.com/yourusername/writing-studio.git
cd writing-studio
# Copy and configure environment
cp .env.example .env
nano .env # Edit configuration
2. Deploy Application
# Start application
docker-compose up -d
# View logs
docker-compose logs -f
# Check status
docker-compose ps
3. Verify Deployment
# Check application health
curl http://localhost:7860
# Check metrics endpoint
curl http://localhost:8000/metrics
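For scripted checks (CI, a deploy hook), the two curl calls can be wrapped in a retry loop, since the containers may need a while to load models before they answer. A minimal sketch, assuming the UI listens on port 7860 and the metrics server on port 8000 as above; `http_status` and `wait_until_healthy` are illustrative names, not part of the project.

```python
import time
import urllib.error
import urllib.request


def http_status(url, timeout=5.0):
    """Return the HTTP status code for url, or None if the request fails."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as exc:
        # The server answered, but with an error status (4xx/5xx).
        return exc.code
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, timeout, and similar.
        return None


def wait_until_healthy(urls, attempts=10, delay=3.0, probe=http_status, sleep=time.sleep):
    """Poll every URL until all of them return 200 or the attempts run out."""
    pending = list(urls)
    for attempt in range(attempts):
        # Keep only the URLs that are still not answering with 200.
        pending = [url for url in pending if probe(url) != 200]
        if not pending:
            return True
        if attempt < attempts - 1:
            sleep(delay)
    return False
```

Calling `wait_until_healthy(["http://localhost:7860", "http://localhost:8000/metrics"])` returns `True` once both endpoints respond with 200, and `False` after the last attempt otherwise.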
Production Deployment
Environment Configuration
# .env for production
ENVIRONMENT=production
DEBUG=false
LOG_LEVEL=INFO
# Security
SECRET_KEY=<generate-with-openssl-rand-base64-32>
ALLOWED_ORIGINS=https://yourdomain.com
ENABLE_AUTH=true
RATE_LIMIT_PER_MINUTE=30
# Performance
ENABLE_CACHE=true
CACHE_MAX_SIZE=1000
SERVER_WORKERS=4
# Monitoring
ENABLE_METRICS=true
LOG_FORMAT=json
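Before starting the stack, it can help to generate the secret and check the file for leftover placeholders. A minimal sketch, assuming the variable names shown above; the helper functions and the specific checks are illustrative and not shipped with the project.

```python
import base64
import secrets


def generate_secret_key(num_bytes=32):
    """Return random bytes, base64-encoded (comparable to `openssl rand -base64 32`)."""
    return base64.b64encode(secrets.token_bytes(num_bytes)).decode("ascii")


def parse_env(text):
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    settings = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        # Split on the first "=" only, so base64 padding in values survives.
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip()
    return settings


def production_problems(settings):
    """Return a list of findings; an empty list means none of these checks failed."""
    problems = []
    if settings.get("ENVIRONMENT") != "production":
        problems.append("ENVIRONMENT should be 'production'")
    if settings.get("DEBUG", "false").lower() != "false":
        problems.append("DEBUG should be false")
    secret = settings.get("SECRET_KEY", "")
    if not secret or secret.startswith("<"):
        problems.append("SECRET_KEY is missing or still a placeholder")
    if not settings.get("ALLOWED_ORIGINS", "").startswith("https://"):
        problems.append("ALLOWED_ORIGINS should be an https:// origin")
    if settings.get("ENABLE_AUTH", "").lower() != "true":
        problems.append("ENABLE_AUTH should be true")
    return problems
```

A typical use is to read `.env` with `parse_env(open(".env").read())` and refuse to deploy while `production_problems(...)` returns anything.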
Reverse Proxy Setup (Nginx)
# /etc/nginx/sites-available/writing-studio
upstream writing_studio {
server 127.0.0.1:7860;
}
server {
listen 80;
server_name writing.yourdomain.com;
# Redirect to HTTPS
return 301 https://$server_name$request_uri;
}
server {
listen 443 ssl http2;
server_name writing.yourdomain.com;
# SSL configuration
ssl_certificate /etc/letsencrypt/live/writing.yourdomain.com/fullchain.pem;
ssl_certificate_key /etc/letsencrypt/live/writing.yourdomain.com/privkey.pem;
# Security headers
add_header X-Frame-Options "SAMEORIGIN" always;
add_header X-Content-Type-Options "nosniff" always;
add_header X-XSS-Protection "1; mode=block" always;
# Proxy settings
location / {
proxy_pass http://writing_studio;
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_set_header X-Forwarded-Proto $scheme;
# WebSocket support
proxy_http_version 1.1;
proxy_set_header Upgrade $http_upgrade;
proxy_set_header Connection "upgrade";
# Timeouts
proxy_connect_timeout 60s;
proxy_send_timeout 300s;
proxy_read_timeout 300s;
}
# Metrics endpoint (restrict access)
location /metrics {
deny all;
}
}
SSL/TLS Setup
# Using Let's Encrypt
sudo apt-get install certbot python3-certbot-nginx
sudo certbot --nginx -d writing.yourdomain.com
Cloud Deployments
AWS ECS Deployment
- Build and Push Image
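# Authenticate Docker to ECR
aws ecr get-login-password --region <region> | \
docker login --username AWS --password-stdin <account-id>.dkr.ecr.<region>.amazonaws.com
# Tag for ECR
docker tag writing-studio:latest \
<account-id>.dkr.ecr.<region>.amazonaws.com/writing-studio:latest
# Push to ECR
docker push <account-id>.dkr.ecr.<region>.amazonaws.com/writing-studio:latest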
- ECS Task Definition (task-definition.json)
{
"family": "writing-studio",
"networkMode": "awsvpc",
"containerDefinitions": [
{
"name": "writing-studio",
"image": "<account-id>.dkr.ecr.<region>.amazonaws.com/writing-studio:latest",
"portMappings": [
{"containerPort": 7860, "protocol": "tcp"},
{"containerPort": 8000, "protocol": "tcp"}
],
"environment": [
{"name": "ENVIRONMENT", "value": "production"},
{"name": "LOG_LEVEL", "value": "INFO"}
],
"secrets": [
{
"name": "SECRET_KEY",
"valueFrom": "arn:aws:secretsmanager:<region>:<account-id>:secret:writing-studio/secret-key"
}
],
"logConfiguration": {
"logDriver": "awslogs",
"options": {
"awslogs-group": "/ecs/writing-studio",
"awslogs-region": "<region>",
"awslogs-stream-prefix": "ecs"
}
},
"healthCheck": {
"command": ["CMD-SHELL", "curl -f http://localhost:7860 || exit 1"],
"interval": 30,
"timeout": 5,
"retries": 3
}
}
],
"requiresCompatibilities": ["FARGATE"],
"cpu": "1024",
"memory": "4096"
}
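The task definition above contains angle-bracket placeholders such as `<account-id>` and `<region>` that must be filled in before the file is registered. A minimal sketch of a pre-flight check that lists any that remain; the function names and the placeholder pattern are assumptions for illustration.

```python
import json
import re

# Matches lowercase angle-bracket placeholders such as <account-id> or <region>.
PLACEHOLDER = re.compile(r"<[a-z][a-z0-9-]*>")


def unresolved_placeholders(node, path="$"):
    """Yield (json_path, value) for every string that still contains a placeholder."""
    if isinstance(node, dict):
        for key, value in node.items():
            yield from unresolved_placeholders(value, f"{path}.{key}")
    elif isinstance(node, list):
        for index, value in enumerate(node):
            yield from unresolved_placeholders(value, f"{path}[{index}]")
    elif isinstance(node, str) and PLACEHOLDER.search(node):
        yield path, node


def check_task_definition(text):
    """Parse the task definition JSON and list values that still need filling in."""
    return list(unresolved_placeholders(json.loads(text)))
```

Running `check_task_definition(open("task-definition.json").read())` should return an empty list before the file is passed to `aws ecs register-task-definition --cli-input-json file://task-definition.json`.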
Google Cloud Run
# Build for Cloud Run
gcloud builds submit --tag gcr.io/PROJECT-ID/writing-studio
# Deploy
gcloud run deploy writing-studio \
--image gcr.io/PROJECT-ID/writing-studio \
--platform managed \
--region us-central1 \
--allow-unauthenticated \
--memory 4Gi \
--cpu 2 \
--port 7860 \
--set-env-vars ENVIRONMENT=production
Kubernetes Deployment
deployment.yaml:
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: writing-studio
spec:
  replicas: 3
  selector:
    matchLabels:
      app: writing-studio
  template:
    metadata:
      labels:
        app: writing-studio
    spec:
      containers:
        - name: writing-studio
          image: writing-studio:latest
          ports:
            - containerPort: 7860
              name: http
            - containerPort: 8000
              name: metrics
          env:
            - name: ENVIRONMENT
              value: "production"
            - name: SECRET_KEY
              valueFrom:
                secretKeyRef:
                  name: writing-studio-secrets
                  key: secret-key
          resources:
            requests:
              memory: "2Gi"
              cpu: "1000m"
            limits:
              memory: "4Gi"
              cpu: "2000m"
          livenessProbe:
            httpGet:
              path: /
              port: 7860
            initialDelaySeconds: 60
            periodSeconds: 30
          readinessProbe:
            httpGet:
              path: /
              port: 7860
            initialDelaySeconds: 30
            periodSeconds: 10
---
apiVersion: v1
kind: Service
metadata:
  name: writing-studio
spec:
  selector:
    app: writing-studio
  ports:
    - name: http
      port: 80
      targetPort: 7860
    - name: metrics
      port: 8000
      targetPort: 8000
  type: LoadBalancer
```
Monitoring Setup
Prometheus Configuration
```yaml
# prometheus.yml
global:
  scrape_interval: 15s
scrape_configs:
  - job_name: 'writing-studio'
    static_configs:
      - targets: ['writing-studio:8000']
    metrics_path: '/metrics'
```
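To sanity-check what Prometheus will scrape, the text served at `/metrics` can be fetched and parsed by hand. A minimal sketch of a parser for the plain-text exposition format, assuming samples carry no trailing timestamps; `parse_metrics` is an illustrative helper, and the metric names the application actually exports are not specified here.

```python
def parse_metrics(text):
    """Parse Prometheus text exposition into {sample_name_with_labels: float}."""
    samples = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        # Assumption: no timestamp follows the value, so the value is the last field.
        name, _, value = line.rpartition(" ")
        try:
            samples[name.strip()] = float(value)
        except ValueError:
            continue  # ignore lines this simple parser does not understand
    return samples
```

Feeding it the body of `curl http://localhost:8000/metrics` gives a dictionary that can be inspected for the series the dashboard expects.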
Grafana Dashboard
Import the provided dashboard:
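# Import from grafana.com or use the provided JSON.
# The API expects the dashboard wrapped as {"dashboard": {...}, "overwrite": true}.
# Replace the default admin:admin credentials before using this on a shared host.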
curl -X POST http://admin:admin@localhost:3000/api/dashboards/db \
-H "Content-Type: application/json" \
-d @configs/grafana-dashboard.json
Backup and Recovery
Data Backup
# Backup logs
tar -czf logs-backup-$(date +%Y%m%d).tar.gz logs/
# Backup models
tar -czf models-backup-$(date +%Y%m%d).tar.gz models/
# Backup configuration
cp .env .env.backup
Database Backup (if using)
# PostgreSQL
pg_dump writing_studio > backup-$(date +%Y%m%d).sql
# Restore
psql writing_studio < backup-20240101.sql
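The tar commands above can also be scripted so that every backup is read back once after it is written. A minimal sketch using only the standard library, assuming the `logs/` and `models/` directories from the Data Backup step; `backup_directory` and `verify_backup` are illustrative names.

```python
import tarfile
from datetime import date
from pathlib import Path


def backup_directory(source, dest_dir, today=None):
    """Create <name>-backup-YYYYMMDD.tar.gz from source and return its path."""
    source = Path(source)
    dest_dir = Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    stamp = (today or date.today()).strftime("%Y%m%d")
    archive = dest_dir / f"{source.name}-backup-{stamp}.tar.gz"
    with tarfile.open(archive, "w:gz") as tar:
        # arcname keeps paths inside the archive relative to the directory name.
        tar.add(source, arcname=source.name)
    return archive


def verify_backup(archive):
    """Return the member names; reading them confirms the archive opens cleanly."""
    with tarfile.open(archive, "r:gz") as tar:
        return tar.getnames()
```

For example, `verify_backup(backup_directory("logs", "backups"))` writes `backups/logs-backup-<date>.tar.gz` and lists its contents.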
Scaling Strategies
Horizontal Scaling
# Docker Compose
docker-compose up -d --scale app=3
# Kubernetes
kubectl scale deployment writing-studio --replicas=5
Load Balancing
upstream writing_studio {
least_conn;
server app1:7860 weight=3;
server app2:7860 weight=3;
server app3:7860 weight=2;
}
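With `least_conn`, nginx sends each request to the server with the fewest active connections, taking the weights into account. The sketch below is a simplified model of that rule (fewest connections per unit of weight, ties going to the first server listed); real nginx breaks ties differently and also releases connections as requests finish, so treat it as an illustration of the 3:3:2 split rather than a description of nginx internals.

```python
def pick_server(active, weights):
    """Pick the server with the fewest active connections relative to its weight."""
    # min() returns the first minimal item, so ties go to the first server listed.
    return min(weights, key=lambda name: active.get(name, 0) / weights[name])


def simulate(weights, requests):
    """Assign long-lived requests one by one and return connections per server."""
    active = {name: 0 for name in weights}
    for _ in range(requests):
        active[pick_server(active, weights)] += 1
    return active
```

With the weights above, `simulate({"app1": 3, "app2": 3, "app3": 2}, 8)` spreads eight long-lived requests in proportion to the weights.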
Troubleshooting
Common Issues
Container won't start:
# Check logs
docker-compose logs app
# Check resources
docker stats
# Verify environment
docker-compose config
High memory usage:
# Reduce cache size
CACHE_MAX_SIZE=50
# Use smaller model
DEFAULT_MODEL=distilgpt2
# Limit workers
SERVER_WORKERS=2
Slow response times:
# Enable caching
ENABLE_CACHE=true
# Increase workers
SERVER_WORKERS=8
# Use faster model
DEFAULT_MODEL=distilgpt2
Security Checklist
- Change default SECRET_KEY
- Enable HTTPS/TLS
- Configure CORS properly
- Enable rate limiting
- Set up authentication
- Restrict metrics endpoint
- Apply security updates regularly
- Monitor logs for suspicious activity
- Use non-root Docker user
- Implement network policies
Maintenance
Regular Tasks
# Update dependencies
pip install --upgrade -r requirements.txt
# Clean old logs
find logs/ -name "*.log" -mtime +30 -delete
# Clear model files not modified in 90 days (files only; check none are still in use)
find models/ -type f -mtime +90 -delete
# Restart service
docker-compose restart app
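Because `find ... -delete` is irreversible, a dry run that only lists candidates is a useful first step. A minimal sketch, assuming the `logs/` and `models/` layout above; `stale_files` and `prune` are illustrative helpers, and the age test (older than N times 24 hours) only approximates the day rounding that `find -mtime +N` applies.

```python
import time
from pathlib import Path


def stale_files(root, max_age_days, pattern="*", now=None):
    """Return regular files under root older than max_age_days, by modification time."""
    cutoff = (now if now is not None else time.time()) - max_age_days * 86400
    return sorted(
        path for path in Path(root).rglob(pattern)
        if path.is_file() and path.stat().st_mtime < cutoff
    )


def prune(root, max_age_days, pattern="*", dry_run=True, now=None):
    """List stale files and delete them only when dry_run is False."""
    targets = stale_files(root, max_age_days, pattern, now)
    if not dry_run:
        for path in targets:
            path.unlink()
    return targets
```

`prune("logs", 30, "*.log")` only reports what would be removed; passing `dry_run=False` performs the deletion.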
Updates
# Pull latest changes
git pull origin main
# Rebuild image
docker-compose build
# Rebuild and recreate only the app container (expect a brief interruption while it restarts)
docker-compose up -d --no-deps --build app