Spaces:

Nomearod
/

agentbench

Running

File size: 5,449 Bytes

a152b95

# Deploying FastAPI Applications

FastAPI applications are deployed using ASGI servers. This guide covers production deployment with Uvicorn, Gunicorn, Docker, and related infrastructure considerations.

## Uvicorn (Single Process)

Uvicorn is the recommended ASGI server for FastAPI. For development:

```bash
uvicorn main:app --reload --host 127.0.0.1 --port 8000
```

For production with a single process:

```bash
uvicorn main:app --host 0.0.0.0 --port 8000 --workers 1 --log-level info
```

Key Uvicorn configuration options:

| Flag              | Default       | Description                              |
|-------------------|---------------|------------------------------------------|
| `--host`          | `127.0.0.1`   | Bind address                             |
| `--port`          | `8000`         | Bind port                                |
| `--workers`       | `1`            | Number of worker processes               |
| `--loop`          | `auto`         | Event loop (auto, asyncio, uvloop)       |
| `--http`          | `auto`         | HTTP protocol (auto, h11, httptools)     |
| `--ws`            | `auto`         | WebSocket protocol (auto, websockets, wsproto) |
| `--log-level`     | `info`         | Logging level (critical, error, warning, info, debug, trace) |
| `--access-log`    | `True`         | Enable/disable access log                |
| `--ws-max-size`   | `16777216`     | Max WebSocket message size (16 MB)       |
| `--timeout-keep-alive` | `5`       | Keep-alive timeout in seconds            |

Using `uvloop` and `httptools` (installed automatically on Linux) provides a 20-30% performance boost over the pure-Python `asyncio` and `h11` alternatives.

## Gunicorn with Uvicorn Workers

For production deployments requiring multiple worker processes, use Gunicorn as the process manager with Uvicorn workers:

```bash
gunicorn main:app \
    --workers 4 \
    --worker-class uvicorn.workers.UvicornWorker \
    --bind 0.0.0.0:8000 \
    --timeout 120 \
    --graceful-timeout 30 \
    --keep-alive 5 \
    --max-requests 1000 \
    --max-requests-jitter 50 \
    --access-logfile -
```

The recommended number of workers is `(2 * CPU_CORES) + 1`. For a server with 4 CPU cores, that is 9 workers. The `--max-requests 1000` flag restarts each worker after handling 1,000 requests, preventing memory leaks. The `--max-requests-jitter 50` adds a random offset (0-50) so workers do not all restart simultaneously.

The `--timeout 120` flag sets the maximum time (in seconds) a worker can take to handle a request before being killed and restarted. The default is 30 seconds. The `--graceful-timeout 30` gives workers 30 seconds to finish current requests during shutdown.

## Docker Deployment

A production-ready Dockerfile:

```dockerfile
FROM python:3.12-slim

WORKDIR /app

# Install dependencies first for layer caching
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy application code
COPY ./app ./app

# Create non-root user
RUN adduser --disabled-password --gecos "" appuser
USER appuser

EXPOSE 8000

CMD ["uvicorn", "app.main:app", "--host", "0.0.0.0", "--port", "8000", "--workers", "4"]
```

Build and run:

```bash
docker build -t myapi:latest .
docker run -d --name myapi -p 8000:8000 -e DATABASE_URL=postgresql://... myapi:latest
```

The `python:3.12-slim` base image is approximately 120 MB, compared to the full `python:3.12` image at approximately 890 MB. For even smaller images, use `python:3.12-alpine` (approximately 50 MB), though it may require additional build dependencies for some Python packages.

## Proxy Headers and HTTPS

When running behind a reverse proxy (Nginx, Traefik, AWS ALB), configure Uvicorn to trust proxy headers:

```bash
uvicorn main:app \
    --host 0.0.0.0 \
    --port 8000 \
    --proxy-headers \
    --forwarded-allow-ips="*"
```

The `--proxy-headers` flag tells Uvicorn to read `X-Forwarded-For` and `X-Forwarded-Proto` headers from the proxy. The `--forwarded-allow-ips` flag specifies which proxy IPs are trusted. Using `"*"` trusts all proxies (acceptable when the application is not directly exposed to the internet).

An Nginx reverse proxy configuration:

```nginx
upstream fastapi_backend {
    server 127.0.0.1:8000;
}

server {
    listen 443 ssl;
    server_name api.example.com;

    ssl_certificate /etc/ssl/certs/api.example.com.pem;
    ssl_certificate_key /etc/ssl/private/api.example.com.key;

    location / {
        proxy_pass http://fastapi_backend;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
        proxy_buffering off;
    }
}
```

Setting `proxy_buffering off` ensures streamed responses (like SSE or large file downloads) are forwarded immediately rather than buffered by Nginx.

## Health Checks

Include a health check endpoint for container orchestrators:

```python
@app.get("/health", status_code=200)
async def health_check():
    return {"status": "healthy"}
```

Docker health check configuration:

```dockerfile
HEALTHCHECK --interval=30s --timeout=10s --retries=3 --start-period=10s \
    CMD curl -f http://localhost:8000/health || exit 1
```

This checks health every 30 seconds, allows 10 seconds per check, retries 3 times before marking unhealthy, and waits 10 seconds after container start before the first check.