agentbench / data /tech_docs /fastapi_deployment.md
Nomearod's picture
feat: Day 4 — corpus, ingest script, first 10 golden questions
a152b95
# Deploying FastAPI Applications
FastAPI applications are deployed using ASGI servers. This guide covers production deployment with Uvicorn, Gunicorn, Docker, and related infrastructure considerations.
## Uvicorn (Single Process)
Uvicorn is the recommended ASGI server for FastAPI. For development:
```bash
uvicorn main:app --reload --host 127.0.0.1 --port 8000
```
For production with a single process:
```bash
uvicorn main:app --host 0.0.0.0 --port 8000 --workers 1 --log-level info
```
Key Uvicorn configuration options:
| Flag | Default | Description |
|-------------------|---------------|------------------------------------------|
| `--host` | `127.0.0.1` | Bind address |
| `--port` | `8000` | Bind port |
| `--workers` | `1` | Number of worker processes |
| `--loop` | `auto` | Event loop (auto, asyncio, uvloop) |
| `--http` | `auto` | HTTP protocol (auto, h11, httptools) |
| `--ws` | `auto` | WebSocket protocol (auto, websockets, wsproto) |
| `--log-level` | `info` | Logging level (critical, error, warning, info, debug, trace) |
| `--access-log` | `True` | Enable/disable access log |
| `--ws-max-size` | `16777216` | Max WebSocket message size (16 MB) |
| `--timeout-keep-alive` | `5` | Keep-alive timeout in seconds |
Using `uvloop` and `httptools` (installed automatically on Linux) provides a 20-30% performance boost over the pure-Python `asyncio` and `h11` alternatives.
## Gunicorn with Uvicorn Workers
For production deployments requiring multiple worker processes, use Gunicorn as the process manager with Uvicorn workers:
```bash
gunicorn main:app \
--workers 4 \
--worker-class uvicorn.workers.UvicornWorker \
--bind 0.0.0.0:8000 \
--timeout 120 \
--graceful-timeout 30 \
--keep-alive 5 \
--max-requests 1000 \
--max-requests-jitter 50 \
--access-logfile -
```
The recommended number of workers is `(2 * CPU_CORES) + 1`. For a server with 4 CPU cores, that is 9 workers. The `--max-requests 1000` flag restarts each worker after handling 1,000 requests, preventing memory leaks. The `--max-requests-jitter 50` adds a random offset (0-50) so workers do not all restart simultaneously.
The `--timeout 120` flag sets the maximum time (in seconds) a worker can take to handle a request before being killed and restarted. The default is 30 seconds. The `--graceful-timeout 30` gives workers 30 seconds to finish current requests during shutdown.
## Docker Deployment
A production-ready Dockerfile:
```dockerfile
FROM python:3.12-slim
WORKDIR /app
# Install dependencies first for layer caching
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
# Copy application code
COPY ./app ./app
# Create non-root user
RUN adduser --disabled-password --gecos "" appuser
USER appuser
EXPOSE 8000
CMD ["uvicorn", "app.main:app", "--host", "0.0.0.0", "--port", "8000", "--workers", "4"]
```
Build and run:
```bash
docker build -t myapi:latest .
docker run -d --name myapi -p 8000:8000 -e DATABASE_URL=postgresql://... myapi:latest
```
The `python:3.12-slim` base image is approximately 120 MB, compared to the full `python:3.12` image at approximately 890 MB. For even smaller images, use `python:3.12-alpine` (approximately 50 MB), though it may require additional build dependencies for some Python packages.
## Proxy Headers and HTTPS
When running behind a reverse proxy (Nginx, Traefik, AWS ALB), configure Uvicorn to trust proxy headers:
```bash
uvicorn main:app \
--host 0.0.0.0 \
--port 8000 \
--proxy-headers \
--forwarded-allow-ips="*"
```
The `--proxy-headers` flag tells Uvicorn to read `X-Forwarded-For` and `X-Forwarded-Proto` headers from the proxy. The `--forwarded-allow-ips` flag specifies which proxy IPs are trusted. Using `"*"` trusts all proxies (acceptable when the application is not directly exposed to the internet).
An Nginx reverse proxy configuration:
```nginx
upstream fastapi_backend {
server 127.0.0.1:8000;
}
server {
listen 443 ssl;
server_name api.example.com;
ssl_certificate /etc/ssl/certs/api.example.com.pem;
ssl_certificate_key /etc/ssl/private/api.example.com.key;
location / {
proxy_pass http://fastapi_backend;
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_set_header X-Forwarded-Proto $scheme;
proxy_buffering off;
}
}
```
Setting `proxy_buffering off` ensures streamed responses (like SSE or large file downloads) are forwarded immediately rather than buffered by Nginx.
## Health Checks
Include a health check endpoint for container orchestrators:
```python
@app.get("/health", status_code=200)
async def health_check():
return {"status": "healthy"}
```
Docker health check configuration:
```dockerfile
HEALTHCHECK --interval=30s --timeout=10s --retries=3 --start-period=10s \
CMD curl -f http://localhost:8000/health || exit 1
```
This checks health every 30 seconds, allows 10 seconds per check, retries 3 times before marking unhealthy, and waits 10 seconds after container start before the first check.