	# Deploying FastAPI Applications

	FastAPI applications are deployed using ASGI servers. This guide covers production deployment with Uvicorn, Gunicorn, Docker, and related infrastructure considerations.

	## Uvicorn (Single Process)

	Uvicorn is the recommended ASGI server for FastAPI. For development:

	```bash
	uvicorn main:app --reload --host 127.0.0.1 --port 8000
	```

	For production with a single process:

	```bash
	uvicorn main:app --host 0.0.0.0 --port 8000 --workers 1 --log-level info
	```

	Key Uvicorn configuration options:

	| Flag                   | Default     | Description                                                   |
	|------------------------|-------------|---------------------------------------------------------------|
	| `--host`               | `127.0.0.1` | Bind address                                                  |
	| `--port`               | `8000`      | Bind port                                                     |
	| `--workers`            | `1`         | Number of worker processes                                    |
	| `--loop`               | `auto`      | Event loop (auto, asyncio, uvloop)                            |
	| `--http`               | `auto`      | HTTP protocol (auto, h11, httptools)                          |
	| `--ws`                 | `auto`      | WebSocket protocol (auto, websockets, wsproto)                |
	| `--log-level`          | `info`      | Logging level (critical, error, warning, info, debug, trace)  |
	| `--access-log`         | `True`      | Enable/disable access log                                     |
	| `--ws-max-size`        | `16777216`  | Max WebSocket message size in bytes (16 MB)                   |
	| `--timeout-keep-alive` | `5`         | Keep-alive timeout in seconds                                 |

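	`uvloop` and `httptools` are installed with the `uvicorn[standard]` extra (`uvloop` is not available on Windows). With the default `auto` settings, Uvicorn uses them when they are present, and they are generally faster than the standard `asyncio` event loop and the pure-Python `h11` parser. The gain depends on the workload; the 20-30% figure is a rough guide rather than a guarantee.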
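Whether the faster implementations are in play depends on what is installed in the environment. A minimal sketch, using only the standard library, that reports which of the two optional packages are importable; the function name `optional_speedups` is illustrative and not part of Uvicorn:

```python
from importlib.util import find_spec


def optional_speedups() -> dict[str, bool]:
    """Report which optional Uvicorn accelerators are importable here."""
    # find_spec returns None when a top-level package cannot be found.
    return {name: find_spec(name) is not None for name in ("uvloop", "httptools")}


if __name__ == "__main__":
    for name, present in optional_speedups().items():
        print(f"{name}: {'available' if present else 'missing'}")
```

Running this inside the deployment image is a quick way to confirm that `--loop auto` and `--http auto` have something faster to select.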

	## Gunicorn with Uvicorn Workers

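	For production deployments requiring multiple worker processes, use Gunicorn as the process manager with Uvicorn workers. Recent Uvicorn releases deprecate the built-in `uvicorn.workers` module in favor of the separate `uvicorn-worker` package, so check which one applies to your installed version: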

	```bash
	gunicorn main:app \
	    --workers 4 \
	    --worker-class uvicorn.workers.UvicornWorker \
	    --bind 0.0.0.0:8000 \
	    --timeout 120 \
	    --graceful-timeout 30 \
	    --keep-alive 5 \
	    --max-requests 1000 \
	    --max-requests-jitter 50 \
	    --access-logfile -
	```

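	The formula `(2 * CPU_CORES) + 1` is a common starting point for the number of workers; for a server with 4 CPU cores, that gives 9 workers. Treat it as a baseline to tune under load, since the command above runs with a smaller value of 4. `--max-requests 1000` restarts each worker after it has handled 1,000 requests, which limits the impact of slow memory leaks rather than preventing them. `--max-requests-jitter 50` adds a random offset of 0 to 50 requests to each worker's limit so that workers do not all restart at the same time.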
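The worker formula is simple enough to compute at startup. A small sketch, assuming the `(2 * cores) + 1` rule described above; `recommended_workers` is a hypothetical helper, not a Gunicorn function:

```python
import os
from typing import Optional


def recommended_workers(cpu_cores: Optional[int] = None) -> int:
    """Return (2 * cores) + 1 for the given or detected core count."""
    # os.cpu_count() can return None, so fall back to a single core.
    cores = cpu_cores if cpu_cores is not None else (os.cpu_count() or 1)
    return 2 * cores + 1


if __name__ == "__main__":
    print(recommended_workers(4))  # 9, matching the 4-core example
    print(recommended_workers())   # depends on the host
```

Note that `os.cpu_count()` reports the host's logical CPUs and may ignore container CPU limits, so passing the core count explicitly is often safer inside Docker.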

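	`--timeout 120` kills and restarts a worker that has been silent for more than 120 seconds; the default is 30 seconds. With synchronous workers this effectively caps request duration. With asynchronous workers such as `UvicornWorker`, it only requires that the worker process keep responding to the Gunicorn master, so it is not a per-request limit. `--graceful-timeout 30` gives workers 30 seconds to finish in-flight requests after a restart or shutdown signal before they are force-killed.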
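The same settings can live in a configuration file instead of on the command line. Gunicorn configuration files are plain Python, so a sketch equivalent to the command above might look like the following; the values simply mirror the flags shown earlier, and `gunicorn.conf.py` is the file name Gunicorn looks for by default in the working directory:

```python
# gunicorn.conf.py: mirrors the command-line flags shown above
bind = "0.0.0.0:8000"
workers = 4
worker_class = "uvicorn.workers.UvicornWorker"
timeout = 120             # seconds of silence before a worker is killed
graceful_timeout = 30     # seconds allowed to finish requests on shutdown
keepalive = 5             # seconds to hold idle keep-alive connections
max_requests = 1000       # recycle a worker after this many requests
max_requests_jitter = 50  # random extra 0-50 requests per worker
accesslog = "-"           # write the access log to stdout
```

With this file in place, the command shortens to `gunicorn -c gunicorn.conf.py main:app`.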

	## Docker Deployment

	A production-ready Dockerfile:

	```dockerfile
	FROM python:3.12-slim

	WORKDIR /app

	# Install dependencies first for layer caching
	COPY requirements.txt .
	RUN pip install --no-cache-dir -r requirements.txt

	# Copy application code
	COPY ./app ./app

	# Create non-root user
	RUN adduser --disabled-password --gecos "" appuser
	USER appuser

	EXPOSE 8000

	CMD ["uvicorn", "app.main:app", "--host", "0.0.0.0", "--port", "8000", "--workers", "4"]
	```

	Build and run:

	```bash
	docker build -t myapi:latest .
	docker run -d --name myapi -p 8000:8000 -e DATABASE_URL=postgresql://... myapi:latest
	```
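The `-e DATABASE_URL=...` flag makes the value available to the process as an environment variable. One hedged way for the application to read it at startup, using only the standard library; `load_database_url` is a hypothetical helper, not something FastAPI provides, and the URL in the demo is a made-up placeholder:

```python
import os
from typing import Mapping, Optional


def load_database_url(env: Optional[Mapping[str, str]] = None) -> str:
    """Return DATABASE_URL from the environment, or fail fast if it is unset."""
    source = os.environ if env is None else env
    url = source.get("DATABASE_URL", "").strip()
    if not url:
        # Failing at startup is easier to diagnose than a late connection error.
        raise RuntimeError("DATABASE_URL is not set")
    return url


if __name__ == "__main__":
    # Hypothetical value for demonstration only.
    print(load_database_url({"DATABASE_URL": "postgresql://user:pass@db:5432/app"}))
```

Accepting an optional mapping keeps the function easy to test without touching the real environment.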

	The `python:3.12-slim` base image is approximately 120 MB, compared to the full `python:3.12` image at approximately 890 MB; exact sizes vary by release and architecture. For even smaller images, `python:3.12-alpine` is approximately 50 MB, but Alpine uses musl libc instead of glibc, so Python packages without musl-compatible wheels must be compiled and need extra build dependencies.

	## Proxy Headers and HTTPS

	When running behind a reverse proxy (Nginx, Traefik, AWS ALB), configure Uvicorn to trust proxy headers:

	```bash
	uvicorn main:app \
	    --host 0.0.0.0 \
	    --port 8000 \
	    --proxy-headers \
	    --forwarded-allow-ips="*"
	```

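	The `--proxy-headers` flag tells Uvicorn to read the `X-Forwarded-For` and `X-Forwarded-Proto` headers set by the proxy; it is enabled by default, but only for trusted peers. The `--forwarded-allow-ips` flag specifies which proxy addresses are trusted and defaults to `127.0.0.1` (or the `FORWARDED_ALLOW_IPS` environment variable if set). Using `"*"` trusts every connecting address, which is acceptable only when all traffic reaches the application through the proxy, because any client that can connect directly could forge these headers.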
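The trust decision described above can be illustrated in plain Python. This is a simplified sketch of the idea, not Uvicorn's actual implementation: the function `resolve_client_ip` and its parameters are hypothetical, and it only handles literal IP addresses and the `"*"` wildcard.

```python
from ipaddress import ip_address
from typing import Iterable, Optional


def resolve_client_ip(peer_ip: str, x_forwarded_for: Optional[str],
                      trusted: Iterable[str]) -> str:
    """Pick the client address, honoring X-Forwarded-For only from trusted peers."""
    trusted_set = set(trusted)
    trust_all = "*" in trusted_set
    trusted_ips = {ip_address(t) for t in trusted_set if t != "*"}

    def is_trusted(addr: str) -> bool:
        return trust_all or ip_address(addr) in trusted_ips

    if not x_forwarded_for or not is_trusted(peer_ip):
        # An untrusted peer could forge the header, so ignore it.
        return peer_ip

    # Walk the chain right to left and stop at the first untrusted hop.
    hops = [h.strip() for h in x_forwarded_for.split(",") if h.strip()]
    for hop in reversed(hops):
        if not is_trusted(hop):
            return hop
    # Every hop was trusted: fall back to the leftmost entry.
    return hops[0] if hops else peer_ip


if __name__ == "__main__":
    # Documentation-range addresses used purely as examples.
    print(resolve_client_ip("10.0.0.1", "203.0.113.7", ["10.0.0.1"]))
    print(resolve_client_ip("198.51.100.9", "203.0.113.7", ["10.0.0.1"]))
```

In the first call the header is honored because the peer is the trusted proxy; in the second the header is ignored because the peer is not on the trusted list.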

	An Nginx reverse proxy configuration:

	```nginx
	upstream fastapi_backend {
	    server 127.0.0.1:8000;
	}

	server {
	    listen 443 ssl;
	    server_name api.example.com;

	    ssl_certificate /etc/ssl/certs/api.example.com.pem;
	    ssl_certificate_key /etc/ssl/private/api.example.com.key;

	    location / {
	        proxy_pass http://fastapi_backend;
	        proxy_set_header Host $host;
	        proxy_set_header X-Real-IP $remote_addr;
	        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
	        proxy_set_header X-Forwarded-Proto $scheme;
	        proxy_buffering off;
	    }
	}
	```

	Setting `proxy_buffering off` ensures streamed responses (like SSE or large file downloads) are forwarded immediately rather than buffered by Nginx.
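To see why buffering matters, consider a server-sent events stream, where each event should reach the client as soon as it is produced. The sketch below formats events with the standard library only; `sse_events` is a hypothetical name, and in a FastAPI app a generator like this would typically be wrapped in `StreamingResponse(..., media_type="text/event-stream")`.

```python
import asyncio
from typing import AsyncIterator


async def sse_events(count: int, delay: float = 1.0) -> AsyncIterator[str]:
    """Yield `count` server-sent events, pausing `delay` seconds after each."""
    for i in range(count):
        # Each SSE message is "data: <payload>" terminated by a blank line.
        yield f"data: tick {i}\n\n"
        await asyncio.sleep(delay)


async def main() -> None:
    async for event in sse_events(3, delay=0.1):
        print(event, end="")


if __name__ == "__main__":
    asyncio.run(main())
```

With buffering enabled, Nginx may hold several of these small events before forwarding them, so the client sees them in bursts instead of one at a time.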

	## Health Checks

	Include a health check endpoint for container orchestrators:

	```python
	@app.get("/health", status_code=200)
	async def health_check():
	    return {"status": "healthy"}
	```

	Docker health check configuration:

	```dockerfile
	# Requires curl in the image; python:3.12-slim does not ship it by default.
	HEALTHCHECK --interval=30s --timeout=10s --retries=3 --start-period=10s \
	    CMD curl -f http://localhost:8000/health || exit 1
	```

	This runs the check every 30 seconds, allows 10 seconds per check, and marks the container unhealthy after 3 consecutive failures. The 10-second start period is a grace window after container start: failed checks during it do not count toward the retry limit.
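If `curl` is not available in the image, the same probe can be written in Python, which the image already contains. A minimal sketch using only the standard library; the function names `is_healthy` and `main` are assumptions for illustration:

```python
import urllib.error
import urllib.request


def is_healthy(url: str = "http://localhost:8000/health", timeout: float = 5.0) -> bool:
    """Return True when the endpoint answers with a 2xx status within the timeout."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (urllib.error.URLError, OSError):
        # Covers refused connections, timeouts, and HTTP error statuses.
        return False


def main() -> int:
    """Map the probe result to a process exit code: 0 healthy, 1 unhealthy."""
    return 0 if is_healthy() else 1
```

Saved as a hypothetical `healthcheck.py` with `raise SystemExit(main())` as its last line and copied into the image, it could replace the `curl` command via `CMD python healthcheck.py`.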