# πŸš€ Deployment Guide
This guide covers deploying the LLM API Backend to various platforms.
## πŸ“‹ Prerequisites
- Node.js 18+ installed
- Encore CLI installed (`npm install -g encore.dev`)
- Git installed
- API keys for your chosen LLM provider
---
## 🌟 Hugging Face Spaces (Recommended for Demos)
### Step 1: Create a Space
1. Go to https://huggingface.co/spaces
2. Click **"Create new Space"**
3. Settings:
   - **Space name:** `llm-api-backend` (or your choice)
   - **SDK:** Docker
   - **Visibility:** Public or Private
4. Click **Create Space**
### Step 2: Clone and Push
```bash
# Clone your new Space
git clone https://huggingface.co/spaces/YOUR_USERNAME/YOUR_SPACE_NAME
cd YOUR_SPACE_NAME
# Copy all files from this project
# (note: the * glob skips dotfiles; copy any needed hidden files separately)
cp -r /path/to/llm-api-backend/* .
# Commit and push
git add .
git commit -m "Initial deployment"
git push
```
### Step 3: Configure Secrets
1. Go to your Space page
2. Click **Settings** β†’ **Repository secrets**
3. Add the following secrets:
**For Hugging Face Provider:**
```
LLMProvider = huggingface
HuggingFaceAPIKey = hf_your_token_here
DefaultModel = mistralai/Mistral-7B-Instruct-v0.2
```
**For Ollama Provider (requires custom Docker setup):**
```
LLMProvider = ollama
OllamaBaseURL = http://localhost:11434
DefaultModel = llama3
```
### Step 4: Wait for Build
- Hugging Face will automatically build your Docker container
- Watch the build logs in the Space interface
- Once complete, your API is live!
### Step 5: Test Your API
```bash
# Replace with your actual Space URL
export SPACE_URL="https://YOUR_USERNAME-YOUR_SPACE_NAME.hf.space"
# Test chat endpoint
curl -X POST "$SPACE_URL/chat" \
  -H "Content-Type: application/json" \
  -d '{"message": "Hello!"}'
# Test health endpoint
curl $SPACE_URL/health
```
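Hand-writing the `-d` JSON body breaks as soon as the message itself contains a double quote. A minimal sketch of a safer approach (the `make_chat_payload` helper is our own, not part of the project):

```shell
# Hypothetical helper: build the chat JSON payload, escaping any embedded
# double quotes so the request body stays valid JSON.
make_chat_payload() {
  local msg=${1//\"/\\\"}   # replace " with \" in the message
  printf '{"message": "%s"}' "$msg"
}

make_chat_payload 'Hello!'
```

Then pass it to curl as `-d "$(make_chat_payload 'Say "hi" to the model')"`.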
---
## ☁️ Encore Cloud (Recommended for Production)
### Step 1: Install Encore
```bash
npm install -g encore.dev
```
### Step 2: Create Encore App
```bash
# If starting fresh
encore app create
# Or link existing app
encore app link
```
### Step 3: Set Secrets
```bash
# Choose ONE provider -- running both sets of commands simply overwrites
# LLMProvider and DefaultModel with whichever values were set last.

# For Hugging Face:
encore secret set LLMProvider huggingface
encore secret set HuggingFaceAPIKey hf_your_token_here
encore secret set DefaultModel mistralai/Mistral-7B-Instruct-v0.2

# For Ollama (local development):
encore secret set LLMProvider ollama
encore secret set OllamaBaseURL http://localhost:11434
encore secret set DefaultModel llama3
```
### Step 4: Deploy
```bash
# Deploy to staging
encore deploy
# Deploy to production
encore deploy --env production
```
### Step 5: Access Your API
```bash
# Your API will be available at:
# Staging: https://staging-YOUR_APP.encr.app
# Production: https://prod-YOUR_APP.encr.app
# Test it
curl https://staging-YOUR_APP.encr.app/health
```
---
## 🐳 Docker (Self-Hosted)
### Step 1: Build Image
```bash
docker build -t llm-api-backend .
```
### Step 2: Run Container
**Using Hugging Face:**
```bash
docker run -d \
  -p 7860:7860 \
  -e LLMProvider=huggingface \
  -e HuggingFaceAPIKey=hf_your_token \
  -e DefaultModel=mistralai/Mistral-7B-Instruct-v0.2 \
  --name llm-api \
  llm-api-backend
```
**Using Ollama (with host network):**
```bash
docker run -d \
  --network host \
  -e LLMProvider=ollama \
  -e OllamaBaseURL=http://localhost:11434 \
  -e DefaultModel=llama3 \
  --name llm-api \
  llm-api-backend
```
### Step 3: Test
```bash
curl http://localhost:7860/health
```
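The container can take a few seconds after `docker run` before it answers, so a single `curl` may fail spuriously. A polling sketch (the `wait_for_health` helper and its defaults are our own suggestion):

```shell
# Hypothetical helper: poll the health endpoint until the service responds,
# retrying up to $2 times (default 30) with a 1-second pause between attempts.
wait_for_health() {
  local url=$1 attempts=${2:-30} i
  for ((i = 1; i <= attempts; i++)); do
    if curl -fsS "$url/health" > /dev/null 2>&1; then
      echo "healthy after $i attempt(s)"
      return 0
    fi
    sleep 1
  done
  echo "not healthy after $attempts attempts" >&2
  return 1
}

# Usage: wait_for_health http://localhost:7860
```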
---
## πŸ–₯️ VPS / Bare Metal
### Step 1: Install Dependencies
```bash
# Install Node.js 20
curl -fsSL https://deb.nodesource.com/setup_20.x | sudo -E bash -
sudo apt-get install -y nodejs
# Install Encore CLI
npm install -g encore.dev
```
### Step 2: Clone Repository
```bash
git clone https://github.com/YOUR_USERNAME/llm-api-backend.git
cd llm-api-backend
```
### Step 3: Configure Environment
```bash
# Copy example env file
cp .env.example .env
# Edit with your values
nano .env
```
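Assuming the project's `.env.example` mirrors the secrets used above, the edited file might look like this (all values are placeholders; keep the real file out of git):

```shell
# .env -- placeholder values; substitute your real token
LLMProvider=huggingface
HuggingFaceAPIKey=hf_your_token_here
DefaultModel=mistralai/Mistral-7B-Instruct-v0.2
```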
### Step 4: Set Encore Secrets
```bash
encore secret set LLMProvider huggingface
encore secret set HuggingFaceAPIKey hf_your_token
encore secret set DefaultModel mistralai/Mistral-7B-Instruct-v0.2
```
### Step 5: Run with PM2 (Production)
```bash
# Install PM2
npm install -g pm2
# Start application
pm2 start "encore run --port 8080" --name llm-api
# Save PM2 configuration
pm2 save
# Enable startup on boot
pm2 startup
```
### Step 6: Configure Nginx (Optional)
```nginx
server {
listen 80;
server_name your-domain.com;
location / {
proxy_pass http://localhost:8080;
proxy_http_version 1.1;
proxy_set_header Upgrade $http_upgrade;
proxy_set_header Connection 'upgrade';
proxy_set_header Host $host;
proxy_cache_bypass $http_upgrade;
}
}
```
---
## πŸ” Security Checklist
Before deploying to production:
- [ ] All secrets configured properly
- [ ] API keys have appropriate permissions
- [ ] CORS configured for your frontend domains
- [ ] Rate limiting enabled (add middleware)
- [ ] HTTPS enabled (Encore/HF Spaces handle this)
- [ ] Environment variables not committed to git
- [ ] Monitoring and logging set up
- [ ] Error tracking configured
---
## πŸ“Š Monitoring
### Encore Cloud
- Built-in dashboard at https://app.encore.dev
- Real-time traces, logs, and metrics
- Performance monitoring
- Error tracking
### Hugging Face Spaces
- View logs in Space interface
- Use `/health` endpoint for uptime monitoring
- Configure external monitoring tools
### Self-Hosted
- Use `/health` endpoint
- Set up monitoring tools like:
  - Prometheus + Grafana
  - Datadog
  - New Relic
  - Sentry for errors
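Before reaching for a full monitoring stack, a cron-friendly probe against `/health` can feed a simple uptime log. A minimal sketch (the `probe` helper, schedule, and log path are our own suggestions):

```shell
# Hypothetical uptime probe: prints one timestamped up/down line per run.
# From cron, e.g.: */5 * * * * /opt/llm-api/probe.sh >> /var/log/llm-api-uptime.log
probe() {
  local url=$1 status
  if curl -fsS --max-time 10 "$url/health" > /dev/null 2>&1; then
    status=up
  else
    status=down
  fi
  echo "$(date -u +%FT%TZ) $status"
}

# Usage: probe http://localhost:8080
```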
---
## πŸ†˜ Troubleshooting
### Build Failures on HF Spaces
**Issue:** Docker build fails
```bash
# Check Dockerfile syntax
# Ensure all required files are committed
# Check Space build logs
```
### "Secret not set" Errors
**Issue:** Application can't access secrets
```bash
# On Encore: Use 'encore secret set' command
# On HF Spaces: Configure in Space settings
# On Docker: Pass as environment variables (-e flag)
```
### Model Loading Timeout
**Issue:** HF models take too long to load
```bash
# Solution: Wait 30-60 seconds for cold start
# Use smaller models for faster loading
# Check model availability on HF
```
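One way to ride out cold starts in scripts is to retry with exponential backoff instead of waiting a fixed time. A generic sketch (`retry` is our own helper name, not part of the project):

```shell
# Hypothetical retry helper: run a command up to $1 times, doubling the wait
# between attempts (1s, 2s, 4s, ...) to tolerate slow model cold starts.
retry() {
  local max=$1 delay=1 n; shift
  for ((n = 1; n <= max; n++)); do
    "$@" && return 0
    if ((n < max)); then
      sleep "$delay"
      delay=$((delay * 2))
    fi
  done
  return 1
}

# Usage: retry 5 curl -fsS -X POST "$SPACE_URL/chat" \
#   -H "Content-Type: application/json" -d '{"message": "Hello!"}'
```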
### Connection Refused (Ollama)
**Issue:** Can't connect to Ollama
```bash
# Ensure Ollama is running: ollama serve
# Check OllamaBaseURL is correct
# For Docker: Use --network host
```
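The checks above can be wrapped into one probe: Ollama's `GET /api/tags` route lists installed models, so a successful response confirms the daemon is up and the base URL is right (the `check_ollama` helper name is our own):

```shell
# Hypothetical connectivity probe for the Ollama daemon.
check_ollama() {
  local base=${1:-http://localhost:11434}
  if curl -fsS "$base/api/tags" > /dev/null 2>&1; then
    echo "Ollama reachable at $base"
  else
    echo "Ollama NOT reachable at $base" >&2
    return 1
  fi
}

# check_ollama                              # default local daemon
# check_ollama http://other-host:11434      # remote daemon
```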
---
## πŸ”„ Updates and Maintenance
### Updating on Hugging Face Spaces
```bash
git pull origin main
# Make your changes
git add .
git commit -m "Update: description"
git push
```
### Updating on Encore Cloud
```bash
# Make changes
git commit -am "Update: description"
encore deploy
```
### Updating Docker
```bash
# Rebuild image
docker build -t llm-api-backend .
# Stop old container
docker stop llm-api
docker rm llm-api
# Run new container
docker run -d [your flags] llm-api-backend
```
---
## πŸ“š Additional Resources
- [Encore Documentation](https://encore.dev/docs)
- [Hugging Face Spaces Docs](https://huggingface.co/docs/hub/spaces)
- [Docker Documentation](https://docs.docker.com)
- [Ollama Documentation](https://ollama.ai/docs)
---
**Need help?** Open an issue or check the main README.md for support options.