# 🚀 Deployment Guide

This guide covers deploying the LLM API Backend to various platforms.

## 📋 Prerequisites

- Node.js 18+ installed
- Encore CLI installed (`curl -L https://encore.dev/install.sh | bash`)
- Git installed
- API keys for your chosen LLM provider

---

## 🌟 Hugging Face Spaces (Recommended for Demos)

### Step 1: Create a Space

1. Go to https://huggingface.co/spaces
2. Click **"Create new Space"**
3. Settings:
   - **Space name:** `llm-api-backend` (or your choice)
   - **SDK:** Docker
   - **Visibility:** Public or Private
4. Click **Create Space**

### Step 2: Clone and Push

```bash
# Clone your new Space
git clone https://huggingface.co/spaces/YOUR_USERNAME/YOUR_SPACE_NAME
cd YOUR_SPACE_NAME

# Copy all files from this project
cp -r /path/to/llm-api-backend/* .

# Commit and push
git add .
git commit -m "Initial deployment"
git push
```

### Step 3: Configure Secrets

1. Go to your Space page
2. Click **Settings** → **Repository secrets**
3. Add the following secrets:

**For Hugging Face Provider:**

```
LLMProvider = huggingface
HuggingFaceAPIKey = hf_your_token_here
DefaultModel = mistralai/Mistral-7B-Instruct-v0.2
```

**For Ollama Provider (requires custom Docker setup):**

```
LLMProvider = ollama
OllamaBaseURL = http://localhost:11434
DefaultModel = llama3
```

### Step 4: Wait for Build

- Hugging Face will automatically build your Docker container
- Watch the build logs in the Space interface
- Once complete, your API is live!

### Step 5: Test Your API

```bash
# Replace with your actual Space URL
export SPACE_URL="https://YOUR_USERNAME-YOUR_SPACE_NAME.hf.space"

# Test chat endpoint
curl -X POST $SPACE_URL/chat \
  -H "Content-Type: application/json" \
  -d '{"message": "Hello!"}'

# Test health endpoint
curl $SPACE_URL/health
```
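Free-tier Spaces sleep when idle, and models cold-start, so the first request after a deploy can take 30-60 seconds (see Troubleshooting below). Here is a minimal sketch of a wake-up-and-test helper; the script name `wait-for-space.sh` is hypothetical, and it assumes only the `/health` and `/chat` endpoints shown in Step 5.

```bash
#!/usr/bin/env bash
# wait-for-space.sh -- poll the Space until /health responds, then send a test chat.
# SPACE_URL placeholder follows the format from Step 5; override via the environment.
set -euo pipefail

SPACE_URL="${SPACE_URL:-https://YOUR_USERNAME-YOUR_SPACE_NAME.hf.space}"

# Retry /health for up to ~2 minutes to ride out a cold start.
for i in $(seq 1 24); do
  if curl -sf "$SPACE_URL/health" > /dev/null; then
    echo "Space is up after ~$((i * 5))s"
    break
  fi
  echo "Waiting for Space to wake up... (attempt $i)"
  sleep 5
done

# Send one test message; with -f and set -e, a failure exits nonzero.
curl -sf -X POST "$SPACE_URL/chat" \
  -H "Content-Type: application/json" \
  -d '{"message": "Hello!"}'
```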
---

## ☁️ Encore Cloud (Recommended for Production)

### Step 1: Install Encore

```bash
curl -L https://encore.dev/install.sh | bash
```

### Step 2: Create Encore App

```bash
# If starting fresh
encore app create

# Or link an existing app
encore app link
```

### Step 3: Set Secrets

The Encore CLI prompts for each secret's value:

```bash
# For Hugging Face
encore secret set --type dev,prod LLMProvider        # huggingface
encore secret set --type dev,prod HuggingFaceAPIKey  # hf_your_token_here
encore secret set --type dev,prod DefaultModel       # mistralai/Mistral-7B-Instruct-v0.2

# For Ollama (local development)
encore secret set --type dev LLMProvider             # ollama
encore secret set --type dev OllamaBaseURL           # http://localhost:11434
encore secret set --type dev DefaultModel            # llama3
```

### Step 4: Deploy

```bash
# Deploying is a git push to the Encore remote (set up by `encore app link`)
git add -A
git commit -m "Deploy"
git push encore
```

Encore Cloud builds and deploys to the environment configured for your branch; use the dashboard at https://app.encore.dev to promote a build to production.

### Step 5: Access Your API

```bash
# Your API will be available at:
# Staging: https://staging-YOUR_APP.encr.app
# Production: https://prod-YOUR_APP.encr.app

# Test it
curl https://staging-YOUR_APP.encr.app/health
```

---

## 🐳 Docker (Self-Hosted)

### Step 1: Build Image

```bash
docker build -t llm-api-backend .
```

### Step 2: Run Container

**Using Hugging Face:**

```bash
docker run -d \
  -p 7860:7860 \
  -e LLMProvider=huggingface \
  -e HuggingFaceAPIKey=hf_your_token \
  -e DefaultModel=mistralai/Mistral-7B-Instruct-v0.2 \
  --name llm-api \
  llm-api-backend
```

**Using Ollama (with host network):**

```bash
docker run -d \
  --network host \
  -e LLMProvider=ollama \
  -e OllamaBaseURL=http://localhost:11434 \
  -e DefaultModel=llama3 \
  --name llm-api \
  llm-api-backend
```

### Step 3: Test

```bash
curl http://localhost:7860/health
```

---

## 🖥️ VPS / Bare Metal

### Step 1: Install Dependencies

```bash
# Install Node.js 20
curl -fsSL https://deb.nodesource.com/setup_20.x | sudo -E bash -
sudo apt-get install -y nodejs

# Install the Encore CLI
curl -L https://encore.dev/install.sh | bash
```

### Step 2: Clone Repository

```bash
git clone https://github.com/YOUR_USERNAME/llm-api-backend.git
cd llm-api-backend
```

### Step 3: Configure Environment

```bash
# Copy the example env file
cp .env.example .env

# Edit with your values
nano .env
```

### Step 4: Set Encore Secrets

```bash
# The CLI prompts for each value
encore secret set --type dev,prod LLMProvider        # huggingface
encore secret set --type dev,prod HuggingFaceAPIKey  # hf_your_token
encore secret set --type dev,prod DefaultModel       # mistralai/Mistral-7B-Instruct-v0.2
```

### Step 5: Run with PM2 (Production)

```bash
# Install PM2
npm install -g pm2

# Start the application (pm2 start <binary> -- <args>)
pm2 start encore --name llm-api -- run --port=8080

# Save the PM2 process list
pm2 save

# Enable startup on boot
pm2 startup
```

### Step 6: Configure Nginx (Optional)

```nginx
server {
    listen 80;
    server_name your-domain.com;

    location / {
        proxy_pass http://localhost:8080;
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection 'upgrade';
        proxy_set_header Host $host;
        proxy_cache_bypass $http_upgrade;
    }
}
```

---

## 🔐 Security Checklist

Before deploying to production:

- [ ] All secrets configured properly
- [ ] API keys have appropriate permissions
- [ ] CORS configured for your frontend domains
- [ ] Rate limiting enabled (add middleware)
- [ ] HTTPS enabled (Encore Cloud and HF Spaces handle this)
- [ ] Environment variables not committed to git
- [ ] Monitoring and logging set up
- [ ] Error tracking configured

---

## 📊 Monitoring

### Encore Cloud

- Built-in dashboard at https://app.encore.dev
- Real-time traces, logs, and metrics
- Performance monitoring
- Error tracking

### Hugging Face Spaces

- View logs in the Space interface
- Use the `/health` endpoint for uptime monitoring
- Configure external monitoring tools

### Self-Hosted

- Use the `/health` endpoint (see the probe sketch below)
- Set up monitoring tools like:
  - Prometheus + Grafana
  - Datadog
  - New Relic
  - Sentry for errors
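For self-hosted deployments, the `/health` endpoint pairs naturally with a cron-based uptime probe. This is a minimal sketch, assuming the API from the VPS setup above listens on port 8080; the script name, log path, and alert hook are placeholders to replace with your own.

```bash
#!/usr/bin/env bash
# healthcheck.sh -- minimal uptime probe for the /health endpoint.
# API_URL and the alerting step are placeholders; wire in your own.
set -euo pipefail

API_URL="${API_URL:-http://localhost:8080}"
LOG_FILE="${LOG_FILE:-$HOME/llm-api-health.log}"

if ! curl -sf --max-time 10 "$API_URL/health" > /dev/null; then
  # Replace the log line with real alerting (Slack webhook, PagerDuty, email, ...).
  echo "$(date -Is) llm-api health check FAILED" >> "$LOG_FILE"
  exit 1
fi
echo "$(date -Is) llm-api health check OK" >> "$LOG_FILE"
```

You could then register it with cron, e.g. `*/5 * * * * /path/to/healthcheck.sh` (the path is an example).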
---

## 🆘 Troubleshooting

### Build Failures on HF Spaces

**Issue:** Docker build fails

```bash
# Check Dockerfile syntax
# Ensure all required files are committed
# Check the Space build logs
```

### "Secret not set" Errors

**Issue:** Application can't access secrets

```bash
# On Encore: use the 'encore secret set' command
# On HF Spaces: configure in Space settings
# On Docker: pass as environment variables (-e flag)
```

### Model Loading Timeout

**Issue:** HF models take too long to load

```bash
# Solution: wait 30-60 seconds for the cold start
# Use smaller models for faster loading
# Check model availability on HF
```

### Connection Refused (Ollama)

**Issue:** Can't connect to Ollama

```bash
# Ensure Ollama is running: ollama serve
# Check that OllamaBaseURL is correct
# For Docker: use --network host
```

---

## 🔄 Updates and Maintenance

### Updating on Hugging Face Spaces

```bash
git pull origin main
# Make your changes
git add .
git commit -m "Update: description"
git push
```

### Updating on Encore Cloud

```bash
# Make changes, then push to the Encore remote
git commit -am "Update: description"
git push encore
```

### Updating Docker

```bash
# Rebuild the image
docker build -t llm-api-backend .

# Stop and remove the old container
docker stop llm-api
docker rm llm-api

# Run the new container
docker run -d [your flags] llm-api-backend
```

---

## 📚 Additional Resources

- [Encore Documentation](https://encore.dev/docs)
- [Hugging Face Spaces Docs](https://huggingface.co/docs/hub/spaces)
- [Docker Documentation](https://docs.docker.com)
- [Ollama Documentation](https://ollama.ai/docs)

---

**Need help?** Open an issue or check the main README.md for support options.