# Deployment Guide

This guide covers deploying the LLM API Backend to various platforms.

## Prerequisites

- Node.js 18+ installed
- Encore CLI installed (`curl -L https://encore.dev/install.sh | bash`)
- Git installed
- API keys for your chosen LLM provider

---
## Hugging Face Spaces (Recommended for Demos)

### Step 1: Create a Space

1. Go to https://huggingface.co/spaces
2. Click **"Create new Space"**
3. Settings:
   - **Space name:** `llm-api-backend` (or your choice)
   - **SDK:** Docker
   - **Visibility:** Public or Private
4. Click **Create Space**

### Step 2: Clone and Push

```bash
# Clone your new Space
git clone https://huggingface.co/spaces/YOUR_USERNAME/YOUR_SPACE_NAME
cd YOUR_SPACE_NAME

# Copy all files from this project
cp -r /path/to/llm-api-backend/* .

# Commit and push
git add .
git commit -m "Initial deployment"
git push
```
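If the project is already a git repository, you can skip the copy step and push it straight to the Space by adding the Space as a second remote. This is a sketch; `YOUR_USERNAME` and `YOUR_SPACE_NAME` are placeholders as above:

```bash
# Hypothetical helper: push an existing local repo to the Space
# instead of copying files into a fresh clone.
push_to_space() {
  local user="$1" space="$2"
  git remote add space "https://huggingface.co/spaces/${user}/${space}"
  # --force overwrites the Space's auto-generated initial commit
  git push --force space main
}

# Usage: push_to_space YOUR_USERNAME YOUR_SPACE_NAME
```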
### Step 3: Configure Secrets

1. Go to your Space page
2. Click **Settings** → **Repository secrets**
3. Add the following secrets:

**For Hugging Face Provider:**

```
LLMProvider = huggingface
HuggingFaceAPIKey = hf_your_token_here
DefaultModel = mistralai/Mistral-7B-Instruct-v0.2
```

**For Ollama Provider (requires custom Docker setup):**

```
LLMProvider = ollama
OllamaBaseURL = http://localhost:11434
DefaultModel = llama3
```
### Step 4: Wait for Build

- Hugging Face will automatically build your Docker container
- Watch the build logs in the Space interface
- Once complete, your API is live!
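Rather than watching the logs, you can poll `/health` from a terminal until the build finishes. A sketch, assuming the URL placeholder and the `/health` endpoint tested in Step 5:

```bash
# Poll /health until the Space responds, or give up after N tries.
wait_for_space() {
  local url="$1" tries="${2:-30}"
  for _ in $(seq 1 "$tries"); do
    # -f makes curl fail on HTTP errors (e.g. 503 while still building)
    if curl -fsS "$url/health" >/dev/null 2>&1; then
      echo "Space is live"
      return 0
    fi
    sleep 10
  done
  echo "Space still not up after $tries attempts" >&2
  return 1
}

# Usage: wait_for_space "https://YOUR_USERNAME-YOUR_SPACE_NAME.hf.space"
```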
### Step 5: Test Your API

```bash
# Replace with your actual Space URL
export SPACE_URL="https://YOUR_USERNAME-YOUR_SPACE_NAME.hf.space"

# Test chat endpoint
curl -X POST "$SPACE_URL/chat" \
  -H "Content-Type: application/json" \
  -d '{"message": "Hello!"}'

# Test health endpoint
curl "$SPACE_URL/health"
```

---
## Encore Cloud (Recommended for Production)

### Step 1: Install the Encore CLI

```bash
curl -L https://encore.dev/install.sh | bash
```
### Step 2: Create Encore App

```bash
# If starting fresh
encore app create

# Or link existing app
encore app link
```
### Step 3: Set Secrets

The Encore CLI prompts for each secret's value rather than taking it on the command line:

```bash
# For Hugging Face
encore secret set --type dev,prod LLMProvider        # enter: huggingface
encore secret set --type dev,prod HuggingFaceAPIKey  # enter: hf_your_token_here
encore secret set --type dev,prod DefaultModel       # enter: mistralai/Mistral-7B-Instruct-v0.2

# For Ollama (local development)
encore secret set --type local LLMProvider           # enter: ollama
encore secret set --type local OllamaBaseURL         # enter: http://localhost:11434
encore secret set --type local DefaultModel          # enter: llama3
```
### Step 4: Deploy

Encore Cloud deploys on git push:

```bash
# Build and roll out via Encore Cloud
git push encore

# Promotion between environments (e.g. staging to production)
# is configured in the Encore Cloud dashboard
```
### Step 5: Access Your API

```bash
# Your API will be available at:
# Staging: https://staging-YOUR_APP.encr.app
# Production: https://prod-YOUR_APP.encr.app

# Test it
curl https://staging-YOUR_APP.encr.app/health
```

---
## Docker (Self-Hosted)

### Step 1: Build Image

```bash
docker build -t llm-api-backend .
```

### Step 2: Run Container

**Using Hugging Face:**

```bash
docker run -d \
  -p 7860:7860 \
  -e LLMProvider=huggingface \
  -e HuggingFaceAPIKey=hf_your_token \
  -e DefaultModel=mistralai/Mistral-7B-Instruct-v0.2 \
  --name llm-api \
  llm-api-backend
```
**Using Ollama (with host network):**

```bash
docker run -d \
  --network host \
  -e LLMProvider=ollama \
  -e OllamaBaseURL=http://localhost:11434 \
  -e DefaultModel=llama3 \
  --name llm-api \
  llm-api-backend
```

### Step 3: Test

```bash
curl http://localhost:7860/health
```
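If the health check fails, confirming the container state and its recent logs is usually the fastest diagnosis. A sketch using standard docker commands; `llm-api` is the container name from Step 2:

```bash
# Check that the container is actually running, then show recent output.
check_container() {
  local name="$1"
  # docker inspect -f extracts a single field from the container state
  [ "$(docker inspect -f '{{.State.Running}}' "$name")" = "true" ] \
    || { echo "$name is not running" >&2; return 1; }
  docker logs --tail 20 "$name"
}

# Usage: check_container llm-api
```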
---

## VPS / Bare Metal

### Step 1: Install Dependencies

```bash
# Install Node.js 20
curl -fsSL https://deb.nodesource.com/setup_20.x | sudo -E bash -
sudo apt-get install -y nodejs

# Install the Encore CLI
curl -L https://encore.dev/install.sh | bash
```
### Step 2: Clone Repository

```bash
git clone https://github.com/YOUR_USERNAME/llm-api-backend.git
cd llm-api-backend
```

### Step 3: Configure Environment

```bash
# Copy example env file
cp .env.example .env

# Edit with your values
nano .env
```
### Step 4: Set Encore Secrets

```bash
# The CLI prompts for each secret's value
encore secret set --type dev,prod LLMProvider        # enter: huggingface
encore secret set --type dev,prod HuggingFaceAPIKey  # enter: hf_your_token
encore secret set --type dev,prod DefaultModel       # enter: mistralai/Mistral-7B-Instruct-v0.2
```
### Step 5: Run with PM2 (Production)

```bash
# Install PM2
npm install -g pm2

# Start application
pm2 start "encore run --port 8080" --name llm-api

# Save PM2 configuration
pm2 save

# Enable startup on boot
pm2 startup
```
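After starting, it is worth confirming that PM2 is actually keeping the process online. A sketch wrapping standard PM2 subcommands; `llm-api` is the process name used above:

```bash
# Show process status and recent log output without tailing forever.
pm2_health() {
  pm2 describe llm-api
  # --nostream prints the last lines and exits instead of following
  pm2 logs llm-api --lines 50 --nostream
}
```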
### Step 6: Configure Nginx (Optional)

```nginx
server {
    listen 80;
    server_name your-domain.com;

    location / {
        proxy_pass http://localhost:8080;
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection 'upgrade';
        proxy_set_header Host $host;
        proxy_cache_bypass $http_upgrade;
    }
}
```
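To activate the server block and add TLS, the usual Debian/Ubuntu sequence looks like the sketch below. The site file name `llm-api` and the certbot route are assumptions, not part of this repo:

```bash
# Enable the nginx site (assumes the server block above was saved
# as /etc/nginx/sites-available/llm-api) and validate before reloading.
enable_site() {
  sudo ln -s /etc/nginx/sites-available/llm-api /etc/nginx/sites-enabled/
  sudo nginx -t
  sudo systemctl reload nginx
}

# Optional: free TLS certificate via Let's Encrypt.
enable_https() {
  sudo apt-get install -y certbot python3-certbot-nginx
  sudo certbot --nginx -d "$1"
}

# Usage: enable_site && enable_https your-domain.com
```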
---

## Security Checklist

Before deploying to production:

- [ ] All secrets configured properly
- [ ] API keys have appropriate permissions
- [ ] CORS configured for your frontend domains
- [ ] Rate limiting enabled (add middleware)
- [ ] HTTPS enabled (Encore/HF Spaces handle this)
- [ ] Environment variables not committed to git
- [ ] Monitoring and logging set up
- [ ] Error tracking configured
---

## Monitoring

### Encore Cloud

- Built-in dashboard at https://app.encore.dev
- Real-time traces, logs, and metrics
- Performance monitoring
- Error tracking

### Hugging Face Spaces

- View logs in the Space interface
- Use the `/health` endpoint for uptime monitoring
- Configure external monitoring tools

### Self-Hosted

- Use the `/health` endpoint
- Set up monitoring tools such as:
  - Prometheus + Grafana
  - Datadog
  - New Relic
  - Sentry for errors
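For self-hosted setups without a full monitoring stack, even a cron-driven probe of `/health` catches most outages. A sketch; the URL and the cron schedule are placeholders:

```bash
# Minimal uptime probe, suitable for cron (e.g. */5 * * * *).
health_check() {
  local url="$1"
  # --max-time bounds the whole request; -f fails on HTTP errors
  if ! curl -fsS --max-time 10 "$url/health" >/dev/null; then
    echo "health check failed for $url" >&2
    return 1
  fi
  echo "ok"
}

# Usage: health_check "https://your-domain.com"
```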
---

## Troubleshooting

### Build Failures on HF Spaces

**Issue:** Docker build fails

- Check the Dockerfile syntax
- Ensure all required files are committed
- Check the Space build logs

### "Secret not set" Errors

**Issue:** Application can't access secrets

- On Encore: use the `encore secret set` command
- On HF Spaces: configure secrets in the Space settings
- On Docker: pass secrets as environment variables (`-e` flag)

### Model Loading Timeout

**Issue:** HF models take too long to load

- Wait 30-60 seconds for the cold start
- Use smaller models for faster loading
- Check the model's availability on HF
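curl can also ride out the cold start itself: its `--retry` option treats HTTP 503 (what HF returns while a model loads) as transient. A sketch reusing the test request from Step 5; the base URL is your Space URL:

```bash
# Retry through the model cold start: curl's --retry treats
# HTTP 503 responses as transient and retries automatically.
chat_with_retry() {
  local base="$1"
  curl -sS --retry 5 --retry-delay 15 \
    -X POST "$base/chat" \
    -H "Content-Type: application/json" \
    -d '{"message": "Hello!"}'
}

# Usage: chat_with_retry "https://YOUR_USERNAME-YOUR_SPACE_NAME.hf.space"
```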
### Connection Refused (Ollama)

**Issue:** Can't connect to Ollama

- Ensure Ollama is running: `ollama serve`
- Check that `OllamaBaseURL` is correct
- For Docker: use `--network host`
---

## Updates and Maintenance

### Updating on Hugging Face Spaces

```bash
git pull origin main
# Make your changes
git add .
git commit -m "Update: description"
git push
```
### Updating on Encore Cloud

```bash
# Make changes
git commit -am "Update: description"
git push encore
```
### Updating Docker

```bash
# Rebuild image
docker build -t llm-api-backend .

# Stop old container
docker stop llm-api
docker rm llm-api

# Run new container
docker run -d [your flags] llm-api-backend
```
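The rebuild cycle above can be wrapped in one small script. A sketch; the run flags are whatever you chose in the Docker section, shown here with the Hugging Face example flags:

```bash
# Rebuild the image and replace the running container in one step.
update_container() {
  docker build -t llm-api-backend . || return 1
  docker stop llm-api && docker rm llm-api
  # Substitute your own -e flags from the "Run Container" step
  docker run -d -p 7860:7860 \
    -e LLMProvider=huggingface \
    -e HuggingFaceAPIKey=hf_your_token \
    --name llm-api \
    llm-api-backend
}
```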
---

## Additional Resources

- [Encore Documentation](https://encore.dev/docs)
- [Hugging Face Spaces Docs](https://huggingface.co/docs/hub/spaces)
- [Docker Documentation](https://docs.docker.com)
- [Ollama Documentation](https://ollama.ai/docs)

---

**Need help?** Open an issue or check the main README.md for support options.