# 🚀 Deployment Guide

This guide covers deploying the LLM API Backend to various platforms.

## 📋 Prerequisites

- Node.js 18+ installed
- Encore CLI installed (`curl -L https://encore.dev/install.sh | bash`)
- Git installed
- API keys for your chosen LLM provider

---

## 🌟 Hugging Face Spaces (Recommended for Demos)

### Step 1: Create a Space

1. Go to https://huggingface.co/spaces
2. Click **"Create new Space"**
3. Settings:
   - **Space name:** `llm-api-backend` (or your choice)
   - **SDK:** Docker
   - **Visibility:** Public or Private
4. Click **Create Space**

### Step 2: Clone and Push

```bash
# Clone your new Space
git clone https://huggingface.co/spaces/YOUR_USERNAME/YOUR_SPACE_NAME
cd YOUR_SPACE_NAME

# Copy all files from this project
cp -r /path/to/llm-api-backend/* .

# Commit and push
git add .
git commit -m "Initial deployment"
git push
```

### Step 3: Configure Secrets

1. Go to your Space page
2. Click **Settings** → **Repository secrets**
3. Add the following secrets:

**For Hugging Face Provider:**

```
LLMProvider = huggingface
HuggingFaceAPIKey = hf_your_token_here
DefaultModel = mistralai/Mistral-7B-Instruct-v0.2
```

**For Ollama Provider (requires custom Docker setup):**

```
LLMProvider = ollama
OllamaBaseURL = http://localhost:11434
DefaultModel = llama3
```

### Step 4: Wait for Build

- Hugging Face will automatically build your Docker container
- Watch the build logs in the Space interface
- Once complete, your API is live!

### Step 5: Test Your API

```bash
# Replace with your actual Space URL
export SPACE_URL="https://YOUR_USERNAME-YOUR_SPACE_NAME.hf.space"

# Test chat endpoint
curl -X POST $SPACE_URL/chat \
  -H "Content-Type: application/json" \
  -d '{"message": "Hello!"}'

# Test health endpoint
curl $SPACE_URL/health
```
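Free-tier Spaces sleep when idle, and models cold-start, so the first request after a deploy can take 30-60 seconds (see Troubleshooting below). Here is a minimal sketch of a wake-up-and-test helper; the script name `wait-for-space.sh` is hypothetical, and it assumes only the `/health` and `/chat` endpoints shown in Step 5.

```bash
#!/usr/bin/env bash
# wait-for-space.sh -- poll the Space until /health responds, then send a test chat.
# SPACE_URL placeholder follows the format from Step 5; override via the environment.
set -euo pipefail

SPACE_URL="${SPACE_URL:-https://YOUR_USERNAME-YOUR_SPACE_NAME.hf.space}"

# Retry /health for up to ~2 minutes to ride out a cold start.
for i in $(seq 1 24); do
  if curl -sf "$SPACE_URL/health" > /dev/null; then
    echo "Space is up after ~$((i * 5))s"
    break
  fi
  echo "Waiting for Space to wake up... (attempt $i)"
  sleep 5
done

# Send one test message; with -f and set -e, a failure exits nonzero.
curl -sf -X POST "$SPACE_URL/chat" \
  -H "Content-Type: application/json" \
  -d '{"message": "Hello!"}'
```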
---

## ☁️ Encore Cloud (Recommended for Production)

### Step 1: Install Encore

```bash
curl -L https://encore.dev/install.sh | bash
```

### Step 2: Create Encore App

```bash
# If starting fresh
encore app create

# Or link an existing app
encore app link
```

### Step 3: Set Secrets

The Encore CLI prompts for each secret's value:

```bash
# For Hugging Face
encore secret set --type dev,prod LLMProvider        # huggingface
encore secret set --type dev,prod HuggingFaceAPIKey  # hf_your_token_here
encore secret set --type dev,prod DefaultModel       # mistralai/Mistral-7B-Instruct-v0.2

# For Ollama (local development)
encore secret set --type dev LLMProvider             # ollama
encore secret set --type dev OllamaBaseURL           # http://localhost:11434
encore secret set --type dev DefaultModel            # llama3
```

### Step 4: Deploy

```bash
# Deploying is a git push to the Encore remote (set up by `encore app link`)
git add -A
git commit -m "Deploy"
git push encore
```

Encore Cloud builds and deploys to the environment configured for your branch; use the dashboard at https://app.encore.dev to promote a build to production.

### Step 5: Access Your API

```bash
# Your API will be available at:
# Staging: https://staging-YOUR_APP.encr.app
# Production: https://prod-YOUR_APP.encr.app

# Test it
curl https://staging-YOUR_APP.encr.app/health
```

---

## 🐳 Docker (Self-Hosted)

### Step 1: Build Image

```bash
docker build -t llm-api-backend .
```

### Step 2: Run Container

**Using Hugging Face:**

```bash
docker run -d \
  -p 7860:7860 \
  -e LLMProvider=huggingface \
  -e HuggingFaceAPIKey=hf_your_token \
  -e DefaultModel=mistralai/Mistral-7B-Instruct-v0.2 \
  --name llm-api \
  llm-api-backend
```

**Using Ollama (with host network):**

```bash
docker run -d \
  --network host \
  -e LLMProvider=ollama \
  -e OllamaBaseURL=http://localhost:11434 \
  -e DefaultModel=llama3 \
  --name llm-api \
  llm-api-backend
```

### Step 3: Test

```bash
curl http://localhost:7860/health
```

---

## 🖥️ VPS / Bare Metal

### Step 1: Install Dependencies

```bash
# Install Node.js 20
curl -fsSL https://deb.nodesource.com/setup_20.x | sudo -E bash -
sudo apt-get install -y nodejs

# Install the Encore CLI
curl -L https://encore.dev/install.sh | bash
```

### Step 2: Clone Repository

```bash
git clone https://github.com/YOUR_USERNAME/llm-api-backend.git
cd llm-api-backend
```

### Step 3: Configure Environment

```bash
# Copy the example env file
cp .env.example .env

# Edit with your values
nano .env
```

### Step 4: Set Encore Secrets

```bash
# The CLI prompts for each value
encore secret set --type dev,prod LLMProvider        # huggingface
encore secret set --type dev,prod HuggingFaceAPIKey  # hf_your_token
encore secret set --type dev,prod DefaultModel       # mistralai/Mistral-7B-Instruct-v0.2
```

### Step 5: Run with PM2 (Production)

```bash
# Install PM2
npm install -g pm2

# Start the application (pm2 start <binary> -- <args>)
pm2 start encore --name llm-api -- run --port=8080

# Save the PM2 process list
pm2 save

# Enable startup on boot
pm2 startup
```

### Step 6: Configure Nginx (Optional)

```nginx
server {
    listen 80;
    server_name your-domain.com;

    location / {
        proxy_pass http://localhost:8080;
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection 'upgrade';
        proxy_set_header Host $host;
        proxy_cache_bypass $http_upgrade;
    }
}
```

---

## 🔐 Security Checklist

Before deploying to production:

- [ ] All secrets configured properly
- [ ] API keys have appropriate permissions
- [ ] CORS configured for your frontend domains
- [ ] Rate limiting enabled (add middleware)
- [ ] HTTPS enabled (Encore Cloud and HF Spaces handle this)
- [ ] Environment variables not committed to git
- [ ] Monitoring and logging set up
- [ ] Error tracking configured

---

## 📊 Monitoring

### Encore Cloud

- Built-in dashboard at https://app.encore.dev
- Real-time traces, logs, and metrics
- Performance monitoring
- Error tracking

### Hugging Face Spaces

- View logs in the Space interface
- Use the `/health` endpoint for uptime monitoring
- Configure external monitoring tools

### Self-Hosted

- Use the `/health` endpoint (see the probe sketch below)
- Set up monitoring tools like:
  - Prometheus + Grafana
  - Datadog
  - New Relic
  - Sentry for errors
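For self-hosted deployments, the `/health` endpoint pairs naturally with a cron-based uptime probe. This is a minimal sketch, assuming the API from the VPS setup above listens on port 8080; the script name, log path, and alert hook are placeholders to replace with your own.

```bash
#!/usr/bin/env bash
# healthcheck.sh -- minimal uptime probe for the /health endpoint.
# API_URL and the alerting step are placeholders; wire in your own.
set -euo pipefail

API_URL="${API_URL:-http://localhost:8080}"
LOG_FILE="${LOG_FILE:-$HOME/llm-api-health.log}"

if ! curl -sf --max-time 10 "$API_URL/health" > /dev/null; then
  # Replace the log line with real alerting (Slack webhook, PagerDuty, email, ...).
  echo "$(date -Is) llm-api health check FAILED" >> "$LOG_FILE"
  exit 1
fi
echo "$(date -Is) llm-api health check OK" >> "$LOG_FILE"
```

You could then register it with cron, e.g. `*/5 * * * * /path/to/healthcheck.sh` (the path is an example).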
---

## 🆘 Troubleshooting

### Build Failures on HF Spaces

**Issue:** Docker build fails

```bash
# Check Dockerfile syntax
# Ensure all required files are committed
# Check the Space build logs
```

### "Secret not set" Errors

**Issue:** Application can't access secrets

```bash
# On Encore: use the 'encore secret set' command
# On HF Spaces: configure in Space settings
# On Docker: pass as environment variables (-e flag)
```

### Model Loading Timeout

**Issue:** HF models take too long to load

```bash
# Solution: wait 30-60 seconds for the cold start
# Use smaller models for faster loading
# Check model availability on HF
```

### Connection Refused (Ollama)

**Issue:** Can't connect to Ollama

```bash
# Ensure Ollama is running: ollama serve
# Check that OllamaBaseURL is correct
# For Docker: use --network host
```

---

## 🔄 Updates and Maintenance

### Updating on Hugging Face Spaces

```bash
git pull origin main
# Make your changes
git add .
git commit -m "Update: description"
git push
```

### Updating on Encore Cloud

```bash
# Make changes, then push to the Encore remote
git commit -am "Update: description"
git push encore
```

### Updating Docker

```bash
# Rebuild the image
docker build -t llm-api-backend .

# Stop and remove the old container
docker stop llm-api
docker rm llm-api

# Run the new container
docker run -d [your flags] llm-api-backend
```

---

## 📚 Additional Resources

- [Encore Documentation](https://encore.dev/docs)
- [Hugging Face Spaces Docs](https://huggingface.co/docs/hub/spaces)
- [Docker Documentation](https://docs.docker.com)
- [Ollama Documentation](https://ollama.ai/docs)

---

**Need help?** Open an issue or check the main README.md for support options.