# LLM API Backend - Hugging Face Spaces

A production-ready REST API for LLM capabilities including chat, RAG, and text analysis.
## Quick Deploy to Hugging Face Spaces

### Option 1: Using Hugging Face Spaces (Recommended)
1. **Create a new Space**
   - Go to [Hugging Face Spaces](https://huggingface.co/spaces)
   - Click "Create new Space"
   - Choose **Docker** as the SDK
   - Set visibility (Public or Private)
2. **Clone and push this repo**

   ```bash
   git clone https://huggingface.co/spaces/YOUR_USERNAME/YOUR_SPACE_NAME
   cd YOUR_SPACE_NAME
   # Copy all files from this project
   git add .
   git commit -m "Initial commit"
   git push
   ```
3. **Configure Secrets**
   - Go to your Space settings → Repository secrets
   - Add these secrets:

     ```
     LLMProvider=huggingface
     HuggingFaceAPIKey=hf_your_token_here
     DefaultModel=mistralai/Mistral-7B-Instruct-v0.2
     ```
4. **Your API is live!**
   - Access it at: `https://YOUR_USERNAME-YOUR_SPACE_NAME.hf.space`
### Option 2: Deploy Existing Encore App

Since this is already an Encore app, you can also:

```bash
# Deploy to Encore Cloud
encore deploy

# Then use the Encore API URL:
# https://proj_d3ggdgs82vjo5u1sek0g.api.lp.dev
```
## API Endpoints

All endpoints are available at your Space URL.

### Chat

```bash
curl -X POST https://YOUR_SPACE.hf.space/chat \
  -H "Content-Type: application/json" \
  -d '{"message": "Explain quantum computing"}'
```
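From TypeScript (the language this backend is written in), the same call can be sketched as below. Note that the response shape is not documented in this README, so this hypothetical client returns it untyped.

```typescript
// Minimal sketch of a typed client for the /chat endpoint.
// The response shape is NOT documented in this README, so we return `unknown`.

interface ChatRequest {
  message: string;
}

// Pure helper: build the URL and fetch options for a chat call.
function buildChatRequest(baseUrl: string, req: ChatRequest) {
  return {
    url: `${baseUrl}/chat`,
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify(req),
    },
  };
}

// Thin wrapper around fetch (global in Node 18+).
async function chat(baseUrl: string, message: string): Promise<unknown> {
  const { url, init } = buildChatRequest(baseUrl, { message });
  const res = await fetch(url, init);
  if (!res.ok) throw new Error(`chat request failed: ${res.status}`);
  return res.json();
}
```

Separating request construction from the network call keeps the JSON shape testable without a live Space.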
### RAG (Retrieval-Augmented Generation)

```bash
curl -X POST https://YOUR_SPACE.hf.space/rag \
  -H "Content-Type: application/json" \
  -d '{
    "query": "What is the main topic?",
    "context": [
      "Quantum computing uses quantum bits or qubits.",
      "Classical computers use binary bits."
    ]
  }'
```
### Text Analysis

```bash
curl -X POST https://YOUR_SPACE.hf.space/analyze \
  -H "Content-Type: application/json" \
  -d '{
    "text": "Your long text here...",
    "task": "summarize"
  }'
```

**Available tasks:** `summarize`, `evaluate`, `explain`, `extract`
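The task names above can be captured as a TypeScript union so an invalid task is caught at compile time; `buildAnalyzeBody` is an illustrative helper, not part of this codebase.

```typescript
// The four documented /analyze tasks as a union type.
type AnalysisTask = "summarize" | "evaluate" | "explain" | "extract";

// Build the JSON body for a /analyze call; an unknown task is a type error.
function buildAnalyzeBody(text: string, task: AnalysisTask): string {
  return JSON.stringify({ text, task });
}
```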
### List Models

```bash
curl https://YOUR_SPACE.hf.space/models
```

### Health Check

```bash
curl https://YOUR_SPACE.hf.space/health
```
## Configuration

### Environment Variables / Secrets

Required secrets in Hugging Face Spaces:

| Secret | Description | Example |
|--------|-------------|---------|
| `LLMProvider` | Provider to use | `huggingface` or `ollama` |
| `HuggingFaceAPIKey` | Your HF token | `hf_xxxxxxxxxxxxx` |
| `DefaultModel` | Default model | `mistralai/Mistral-7B-Instruct-v0.2` |
| `OllamaBaseURL` | Only if using Ollama | `http://localhost:11434` |
### Recommended Models for HF Spaces

- `mistralai/Mistral-7B-Instruct-v0.2` (fast, efficient)
- `microsoft/Phi-3-mini-4k-instruct` (compact)
- `meta-llama/Meta-Llama-3-8B-Instruct` (high quality)
- `google/gemma-7b-it` (versatile)
## Architecture

```
backend/
├── chat/                     # Chat endpoint
├── rag/                      # RAG endpoint
├── analyze/                  # Text analysis
├── models/                   # Model listing
├── health/                   # Health check
└── lib/
    ├── llm-provider.ts       # Provider abstraction
    ├── ollama-client.ts      # Ollama integration
    ├── huggingface-client.ts # HF integration
    ├── cache.ts              # In-memory caching
    └── types.ts              # TypeScript types
```
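A hypothetical sketch of the kind of abstraction `lib/llm-provider.ts` suggests: one interface both clients implement, selected by the `LLMProvider` secret. The real interface is not shown in this README, so names here are illustrative.

```typescript
// Illustrative provider abstraction (the actual lib/llm-provider.ts is not shown here).
interface LLMProvider {
  chat(message: string, model?: string): Promise<string>;
}

// Pick a provider by the LLMProvider secret value ("huggingface" or "ollama").
function selectProvider(
  name: string,
  providers: Record<string, LLMProvider>,
): LLMProvider {
  const provider = providers[name];
  if (!provider) throw new Error(`Unknown LLMProvider: ${name}`);
  return provider;
}
```

Registering both clients in one map keeps endpoint code identical regardless of which backend is configured.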
## Features

- ✅ **Dual Provider Support** - Ollama (local) or Hugging Face (cloud)
- ✅ **Smart Caching** - In-memory cache with TTL
- ✅ **Type-Safe** - Full TypeScript support
- ✅ **Production Ready** - Error handling, logging, monitoring
- ✅ **RESTful API** - Clean, consistent endpoints
- ✅ **Zero Config** - Works out of the box on HF Spaces
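To make "in-memory cache with TTL" concrete, here is a minimal sketch of such a cache; the actual `cache.ts` implementation is not shown in this README, and the eviction policy here (drop the oldest insertion) is an assumption.

```typescript
// Minimal in-memory cache with per-entry TTL and a max-entry cap.
// Illustrative only; cache.ts may differ (e.g. in its eviction policy).
class TTLCache<V> {
  private store = new Map<string, { value: V; expiresAt: number }>();

  constructor(private maxEntries: number, private ttlSeconds: number) {}

  get(key: string): V | undefined {
    const entry = this.store.get(key);
    if (!entry) return undefined;
    if (Date.now() > entry.expiresAt) {
      // Lazily expire stale entries on read.
      this.store.delete(key);
      return undefined;
    }
    return entry.value;
  }

  set(key: string, value: V): void {
    if (this.store.size >= this.maxEntries && !this.store.has(key)) {
      // Evict the oldest entry (Map preserves insertion order).
      const oldest = this.store.keys().next().value;
      if (oldest !== undefined) this.store.delete(oldest);
    }
    this.store.set(key, { value, expiresAt: Date.now() + this.ttlSeconds * 1000 });
  }

  get size(): number {
    return this.store.size;
  }
}
```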
## Security

- API keys stored as repository secrets
- No secrets in code or logs
- Rate limiting ready (middleware can be added)
- CORS configured
## Monitoring

Check API health:

```bash
curl https://YOUR_SPACE.hf.space/health
```

Returns:

```json
{
  "status": "healthy",
  "uptime": 3600,
  "provider": "huggingface",
  "modelsAvailable": true,
  "cache": {
    "chat": {"size": 10, "maxEntries": 100, "ttl": 300},
    "rag": {"size": 5, "maxEntries": 50, "ttl": 600},
    "analysis": {"size": 2, "maxEntries": 30, "ttl": 900}
  }
}
```
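The sample response above can be captured in a TypeScript type for use in monitoring scripts; the field names are taken directly from the README, but the type is otherwise an assumption about the API.

```typescript
// Types mirroring the documented /health response.
interface CacheStats {
  size: number;
  maxEntries: number;
  ttl: number;
}

interface HealthResponse {
  status: string;
  uptime: number;
  provider: string;
  modelsAvailable: boolean;
  cache: Record<string, CacheStats>;
}

// Parse a /health body and warn if the API is not healthy.
function parseHealth(json: string): HealthResponse {
  const health = JSON.parse(json) as HealthResponse;
  if (health.status !== "healthy") {
    console.warn(`API reported status: ${health.status}`);
  }
  return health;
}
```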
## Troubleshooting

### "Model loading" errors

- Wait 30-60 seconds for HF models to load
- Check that your HF token has access to the model
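Rather than waiting a fixed 30-60 seconds, a client can retry with exponential backoff until the model finishes loading. The helpers below are illustrative, not part of this backend; returning 503 while a model warms up is typical of Hugging Face inference, but verify against your Space's behavior.

```typescript
// Exponential backoff schedule: 1s, 2s, 4s, ... doubling each retry, capped.
function backoffDelaysMs(maxRetries: number, baseMs = 1000, capMs = 60_000): number[] {
  return Array.from({ length: maxRetries }, (_, i) => Math.min(baseMs * 2 ** i, capMs));
}

// Keep calling `attempt` until it reports ready, sleeping per the schedule.
async function retryUntilReady(
  attempt: () => Promise<boolean>,
  delays: number[] = backoffDelaysMs(5),
): Promise<boolean> {
  for (const delay of delays) {
    if (await attempt()) return true;
    await new Promise((resolve) => setTimeout(resolve, delay));
  }
  return attempt();
}
```

`attempt` would typically hit `/health` (or `/chat` with a short prompt) and return `true` once the response is no longer a 503.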
### "Secret not set" errors

- Verify all secrets are configured in Space settings
- Restart the Space after adding secrets

### API not responding

- Check the Space logs in the Hugging Face interface
- Verify the Docker build completed successfully
## License

MIT License - feel free to use it in your projects!

---

**Built with** [Encore.ts](https://encore.dev) | **Powered by** [Hugging Face](https://huggingface.co)