# 🤖 LLM API Backend - Hugging Face Spaces
A production-ready REST API for LLM capabilities including chat, RAG, and text analysis.
## 🚀 Quick Deploy to Hugging Face Spaces
### Option 1: Using Hugging Face Spaces (Recommended)
1. **Create a new Space**
- Go to [Hugging Face Spaces](https://huggingface.co/spaces)
- Click "Create new Space"
- Choose **Docker** as the SDK
- Set visibility (Public or Private)
2. **Clone and push this repo**
```bash
git clone https://huggingface.co/spaces/YOUR_USERNAME/YOUR_SPACE_NAME
cd YOUR_SPACE_NAME
# Copy all files from this project
git add .
git commit -m "Initial commit"
git push
```
3. **Configure Secrets**
- Go to your Space settings → Repository secrets
- Add these secrets:
```
LLMProvider=huggingface
HuggingFaceAPIKey=hf_your_token_here
DefaultModel=mistralai/Mistral-7B-Instruct-v0.2
```
4. **Your API is live!**
- Access at: `https://YOUR_USERNAME-YOUR_SPACE_NAME.hf.space`
### Option 2: Deploy Existing Encore App
Since this is already an Encore app, you can also:
```bash
# Deploy to Encore Cloud
encore deploy
# Then use the Encore API URL
# https://proj_d3ggdgs82vjo5u1sek0g.api.lp.dev
```
## 📡 API Endpoints
All endpoints are available at your Space URL:
### Chat
```bash
curl -X POST https://YOUR_SPACE.hf.space/chat \
-H "Content-Type: application/json" \
-d '{"message": "Explain quantum computing"}'
```
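The same chat request can be built from Python with only the standard library; a sketch for scripting against the API (the Space URL is a placeholder, and `build_chat_request` is an illustrative helper, not part of this project):

```python
import json
import urllib.request

def build_chat_request(base_url: str, message: str) -> urllib.request.Request:
    """Build (but do not send) a POST request for the /chat endpoint."""
    payload = json.dumps({"message": message}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/chat",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# Send it with: urllib.request.urlopen(build_chat_request(...))
req = build_chat_request("https://YOUR_SPACE.hf.space", "Explain quantum computing")
```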
### RAG (Retrieval-Augmented Generation)
```bash
curl -X POST https://YOUR_SPACE.hf.space/rag \
-H "Content-Type: application/json" \
-d '{
"query": "What is the main topic?",
"context": [
"Quantum computing uses quantum bits or qubits.",
"Classical computers use binary bits."
]
}'
```
### Text Analysis
```bash
curl -X POST https://YOUR_SPACE.hf.space/analyze \
-H "Content-Type: application/json" \
-d '{
"text": "Your long text here...",
"task": "summarize"
}'
```
**Available tasks:** `summarize`, `evaluate`, `explain`, `extract`
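When scripting against `/analyze`, it helps to validate the task name client-side before sending; a small Python sketch (the helper name is illustrative; only the four task names listed above come from the API):

```python
import json

# The four tasks documented for the /analyze endpoint
VALID_TASKS = {"summarize", "evaluate", "explain", "extract"}

def build_analyze_payload(text: str, task: str) -> str:
    """Return the JSON body for POST /analyze, rejecting unknown tasks."""
    if task not in VALID_TASKS:
        raise ValueError(f"unknown task {task!r}; expected one of {sorted(VALID_TASKS)}")
    return json.dumps({"text": text, "task": task})

body = build_analyze_payload("Your long text here...", "summarize")
```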
### List Models
```bash
curl https://YOUR_SPACE.hf.space/models
```
### Health Check
```bash
curl https://YOUR_SPACE.hf.space/health
```
## 🔧 Configuration
### Environment Variables / Secrets
Required secrets in Hugging Face Spaces:
| Secret | Description | Example |
|--------|-------------|---------|
| `LLMProvider` | Provider to use | `huggingface` or `ollama` |
| `HuggingFaceAPIKey` | Your HF token | `hf_xxxxxxxxxxxxx` |
| `DefaultModel` | Default model | `mistralai/Mistral-7B-Instruct-v0.2` |
| `OllamaBaseURL` | Only if using Ollama | `http://localhost:11434` |
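Startup validation of these secrets might look like the following Python sketch (the backend itself is TypeScript; the variable names mirror the table above, but `load_config` and its behavior are hypothetical):

```python
import os

# Secrets that must be present for every provider
REQUIRED = ["LLMProvider", "HuggingFaceAPIKey", "DefaultModel"]

def load_config(env=os.environ) -> dict:
    """Collect required secrets, raising a clear error for anything missing."""
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise RuntimeError(f"Secret not set: {', '.join(missing)}")
    config = {name: env[name] for name in REQUIRED}
    # OllamaBaseURL is only needed when LLMProvider=ollama
    if config["LLMProvider"] == "ollama":
        config["OllamaBaseURL"] = env.get("OllamaBaseURL", "http://localhost:11434")
    return config
```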
### Recommended Models for HF Spaces
- `mistralai/Mistral-7B-Instruct-v0.2` (Fast, efficient)
- `microsoft/phi-3-mini-4k-instruct` (Compact)
- `meta-llama/Meta-Llama-3-8B-Instruct` (High quality)
- `google/gemma-7b-it` (Versatile)
## 🏗️ Architecture
```
backend/
├── chat/        # Chat endpoint
├── rag/         # RAG endpoint
├── analyze/     # Text analysis
├── models/      # Model listing
├── health/      # Health check
└── lib/
    ├── llm-provider.ts        # Provider abstraction
    ├── ollama-client.ts       # Ollama integration
    ├── huggingface-client.ts  # HF integration
    ├── cache.ts               # In-memory caching
    └── types.ts               # TypeScript types
```
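The provider abstraction in `lib/llm-provider.ts` can be pictured with this Python sketch (class and method names are illustrative, not the project's actual TypeScript API):

```python
from abc import ABC, abstractmethod

class LLMProvider(ABC):
    """Common interface so /chat, /rag, and /analyze don't care which backend runs."""

    @abstractmethod
    def generate(self, prompt: str, model: str) -> str: ...

class HuggingFaceProvider(LLMProvider):
    def __init__(self, api_key: str):
        self.api_key = api_key

    def generate(self, prompt: str, model: str) -> str:
        # The real client would POST to the HF Inference API here.
        return f"[{model}] response to: {prompt}"

def get_provider(name: str, **kwargs) -> LLMProvider:
    """Pick a provider based on the LLMProvider secret."""
    if name == "huggingface":
        return HuggingFaceProvider(**kwargs)
    raise ValueError(f"unsupported provider: {name}")
```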
## 🎯 Features
- ✅ **Dual Provider Support** - Ollama (local) or Hugging Face (cloud)
- ✅ **Smart Caching** - In-memory cache with TTL
- ✅ **Type-Safe** - Full TypeScript support
- ✅ **Production Ready** - Error handling, logging, monitoring
- ✅ **RESTful API** - Clean, consistent endpoints
- ✅ **Zero Config** - Works out of the box on HF Spaces
## 🔒 Security
- API keys stored as repository secrets
- No secrets in code or logs
- Rate limiting ready (can add middleware)
- CORS configured
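The rate-limiting middleware hinted at above could be a per-client token bucket; a minimal Python sketch of the idea (capacity and refill rate are illustrative):

```python
import time

class TokenBucket:
    """Allow bursts up to `capacity` requests, refilling `rate` tokens per second."""

    def __init__(self, capacity: float, rate: float):
        self.capacity = capacity
        self.rate = rate
        self.tokens = capacity
        self.last = time.monotonic()

    def allow(self) -> bool:
        """Consume one token if available; return False when rate-limited."""
        now = time.monotonic()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

bucket = TokenBucket(capacity=5, rate=1.0)  # 5-request burst, 1 req/s sustained
```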
## 📊 Monitoring
Check API health:
```bash
curl https://YOUR_SPACE.hf.space/health
```
Returns:
```json
{
"status": "healthy",
"uptime": 3600,
"provider": "huggingface",
"modelsAvailable": true,
"cache": {
"chat": {"size": 10, "maxEntries": 100, "ttl": 300},
"rag": {"size": 5, "maxEntries": 50, "ttl": 600},
"analysis": {"size": 2, "maxEntries": 30, "ttl": 900}
}
}
```
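The `cache` section above reports per-endpoint in-memory TTL caches (`lib/cache.ts`); a Python sketch of the same idea, using the field names from the `/health` payload (the eviction policy shown is an assumption, as it isn't documented here):

```python
import time

class TTLCache:
    """In-memory cache that drops entries after `ttl` seconds, capped at `max_entries`."""

    def __init__(self, max_entries: int, ttl: float):
        self.max_entries = max_entries
        self.ttl = ttl
        self._store = {}  # key -> (expires_at, value)

    def set(self, key, value):
        if len(self._store) >= self.max_entries:
            # Assumed policy: evict the entry closest to expiry
            self._store.pop(min(self._store, key=lambda k: self._store[k][0]))
        self._store[key] = (time.monotonic() + self.ttl, value)

    def get(self, key, default=None):
        entry = self._store.get(key)
        if entry is None or entry[0] < time.monotonic():
            self._store.pop(key, None)
            return default
        return entry[1]

    def stats(self) -> dict:
        """Mirror the per-cache shape reported by /health."""
        return {"size": len(self._store), "maxEntries": self.max_entries, "ttl": self.ttl}

chat_cache = TTLCache(max_entries=100, ttl=300)
```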
## 🐛 Troubleshooting
### "Model loading" errors
- Wait 30-60 seconds for HF models to load
- Check your HF token has access to the model
### "Secret not set" errors
- Verify all secrets are configured in Space settings
- Restart the Space after adding secrets
### API not responding
- Check Space logs in the Hugging Face interface
- Verify Docker build completed successfully
## 📄 License
MIT License - feel free to use in your projects!
---
**Built with** [Encore.ts](https://encore.dev) | **Powered by** [Hugging Face](https://huggingface.co)