# πŸ€– LLM API Backend - Hugging Face Spaces

A production-ready REST API for LLM capabilities including chat, RAG, and text analysis.

## πŸš€ Quick Deploy to Hugging Face Spaces

### Option 1: Using Hugging Face Spaces (Recommended)

1. **Create a new Space**
   - Go to [Hugging Face Spaces](https://huggingface.co/spaces)
   - Click "Create new Space"
   - Choose **Docker** as the SDK
   - Set visibility (Public or Private)

2. **Clone and push this repo**
   ```bash
   git clone https://huggingface.co/spaces/YOUR_USERNAME/YOUR_SPACE_NAME
   cd YOUR_SPACE_NAME
   # Copy all files from this project
   git add .
   git commit -m "Initial commit"
   git push
   ```

3. **Configure Secrets**
   - Go to your Space settings β†’ Repository secrets
   - Add these secrets:
     ```
     LLMProvider=huggingface
     HuggingFaceAPIKey=hf_your_token_here
     DefaultModel=mistralai/Mistral-7B-Instruct-v0.2
     ```

4. **Your API is live!**
   - Access at: `https://YOUR_USERNAME-YOUR_SPACE_NAME.hf.space`

### Option 2: Deploy Existing Encore App

Since this is already an Encore app, you can also:

```bash
# Deploy to Encore Cloud
encore deploy

# Then use the Encore API URL
# https://proj_d3ggdgs82vjo5u1sek0g.api.lp.dev
```

## πŸ“‘ API Endpoints

All endpoints are available at your Space URL:

### Chat
```bash
curl -X POST https://YOUR_SPACE.hf.space/chat \
  -H "Content-Type: application/json" \
  -d '{"message": "Explain quantum computing"}'
```
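
If you prefer calling the endpoints from TypeScript instead of curl, a minimal client might look like the sketch below. The `response` field in the reply is an assumption for illustration; verify it against your Space's actual response shape.

```typescript
// Hypothetical typed client for the /chat endpoint. The ChatResponse
// shape ("response" field) is an assumption -- check your Space's output.
interface ChatRequest {
  message: string;
}

interface ChatResponse {
  response: string;
}

// fetchImpl is injectable so the client can be unit-tested without a network.
async function chat(
  baseUrl: string,
  req: ChatRequest,
  fetchImpl: typeof fetch = fetch,
): Promise<ChatResponse> {
  const res = await fetchImpl(`${baseUrl}/chat`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(req),
  });
  if (!res.ok) {
    throw new Error(`chat request failed with status ${res.status}`);
  }
  return (await res.json()) as ChatResponse;
}
```

Usage: `await chat("https://YOUR_SPACE.hf.space", { message: "Explain quantum computing" })`.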

### RAG (Retrieval-Augmented Generation)
```bash
curl -X POST https://YOUR_SPACE.hf.space/rag \
  -H "Content-Type: application/json" \
  -d '{
    "query": "What is the main topic?",
    "context": [
      "Quantum computing uses quantum bits or qubits.",
      "Classical computers use binary bits."
    ]
  }'
```

### Text Analysis
```bash
curl -X POST https://YOUR_SPACE.hf.space/analyze \
  -H "Content-Type: application/json" \
  -d '{
    "text": "Your long text here...",
    "task": "summarize"
  }'
```

**Available tasks:** `summarize`, `evaluate`, `explain`, `extract`

### List Models
```bash
curl https://YOUR_SPACE.hf.space/models
```

### Health Check
```bash
curl https://YOUR_SPACE.hf.space/health
```

## πŸ”§ Configuration

### Environment Variables / Secrets

Required secrets in Hugging Face Spaces:

| Secret | Description | Example |
|--------|-------------|---------|
| `LLMProvider` | Provider to use | `huggingface` or `ollama` |
| `HuggingFaceAPIKey` | Your HF token | `hf_xxxxxxxxxxxxx` |
| `DefaultModel` | Default model | `mistralai/Mistral-7B-Instruct-v0.2` |
| `OllamaBaseURL` | Only if using Ollama | `http://localhost:11434` |

### Recommended Models for HF Spaces

- `mistralai/Mistral-7B-Instruct-v0.2` (Fast, efficient)
- `microsoft/Phi-3-mini-4k-instruct` (Compact)
- `meta-llama/Meta-Llama-3-8B-Instruct` (High quality)
- `google/gemma-7b-it` (Versatile)

## πŸ—οΈ Architecture

```
backend/
β”œβ”€β”€ chat/          # Chat endpoint
β”œβ”€β”€ rag/           # RAG endpoint
β”œβ”€β”€ analyze/       # Text analysis
β”œβ”€β”€ models/        # Model listing
β”œβ”€β”€ health/        # Health check
└── lib/
    β”œβ”€β”€ llm-provider.ts       # Provider abstraction
    β”œβ”€β”€ ollama-client.ts      # Ollama integration
    β”œβ”€β”€ huggingface-client.ts # HF integration
    β”œβ”€β”€ cache.ts              # In-memory caching
    └── types.ts              # TypeScript types
```
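
The provider abstraction in `lib/llm-provider.ts` could be sketched roughly as follows. The interface, the `EchoProvider` stand-in, and `selectProvider` are illustrative assumptions, not the actual source; the real implementations would wrap the Ollama and Hugging Face clients.

```typescript
// Illustrative sketch of a provider abstraction (names are assumptions).
interface LLMProvider {
  readonly name: string;
  generate(prompt: string, model?: string): Promise<string>;
}

// Stand-in provider; real ones would call Ollama or the HF Inference API.
class EchoProvider implements LLMProvider {
  readonly name = "echo";
  async generate(prompt: string): Promise<string> {
    return `echo: ${prompt}`;
  }
}

// The LLMProvider secret would pick the implementation at startup.
function selectProvider(providerName: string): LLMProvider {
  switch (providerName) {
    case "echo":
      return new EchoProvider();
    default:
      throw new Error(`Unknown provider: ${providerName}`);
  }
}
```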

## 🎯 Features

βœ… **Dual Provider Support** - Ollama (local) or Hugging Face (cloud)  
βœ… **Smart Caching** - In-memory cache with TTL  
βœ… **Type-Safe** - Full TypeScript support  
βœ… **Production Ready** - Error handling, logging, monitoring  
βœ… **RESTful API** - Clean, consistent endpoints  
βœ… **Zero Config** - Works out of the box on HF Spaces  
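
The caching layer in `lib/cache.ts` behaves like a size-bounded in-memory cache with TTL. A minimal illustrative sketch is below; the field names (`maxEntries`, `ttl`) follow the `/health` output, but the implementation details are assumed.

```typescript
// Minimal sketch of a size-bounded TTL cache (implementation is assumed,
// not the actual lib/cache.ts source).
class TTLCache<V> {
  private entries = new Map<string, { value: V; expiresAt: number }>();

  constructor(private maxEntries: number, private ttlSeconds: number) {}

  get(key: string): V | undefined {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (Date.now() > entry.expiresAt) {
      this.entries.delete(key); // lazily evict expired entries on read
      return undefined;
    }
    return entry.value;
  }

  set(key: string, value: V): void {
    // When full, evict the oldest entry (Map preserves insertion order).
    if (this.entries.size >= this.maxEntries && !this.entries.has(key)) {
      const oldest = this.entries.keys().next().value;
      if (oldest !== undefined) this.entries.delete(oldest);
    }
    this.entries.set(key, {
      value,
      expiresAt: Date.now() + this.ttlSeconds * 1000,
    });
  }

  get size(): number {
    return this.entries.size;
  }
}
```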

## πŸ” Security

- API keys stored as repository secrets
- No secrets in code or logs
- Rate limiting ready (middleware can be added)
- CORS configured

## πŸ“Š Monitoring

Check API health:
```bash
curl https://YOUR_SPACE.hf.space/health
```

Returns:
```json
{
  "status": "healthy",
  "uptime": 3600,
  "provider": "huggingface",
  "modelsAvailable": true,
  "cache": {
    "chat": {"size": 10, "maxEntries": 100, "ttl": 300},
    "rag": {"size": 5, "maxEntries": 50, "ttl": 600},
    "analysis": {"size": 2, "maxEntries": 30, "ttl": 900}
  }
}
```

## πŸ†˜ Troubleshooting

### "Model loading" errors
- Wait 30-60 seconds for HF models to load
- Check your HF token has access to the model

### "Secret not set" errors
- Verify all secrets are configured in Space settings
- Restart the Space after adding secrets

### API not responding
- Check Space logs in the Hugging Face interface
- Verify Docker build completed successfully

## πŸ“ License

MIT License - feel free to use in your projects!

---

**Built with** [Encore.ts](https://encore.dev) | **Powered by** [Hugging Face](https://huggingface.co)