# LLM API Backend - Hugging Face Spaces
A production-ready REST API for LLM capabilities including chat, RAG, and text analysis.
## Quick Deploy to Hugging Face Spaces

### Option 1: Using Hugging Face Spaces (Recommended)
1. **Create a new Space**
   - Go to [Hugging Face Spaces](https://huggingface.co/spaces)
   - Click "Create new Space"
   - Choose **Docker** as the SDK
   - Set visibility (Public or Private)
2. **Clone and push this repo**

   ```bash
   git clone https://huggingface.co/spaces/YOUR_USERNAME/YOUR_SPACE_NAME
   cd YOUR_SPACE_NAME
   # Copy all files from this project
   git add .
   git commit -m "Initial commit"
   git push
   ```

3. **Configure Secrets**
   - Go to your Space settings → Repository secrets
   - Add these secrets:

   ```
   LLMProvider=huggingface
   HuggingFaceAPIKey=hf_your_token_here
   DefaultModel=mistralai/Mistral-7B-Instruct-v0.2
   ```
4. **Your API is live!**
   - Access at: `https://YOUR_USERNAME-YOUR_SPACE_NAME.hf.space`
### Option 2: Deploy Existing Encore App
Since this is already an Encore app, you can also:
```bash
# Deploy to Encore Cloud
encore deploy

# Then use the Encore API URL:
# https://proj_d3ggdgs82vjo5u1sek0g.api.lp.dev
```
## API Endpoints

All endpoints are available at your Space URL:
### Chat

```bash
curl -X POST https://YOUR_SPACE.hf.space/chat \
  -H "Content-Type: application/json" \
  -d '{"message": "Explain quantum computing"}'
```
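The same request can be issued from TypeScript. A minimal sketch, assuming a deployed Space; `SPACE_URL` and the `buildChatRequest` helper are illustrative names, not part of this project:

```typescript
// Illustrative chat-client sketch; SPACE_URL and buildChatRequest
// are hypothetical names, not part of this project's code.
const SPACE_URL = "https://YOUR_SPACE.hf.space";

// Build the fetch options for a POST to /chat.
function buildChatRequest(message: string): {
  method: string;
  headers: Record<string, string>;
  body: string;
} {
  return {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ message }),
  };
}

// Usage (requires network access to a deployed Space):
// const res = await fetch(`${SPACE_URL}/chat`, buildChatRequest("Explain quantum computing"));
// const data = await res.json();
```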
### RAG (Retrieval-Augmented Generation)

```bash
curl -X POST https://YOUR_SPACE.hf.space/rag \
  -H "Content-Type: application/json" \
  -d '{
    "query": "What is the main topic?",
    "context": [
      "Quantum computing uses quantum bits or qubits.",
      "Classical computers use binary bits."
    ]
  }'
```
### Text Analysis

```bash
curl -X POST https://YOUR_SPACE.hf.space/analyze \
  -H "Content-Type: application/json" \
  -d '{
    "text": "Your long text here...",
    "task": "summarize"
  }'
```
Available tasks: `summarize`, `evaluate`, `explain`, `extract`
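A small client-side guard can reject unsupported task names before calling `/analyze`. A sketch; `buildAnalyzeRequest` is an illustrative helper, not part of the project:

```typescript
// Allowed task values for the /analyze endpoint, per the docs above.
const ANALYZE_TASKS = ["summarize", "evaluate", "explain", "extract"] as const;
type AnalyzeTask = (typeof ANALYZE_TASKS)[number];

// Build the /analyze payload, rejecting unknown tasks early.
// buildAnalyzeRequest is a hypothetical helper, not part of this project.
function buildAnalyzeRequest(text: string, task: string): { text: string; task: AnalyzeTask } {
  if (!(ANALYZE_TASKS as readonly string[]).includes(task)) {
    throw new Error(`Unsupported task "${task}"; expected one of: ${ANALYZE_TASKS.join(", ")}`);
  }
  return { text, task: task as AnalyzeTask };
}
```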
### List Models

```bash
curl https://YOUR_SPACE.hf.space/models
```

### Health Check

```bash
curl https://YOUR_SPACE.hf.space/health
```
## Configuration

### Environment Variables / Secrets

Required secrets in Hugging Face Spaces:
| Secret | Description | Example |
|---|---|---|
| `LLMProvider` | Provider to use | `huggingface` or `ollama` |
| `HuggingFaceAPIKey` | Your HF token | `hf_xxxxxxxxxxxxx` |
| `DefaultModel` | Default model | `mistralai/Mistral-7B-Instruct-v0.2` |
| `OllamaBaseURL` | Only if using Ollama | `http://localhost:11434` |
### Recommended Models for HF Spaces

- `mistralai/Mistral-7B-Instruct-v0.2` (fast, efficient)
- `microsoft/phi-3-mini-4k-instruct` (compact)
- `meta-llama/Meta-Llama-3-8B-Instruct` (high quality)
- `google/gemma-7b-it` (versatile)
## Architecture

```
backend/
├── chat/                     # Chat endpoint
├── rag/                      # RAG endpoint
├── analyze/                  # Text analysis
├── models/                   # Model listing
├── health/                   # Health check
└── lib/
    ├── llm-provider.ts       # Provider abstraction
    ├── ollama-client.ts      # Ollama integration
    ├── huggingface-client.ts # HF integration
    ├── cache.ts              # In-memory caching
    └── types.ts              # TypeScript types
```
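The provider abstraction in `lib/llm-provider.ts` presumably exposes a common interface that both the Ollama and Hugging Face clients implement. A hedged sketch under that assumption; the interface and names here are illustrative, not the project's actual code:

```typescript
// Illustrative provider abstraction; the real llm-provider.ts may differ.
interface LLMProvider {
  name: string;
  generate(prompt: string, model?: string): Promise<string>;
}

// Pick a provider based on the LLMProvider secret value.
// The concrete clients would stand in for ollama-client.ts / huggingface-client.ts.
function selectProvider(kind: string, providers: LLMProvider[]): LLMProvider {
  const p = providers.find((x) => x.name === kind);
  if (!p) throw new Error(`Unknown provider "${kind}"`);
  return p;
}
```

This keeps the endpoint code provider-agnostic: `/chat`, `/rag`, and `/analyze` call `generate` without knowing which backend is configured.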
## Features

- **Dual Provider Support** - Ollama (local) or Hugging Face (cloud)
- **Smart Caching** - In-memory cache with TTL
- **Type-Safe** - Full TypeScript support
- **Production Ready** - Error handling, logging, monitoring
- **RESTful API** - Clean, consistent endpoints
- **Zero Config** - Works out of the box on HF Spaces
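The in-memory TTL cache mentioned above could look roughly like this. A minimal sketch only; the actual `backend/lib/cache.ts` may be implemented differently:

```typescript
// Minimal TTL cache sketch; the real backend/lib/cache.ts may differ.
class TTLCache<V> {
  private store = new Map<string, { value: V; expiresAt: number }>();

  constructor(private maxEntries: number, private ttlSeconds: number) {}

  get(key: string): V | undefined {
    const entry = this.store.get(key);
    if (!entry) return undefined;
    if (Date.now() > entry.expiresAt) {
      this.store.delete(key); // expired: drop and miss
      return undefined;
    }
    return entry.value;
  }

  set(key: string, value: V): void {
    // Evict the oldest entry when full (Map preserves insertion order).
    if (this.store.size >= this.maxEntries && !this.store.has(key)) {
      const oldest = this.store.keys().next().value;
      if (oldest !== undefined) this.store.delete(oldest);
    }
    this.store.set(key, { value, expiresAt: Date.now() + this.ttlSeconds * 1000 });
  }

  get size(): number {
    return this.store.size;
  }
}
```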
## Security
- API keys stored as repository secrets
- No secrets in code or logs
- Rate limiting ready (can add middleware)
- CORS configured
## Monitoring

Check API health:

```bash
curl https://YOUR_SPACE.hf.space/health
```

Returns:
```json
{
  "status": "healthy",
  "uptime": 3600,
  "provider": "huggingface",
  "modelsAvailable": true,
  "cache": {
    "chat": {"size": 10, "maxEntries": 100, "ttl": 300},
    "rag": {"size": 5, "maxEntries": 50, "ttl": 600},
    "analysis": {"size": 2, "maxEntries": 30, "ttl": 900}
  }
}
```
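A response like the one above can be checked programmatically, e.g. from a monitoring script. A sketch; the `HealthResponse` type simply mirrors the JSON shown, and `isHealthy` is an illustrative helper, not part of the project:

```typescript
// Shape of the /health response documented above.
interface HealthResponse {
  status: string;
  uptime: number;
  provider: string;
  modelsAvailable: boolean;
  cache: Record<string, { size: number; maxEntries: number; ttl: number }>;
}

// Report healthy only when the service says so AND models are reachable.
// isHealthy is a hypothetical helper, not part of this project.
function isHealthy(res: HealthResponse): boolean {
  return res.status === "healthy" && res.modelsAvailable;
}
```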
## Troubleshooting

### "Model loading" errors
- Wait 30-60 seconds for HF models to load
- Check your HF token has access to the model
### "Secret not set" errors
- Verify all secrets are configured in Space settings
- Restart the Space after adding secrets
### API not responding
- Check Space logs in the Hugging Face interface
- Verify Docker build completed successfully
## License
MIT License - feel free to use in your projects!
Built with Encore.ts | Powered by Hugging Face