
πŸ€– LLM API Backend - Hugging Face Spaces

A production-ready REST API for LLM capabilities including chat, RAG, and text analysis.

πŸš€ Quick Deploy to Hugging Face Spaces

Option 1: Using Hugging Face Spaces (Recommended)

  1. Create a new Space

    • Go to Hugging Face Spaces
    • Click "Create new Space"
    • Choose Docker as the SDK
    • Set visibility (Public or Private)
  2. Clone and push this repo

    git clone https://huggingface.co/spaces/YOUR_USERNAME/YOUR_SPACE_NAME
    cd YOUR_SPACE_NAME
    # Copy all files from this project
    git add .
    git commit -m "Initial commit"
    git push
    
  3. Configure Secrets

    • Go to your Space settings β†’ Repository secrets
    • Add these secrets:
      LLMProvider=huggingface
      HuggingFaceAPIKey=hf_your_token_here
      DefaultModel=mistralai/Mistral-7B-Instruct-v0.2
      
  4. Your API is live!

    • Access at: https://YOUR_USERNAME-YOUR_SPACE_NAME.hf.space

Option 2: Deploy Existing Encore App

Since this is already an Encore app, you can also:

# Deploy to Encore Cloud
encore deploy

# Then use the Encore API URL
https://proj_d3ggdgs82vjo5u1sek0g.api.lp.dev

πŸ“‘ API Endpoints

All endpoints are available at your Space URL:

Chat

curl -X POST https://YOUR_SPACE.hf.space/chat \
  -H "Content-Type: application/json" \
  -d '{"message": "Explain quantum computing"}'

RAG (Retrieval-Augmented Generation)

curl -X POST https://YOUR_SPACE.hf.space/rag \
  -H "Content-Type: application/json" \
  -d '{
    "query": "What is the main topic?",
    "context": [
      "Quantum computing uses quantum bits or qubits.",
      "Classical computers use binary bits."
    ]
  }'
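Under the hood, a RAG endpoint like this typically folds the supplied context passages into the prompt before calling the model. A minimal sketch of that assembly step (the function name and prompt layout are illustrative, not the actual backend/rag implementation):

```typescript
// Assemble a RAG prompt: numbered context passages followed by the query.
// Hypothetical helper; the real backend/rag service may format differently.
function buildRagPrompt(query: string, context: string[]): string {
  const passages = context
    .map((c, i) => `[${i + 1}] ${c}`)
    .join("\n");
  return `Answer using only the context below.\n\nContext:\n${passages}\n\nQuestion: ${query}\nAnswer:`;
}

// Example mirroring the curl request above:
const prompt = buildRagPrompt("What is the main topic?", [
  "Quantum computing uses quantum bits or qubits.",
  "Classical computers use binary bits.",
]);
```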

Text Analysis

curl -X POST https://YOUR_SPACE.hf.space/analyze \
  -H "Content-Type: application/json" \
  -d '{
    "text": "Your long text here...",
    "task": "summarize"
  }'

Available tasks: summarize, evaluate, explain, extract

List Models

curl https://YOUR_SPACE.hf.space/models

Health Check

curl https://YOUR_SPACE.hf.space/health

πŸ”§ Configuration

Environment Variables / Secrets

Required secrets in Hugging Face Spaces:

Secret              Description            Example
LLMProvider         Provider to use        huggingface or ollama
HuggingFaceAPIKey   Your HF token          hf_xxxxxxxxxxxxx
DefaultModel        Default model          mistralai/Mistral-7B-Instruct-v0.2
OllamaBaseURL       Only if using Ollama   http://localhost:11434

Recommended Models for HF Spaces

  • mistralai/Mistral-7B-Instruct-v0.2 (Fast, efficient)
  • microsoft/phi-3-mini-4k-instruct (Compact)
  • meta-llama/Meta-Llama-3-8B-Instruct (High quality)
  • google/gemma-7b-it (Versatile)

πŸ—οΈ Architecture

backend/
β”œβ”€β”€ chat/          # Chat endpoint
β”œβ”€β”€ rag/           # RAG endpoint
β”œβ”€β”€ analyze/       # Text analysis
β”œβ”€β”€ models/        # Model listing
β”œβ”€β”€ health/        # Health check
└── lib/
    β”œβ”€β”€ llm-provider.ts      # Provider abstraction
    β”œβ”€β”€ ollama-client.ts     # Ollama integration
    β”œβ”€β”€ huggingface-client.ts # HF integration
    β”œβ”€β”€ cache.ts             # In-memory caching
    └── types.ts             # TypeScript types
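The provider abstraction in lib/llm-provider.ts is what lets the same endpoints talk to either Ollama or Hugging Face. A hedged sketch of what such a contract might look like (the names are illustrative; the actual types live in lib/types.ts):

```typescript
// Illustrative provider interface: both clients implement the same contract,
// so the endpoint code stays provider-agnostic.
interface LLMProvider {
  name: string;
  generate(prompt: string, model?: string): Promise<string>;
}

// Stub provider standing in for huggingface-client.ts / ollama-client.ts.
class StubProvider implements LLMProvider {
  name = "stub";
  async generate(prompt: string): Promise<string> {
    return `echo: ${prompt}`;
  }
}

// Endpoints would receive whichever provider the LLMProvider secret selects.
const provider: LLMProvider = new StubProvider();
```

Swapping providers then means swapping one constructor call, not touching endpoint code.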

🎯 Features

βœ… Dual Provider Support - Ollama (local) or Hugging Face (cloud)
βœ… Smart Caching - In-memory cache with TTL
βœ… Type-Safe - Full TypeScript support
βœ… Production Ready - Error handling, logging, monitoring
βœ… RESTful API - Clean, consistent endpoints
βœ… Zero Config - Works out of the box on HF Spaces
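The in-memory cache with TTL mentioned above can be pictured as a small map of timestamped entries with oldest-first eviction. A minimal sketch under those assumptions (the real lib/cache.ts may differ):

```typescript
// Minimal TTL cache: entries expire after ttlMs, and the oldest entry is
// evicted once maxEntries is reached. Illustrative only.
class TTLCache<V> {
  private store = new Map<string, { value: V; expiresAt: number }>();
  constructor(private maxEntries: number, private ttlMs: number) {}

  set(key: string, value: V, now = Date.now()): void {
    if (this.store.size >= this.maxEntries && !this.store.has(key)) {
      // Map preserves insertion order, so the first key is the oldest.
      const oldest = this.store.keys().next().value;
      if (oldest !== undefined) this.store.delete(oldest);
    }
    this.store.set(key, { value, expiresAt: now + this.ttlMs });
  }

  get(key: string, now = Date.now()): V | undefined {
    const entry = this.store.get(key);
    if (!entry) return undefined;
    if (now > entry.expiresAt) {
      this.store.delete(key);
      return undefined;
    }
    return entry.value;
  }

  get size(): number {
    return this.store.size;
  }
}

// Matches the chat cache limits reported by /health: 100 entries, 300 s TTL.
const chatCache = new TTLCache<string>(100, 300_000);
chatCache.set("hello", "cached reply");
```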

πŸ” Security

  • API keys stored as repository secrets
  • No secrets in code or logs
  • Rate limiting ready (middleware can be added)
  • CORS configured

πŸ“Š Monitoring

Check API health:

curl https://YOUR_SPACE.hf.space/health

Returns:

{
  "status": "healthy",
  "uptime": 3600,
  "provider": "huggingface",
  "modelsAvailable": true,
  "cache": {
    "chat": {"size": 10, "maxEntries": 100, "ttl": 300},
    "rag": {"size": 5, "maxEntries": 50, "ttl": 600},
    "analysis": {"size": 2, "maxEntries": 30, "ttl": 900}
  }
}
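A monitoring script can parse this response and alert when the API degrades. A small sketch that validates the sample payload above (the field names are taken from the example response; treat them as assumptions about the full schema):

```typescript
// Shape of the /health response, inferred from the sample above.
interface HealthResponse {
  status: string;
  uptime: number;
  provider: string;
  modelsAvailable: boolean;
}

// Healthy means the service reports "healthy" and models are reachable.
function isHealthy(h: HealthResponse): boolean {
  return h.status === "healthy" && h.modelsAvailable;
}

const sample: HealthResponse = {
  status: "healthy",
  uptime: 3600,
  provider: "huggingface",
  modelsAvailable: true,
};
```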

πŸ†˜ Troubleshooting

"Model loading" errors

  • Wait 30-60 seconds for HF models to load
  • Check that your HF token has access to the model

"Secret not set" errors

  • Verify all secrets are configured in Space settings
  • Restart the Space after adding secrets

API not responding

  • Check Space logs in the Hugging Face interface
  • Verify Docker build completed successfully

πŸ“ License

MIT License - feel free to use in your projects!


Built with Encore.ts | Powered by Hugging Face