
# πŸš€ GPU Deployment Guide - Hugging Face Spaces

## Overview

Deploy the WidgeTDC backend to Hugging Face Spaces with a FREE GPU (NVIDIA T4, 16GB VRAM).


## πŸ“‹ Prerequisites

1. A Hugging Face account
2. GitHub repository secrets
   - Go to: Settings β†’ Secrets and variables β†’ Actions
   - Add the secrets listed in Step 3 below

πŸ” Step 1: Get Hugging Face Token

  1. Go to: https://huggingface.co/settings/tokens
  2. Click "New token"
  3. Name: GitHub Actions Deploy
  4. Type: Write access
  5. Copy the token
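Before wiring the token into CI, it can help to confirm it actually works. A minimal sketch, assuming network access and that `verify_hf_token` is a hypothetical helper name (not part of any official CLI); the `whoami-v2` endpoint returns your account details when the token is valid:

```shell
# Hypothetical sanity check for a Hugging Face token.
# The whoami-v2 endpoint responds with your account info if the
# bearer token is valid; curl -f makes an invalid token exit non-zero.
verify_hf_token() {
  curl -sf -H "Authorization: Bearer $1" https://huggingface.co/api/whoami-v2
}

# Usage: verify_hf_token "$HF_TOKEN"
```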

πŸ—οΈ Step 2: Create Hugging Face Space

  1. Go to: https://huggingface.co/new-space
  2. Fill in:
    • Owner: Your username
    • Space name: widgetdc (or your choice)
    • License: Apache 2.0
    • SDK: Docker
    • Hardware: T4 small (GPU)
    • Visibility: Private (or Public)
  3. Click "Create Space"

## πŸ”‘ Step 3: Add GitHub Secrets

In your GitHub repo, go to Settings β†’ Secrets and variables β†’ Actions and add:

**Secret 1: `HF_TOKEN`**

Value: the Hugging Face token from Step 1

**Secret 2: `HF_SPACE_NAME`**

Value: `YOUR_USERNAME/widgetdc` (e.g. `clauskraft/widgetdc`)
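If you prefer the terminal to the web UI, the two secrets can also be added with the GitHub CLI. A sketch, assuming `gh` is installed and authenticated (`gh auth login`) and that you run it inside the repo clone; `add_deploy_secrets` is a hypothetical helper name:

```shell
# Hypothetical helper: set both deployment secrets via the GitHub CLI.
# $1 = Hugging Face token, $2 = owner/space id (e.g. clauskraft/widgetdc)
add_deploy_secrets() {
  gh secret set HF_TOKEN --body "$1"
  gh secret set HF_SPACE_NAME --body "$2"
}

# Usage: add_deploy_secrets "hf_xxxx" "clauskraft/widgetdc"
```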

**Optional Secrets for Production:**

```
GEMINI_API_KEY=<your Gemini API key>
NEO4J_URI=<your Neo4j connection string>
NEO4J_USER=neo4j
NEO4J_PASSWORD=<your password>
POSTGRES_HOST=<your postgres host>
DATABASE_URL=<your postgres connection string>
```

## πŸš€ Step 4: Deploy!

### Automatic Deploy (on every push to main):

```bash
git push origin main
```
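The push-to-deploy behavior comes from a GitHub Actions workflow that mirrors the repo into the Space. A minimal sketch of what such a workflow could look like, assuming the file lives at `.github/workflows/deploy-hf.yml` and the secrets from Step 3 exist (your actual workflow file may differ):

```yaml
# Sketch of a deploy workflow; filename and job names are assumptions.
name: Deploy to Hugging Face (GPU)
on:
  push:
    branches: [main]
  workflow_dispatch:

jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0  # full history, required for a clean push
      - name: Push to HF Space
        env:
          HF_TOKEN: ${{ secrets.HF_TOKEN }}
          HF_SPACE_NAME: ${{ secrets.HF_SPACE_NAME }}
        run: |
          # Spaces are git repos; pushing to main triggers a rebuild
          git push --force \
            "https://user:${HF_TOKEN}@huggingface.co/spaces/${HF_SPACE_NAME}" \
            HEAD:main
```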

### Manual Deploy:

1. Go to the GitHub β†’ Actions tab
2. Select "Deploy to Hugging Face (GPU)"
3. Click "Run workflow"
4. Select branch: main
5. Click "Run workflow"

## πŸ“Š Step 5: Monitor Deployment

1. Check GitHub Actions: watch the workflow run in your repo's Actions tab
2. Check Hugging Face logs: open the "Logs" tab on your Space
3. Access your app:
   - URL: https://YOUR_USERNAME-widgetdc.hf.space
   - API: https://YOUR_USERNAME-widgetdc.hf.space/api

## 🎯 GPU Benefits

### What You Get:

- βœ… NVIDIA T4 GPU (16GB VRAM)
- βœ… CUDA 12.2 enabled
- βœ… PyTorch pre-installed
- βœ… Sentence Transformers for embeddings
- βœ… 10x faster AI inference
- βœ… FREE on Hugging Face Community

### What Runs on GPU:

1. Vector embeddings - sentence transformers
2. Knowledge graph embeddings - Node2Vec, GraphSAGE
3. LLM inference - Gemini/local models
4. Semantic search - FAISS/pgvector with GPU
5. Entity recognition - NER models

## πŸ”§ Configuration

### Environment Variables in HF Space:

Go to Space β†’ Settings β†’ Variables:

```
NODE_ENV=production
PORT=7860
USE_GPU=true
GEMINI_API_KEY=<your-key>
NEO4J_URI=<neo4j-uri>
DATABASE_URL=<postgres-url>
```

### GPU Settings in Space:

Edit the README.md front matter in your Space:

```yaml
---
title: WidgeTDC Neural Platform
sdk: docker
hardware: t4-small  # Options: cpu-basic, t4-small, t4-medium, a10g-small
---
```
**Hardware Options:**
- `cpu-basic` - Free, no GPU
- `t4-small` - Free GPU, NVIDIA T4, 16GB
- `t4-medium` - Paid, NVIDIA T4 with more vCPU/RAM
- `a10g-small` - Paid, NVIDIA A10G, 24GB

---

## πŸ§ͺ Test GPU Deployment

### 1. Check GPU Availability:
```bash
curl https://YOUR_USERNAME-widgetdc.hf.space/health
```

### 2. Test Embedding Generation:

```bash
curl -X POST https://YOUR_USERNAME-widgetdc.hf.space/api/srag/query \
  -H "Content-Type: application/json" \
  -d '{"query": "What is AI?", "limit": 5}'
```

### 3. Monitor GPU Usage:

Check the HF Space logs for:

```
βœ… GPU Available: NVIDIA T4
βœ… CUDA Version: 12.2
βœ… PyTorch GPU: True
```
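If you have shell access inside the container, you can also check for the GPU directly. A sketch that is safe to run anywhere, assuming the official NVIDIA `nvidia-smi` tool is present when a GPU is attached (`gpu_info` is a hypothetical helper name):

```shell
# Hypothetical container-side check: report the GPU via nvidia-smi when
# it is available, otherwise fall back to a CPU message.
gpu_info() {
  if command -v nvidia-smi >/dev/null 2>&1; then
    nvidia-smi --query-gpu=name,memory.total --format=csv,noheader
  else
    echo "No GPU detected (running on CPU)"
  fi
}

gpu_info
```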

## πŸ”„ Update Deployment

To update your deployed app:

1. Make changes locally
2. Commit and push:

   ```bash
   git add .
   git commit -m "feat: your changes"
   git push origin main
   ```

3. GitHub Actions auto-deploys to HF Spaces
4. Watch the logs in the Actions tab

πŸ› Troubleshooting

Issue: Build Fails

Solution: Check GitHub Actions logs for errors

### Issue: GPU Not Detected

**Solution:** Verify `hardware: t4-small` in the Space README.md

### Issue: Out of Memory

**Solution:**

- Reduce the batch size for embeddings
- Use the `--max-old-space-size=4096` Node flag
- Upgrade to `t4-medium`

### Issue: Slow Startup

**Solution:**

- Normal! GPU containers take 2-3 minutes to boot
- Check the "Logs" tab for progress

## πŸ“ˆ Alternative GPU Platforms

If you need more GPU power:

### Modal Labs (Serverless GPU)

- A100 GPUs (40GB/80GB)
- Pay per second
- Easy Python/Node.js deployment

### Railway (GPU Add-on)

- NVIDIA A10G (24GB)
- $10-50/month
- Better for production

### Runpod (Cheap GPU)

- A40/A100 available
- ~$0.39/hr for an A40
- Full Docker support

## βœ… Success Checklist

- [ ] Hugging Face account created
- [ ] Space created with GPU hardware
- [ ] GitHub secrets added (`HF_TOKEN`, `HF_SPACE_NAME`)
- [ ] Workflow file committed
- [ ] First deployment triggered
- [ ] App accessible at the HF Space URL
- [ ] GPU detected in logs
- [ ] API endpoints responding

## πŸŽ‰ You're Done!

Your WidgeTDC platform now runs on FREE GPU infrastructure! πŸš€

**Next Steps:**

- Monitor performance in HF Spaces
- Add more AI models
- Scale to a paid tier if needed
- Enjoy 10x faster AI inference!