
# πŸš€ GPU Deployment Guide - Hugging Face Spaces

## Overview

Deploy the WidgeTDC backend to Hugging Face Spaces with a FREE GPU (NVIDIA T4, 16GB VRAM).


## πŸ“‹ Prerequisites

1. A Hugging Face account
2. GitHub repository secrets
   - Go to: Settings β†’ Secrets and variables β†’ Actions
   - Add the secrets listed in Step 3 below

πŸ” Step 1: Get Hugging Face Token

  1. Go to: https://huggingface.co/settings/tokens
  2. Click "New token"
  3. Name: GitHub Actions Deploy
  4. Type: Write access
  5. Copy the token
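Before wiring the token into CI, it can help to confirm it actually works. A minimal sketch, assuming network access and that `verify_hf_token` is a hypothetical helper name (not part of any official CLI); the `whoami-v2` endpoint returns your account details when the token is valid:

```shell
# Hypothetical sanity check for a Hugging Face token.
# The whoami-v2 endpoint responds with your account info if the
# bearer token is valid; curl -f makes an invalid token exit non-zero.
verify_hf_token() {
  curl -sf -H "Authorization: Bearer $1" https://huggingface.co/api/whoami-v2
}

# Usage: verify_hf_token "$HF_TOKEN"
```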

πŸ—οΈ Step 2: Create Hugging Face Space

  1. Go to: https://huggingface.co/new-space
  2. Fill in:
    • Owner: Your username
    • Space name: widgetdc (or your choice)
    • License: Apache 2.0
    • SDK: Docker
    • Hardware: T4 small (GPU)
    • Visibility: Private (or Public)
  3. Click "Create Space"

## πŸ”‘ Step 3: Add GitHub Secrets

In your GitHub repo, go to Settings β†’ Secrets and variables β†’ Actions and add:

**Secret 1: `HF_TOKEN`**

Value: the Hugging Face token from Step 1

**Secret 2: `HF_SPACE_NAME`**

Value: `YOUR_USERNAME/widgetdc` (e.g. `clauskraft/widgetdc`)
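If you prefer the terminal to the web UI, the two secrets can also be added with the GitHub CLI. A sketch, assuming `gh` is installed and authenticated (`gh auth login`) and that you run it inside the repo clone; `add_deploy_secrets` is a hypothetical helper name:

```shell
# Hypothetical helper: set both deployment secrets via the GitHub CLI.
# $1 = Hugging Face token, $2 = owner/space id (e.g. clauskraft/widgetdc)
add_deploy_secrets() {
  gh secret set HF_TOKEN --body "$1"
  gh secret set HF_SPACE_NAME --body "$2"
}

# Usage: add_deploy_secrets "hf_xxxx" "clauskraft/widgetdc"
```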

**Optional Secrets for Production:**

```
GEMINI_API_KEY=<your Gemini API key>
NEO4J_URI=<your Neo4j connection string>
NEO4J_USER=neo4j
NEO4J_PASSWORD=<your password>
POSTGRES_HOST=<your postgres host>
DATABASE_URL=<your postgres connection string>
```

## πŸš€ Step 4: Deploy!

### Automatic Deploy (on every push to main):

```bash
git push origin main
```
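The push-to-deploy behavior comes from a GitHub Actions workflow that mirrors the repo into the Space. A minimal sketch of what such a workflow could look like, assuming the file lives at `.github/workflows/deploy-hf.yml` and the secrets from Step 3 exist (your actual workflow file may differ):

```yaml
# Sketch of a deploy workflow; filename and job names are assumptions.
name: Deploy to Hugging Face (GPU)
on:
  push:
    branches: [main]
  workflow_dispatch:

jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0  # full history, required for a clean push
      - name: Push to HF Space
        env:
          HF_TOKEN: ${{ secrets.HF_TOKEN }}
          HF_SPACE_NAME: ${{ secrets.HF_SPACE_NAME }}
        run: |
          # Spaces are git repos; pushing to main triggers a rebuild
          git push --force \
            "https://user:${HF_TOKEN}@huggingface.co/spaces/${HF_SPACE_NAME}" \
            HEAD:main
```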

### Manual Deploy:

1. Go to the GitHub β†’ Actions tab
2. Select "Deploy to Hugging Face (GPU)"
3. Click "Run workflow"
4. Select branch: main
5. Click "Run workflow"

## πŸ“Š Step 5: Monitor Deployment

1. Check GitHub Actions: watch the workflow run in your repo's Actions tab
2. Check Hugging Face logs: open the "Logs" tab on your Space
3. Access your app:
   - URL: https://YOUR_USERNAME-widgetdc.hf.space
   - API: https://YOUR_USERNAME-widgetdc.hf.space/api

## 🎯 GPU Benefits

### What You Get:

- βœ… NVIDIA T4 GPU (16GB VRAM)
- βœ… CUDA 12.2 enabled
- βœ… PyTorch pre-installed
- βœ… Sentence Transformers for embeddings
- βœ… 10x faster AI inference
- βœ… FREE on Hugging Face Community

### What Runs on GPU:

1. Vector embeddings - sentence transformers
2. Knowledge graph embeddings - Node2Vec, GraphSAGE
3. LLM inference - Gemini/local models
4. Semantic search - FAISS/pgvector with GPU
5. Entity recognition - NER models

## πŸ”§ Configuration

### Environment Variables in HF Space:

Go to Space β†’ Settings β†’ Variables:

```
NODE_ENV=production
PORT=7860
USE_GPU=true
GEMINI_API_KEY=<your-key>
NEO4J_URI=<neo4j-uri>
DATABASE_URL=<postgres-url>
```

### GPU Settings in Space:

Edit the README.md front matter in your Space:

```yaml
---
title: WidgeTDC Neural Platform
sdk: docker
hardware: t4-small  # Options: cpu-basic, t4-small, t4-medium, a10g-small
---
```
**Hardware Options:**
- `cpu-basic` - Free, no GPU
- `t4-small` - Free GPU, NVIDIA T4, 16GB
- `t4-medium` - Paid, NVIDIA T4 with more vCPU/RAM
- `a10g-small` - Paid, NVIDIA A10G, 24GB

---

## πŸ§ͺ Test GPU Deployment

### 1. Check GPU Availability:
```bash
curl https://YOUR_USERNAME-widgetdc.hf.space/health
```

### 2. Test Embedding Generation:

```bash
curl -X POST https://YOUR_USERNAME-widgetdc.hf.space/api/srag/query \
  -H "Content-Type: application/json" \
  -d '{"query": "What is AI?", "limit": 5}'
```

### 3. Monitor GPU Usage:

Check the HF Space logs for:

```
βœ… GPU Available: NVIDIA T4
βœ… CUDA Version: 12.2
βœ… PyTorch GPU: True
```
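If you have shell access inside the container, you can also check for the GPU directly. A sketch that is safe to run anywhere, assuming the official NVIDIA `nvidia-smi` tool is present when a GPU is attached (`gpu_info` is a hypothetical helper name):

```shell
# Hypothetical container-side check: report the GPU via nvidia-smi when
# it is available, otherwise fall back to a CPU message.
gpu_info() {
  if command -v nvidia-smi >/dev/null 2>&1; then
    nvidia-smi --query-gpu=name,memory.total --format=csv,noheader
  else
    echo "No GPU detected (running on CPU)"
  fi
}

gpu_info
```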

## πŸ”„ Update Deployment

To update your deployed app:

1. Make changes locally
2. Commit and push:

   ```bash
   git add .
   git commit -m "feat: your changes"
   git push origin main
   ```

3. GitHub Actions auto-deploys to HF Spaces
4. Watch the logs in the Actions tab

πŸ› Troubleshooting

Issue: Build Fails

Solution: Check GitHub Actions logs for errors

### Issue: GPU Not Detected

**Solution:** Verify `hardware: t4-small` in the Space README.md

### Issue: Out of Memory

**Solution:**

- Reduce the batch size for embeddings
- Use the `--max-old-space-size=4096` Node flag
- Upgrade to `t4-medium`

### Issue: Slow Startup

**Solution:**

- Normal! GPU containers take 2-3 minutes to boot
- Check the "Logs" tab for progress

## πŸ“ˆ Alternative GPU Platforms

If you need more GPU power:

### Modal Labs (Serverless GPU)

- A100 GPUs (40GB/80GB)
- Pay per second
- Easy Python/Node.js deployment

### Railway (GPU Add-on)

- NVIDIA A10G (24GB)
- $10-50/month
- Better for production

### Runpod (Cheap GPU)

- A40/A100 available
- ~$0.39/hr for an A40
- Full Docker support

## βœ… Success Checklist

- [ ] Hugging Face account created
- [ ] Space created with GPU hardware
- [ ] GitHub secrets added (`HF_TOKEN`, `HF_SPACE_NAME`)
- [ ] Workflow file committed
- [ ] First deployment triggered
- [ ] App accessible at the HF Space URL
- [ ] GPU detected in logs
- [ ] API endpoints responding

## πŸŽ‰ You're Done!

Your WidgeTDC platform now runs on FREE GPU infrastructure! πŸš€

**Next Steps:**

- Monitor performance in HF Spaces
- Add more AI models
- Scale to a paid tier if needed
- Enjoy 10x faster AI inference!