# πŸš€ GPU Deployment Guide - Hugging Face Spaces
## Overview
Deploy WidgeTDC backend to Hugging Face Spaces with **FREE GPU** (NVIDIA T4 16GB).
---
## πŸ“‹ Prerequisites
1. **Hugging Face Account**
- Sign up at: https://huggingface.co/join
- Free tier includes GPU access!
2. **GitHub Repository Secrets**
- Go to: `Settings` β†’ `Secrets and variables` β†’ `Actions`
   - Add the secrets listed in Step 3 below
---
## πŸ” Step 1: Get Hugging Face Token
1. Go to: https://huggingface.co/settings/tokens
2. Click **"New token"**
3. Name: `GitHub Actions Deploy`
4. Type: **Write** access
5. Copy the token
---
## πŸ—οΈ Step 2: Create Hugging Face Space
1. Go to: https://huggingface.co/new-space
2. Fill in:
- **Owner**: Your username
- **Space name**: `widgetdc` (or your choice)
- **License**: Apache 2.0
- **SDK**: Docker
- **Hardware**: **T4 small (GPU)**
- **Visibility**: Private (or Public)
3. Click **"Create Space"**
---
## πŸ”‘ Step 3: Add GitHub Secrets
Go to your GitHub repo β†’ `Settings` β†’ `Secrets and variables` β†’ `Actions`:
### Add Secret 1: `HF_TOKEN`
```
Value: <paste your Hugging Face token from Step 1>
```
### Add Secret 2: `HF_SPACE_NAME`
```
Value: YOUR_USERNAME/widgetdc
Example: clauskraft/widgetdc
```
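The value must follow the `owner/space` format shown above. A quick local sanity check (a hypothetical helper, not part of the repo) can catch typos before the workflow runs:

```python
import re

def valid_space_name(name: str) -> bool:
    """Return True if name matches the owner/space format (e.g. clauskraft/widgetdc)."""
    return re.fullmatch(r"[\w.-]+/[\w.-]+", name) is not None
```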
### Optional Secrets for Production:
```
GEMINI_API_KEY=<your Gemini API key>
NEO4J_URI=<your Neo4j connection string>
NEO4J_USER=neo4j
NEO4J_PASSWORD=<your password>
POSTGRES_HOST=<your postgres host>
DATABASE_URL=<your postgres connection string>
```
---
## πŸš€ Step 4: Deploy!
### Automatic Deploy (on every push to main):
```bash
git push origin main
```
### Manual Deploy:
1. Go to GitHub β†’ Actions tab
2. Select **"Deploy to Hugging Face (GPU)"**
3. Click **"Run workflow"**
4. Select branch: `main`
5. Click **"Run workflow"**
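The workflow behind this button is not shown here; a minimal sketch of what `.github/workflows/deploy-hf.yml` could look like, assuming the `HF_TOKEN` and `HF_SPACE_NAME` secrets from Step 3 (file name and step details are illustrative, not the repo's actual workflow):

```yaml
name: Deploy to Hugging Face (GPU)
on:
  push:
    branches: [main]
  workflow_dispatch:
jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0   # full history, required for pushing to the Space repo
      - name: Push to HF Space
        env:
          HF_TOKEN: ${{ secrets.HF_TOKEN }}
          HF_SPACE_NAME: ${{ secrets.HF_SPACE_NAME }}
        run: |
          git push --force \
            "https://user:${HF_TOKEN}@huggingface.co/spaces/${HF_SPACE_NAME}" \
            HEAD:main
```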
---
## πŸ“Š Step 5: Monitor Deployment
1. **Check GitHub Actions**:
- https://github.com/YOUR_USERNAME/WidgeTDC/actions
2. **Check Hugging Face Logs**:
- Go to your Space: https://huggingface.co/spaces/YOUR_USERNAME/widgetdc
- Click **"Logs"** tab
- Watch real-time build progress
3. **Access Your App**:
- URL: `https://YOUR_USERNAME-widgetdc.hf.space`
- API: `https://YOUR_USERNAME-widgetdc.hf.space/api`
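The subdomain pattern above (`YOUR_USERNAME-widgetdc.hf.space`) can be derived from the Space name; a small helper (hypothetical; lower-casing is an assumption based on the pattern shown) keeps scripts consistent with it:

```python
def space_urls(owner: str, space: str) -> dict:
    """Build app and API URLs from the owner-space subdomain pattern shown above."""
    base = f"https://{owner.lower()}-{space.lower()}.hf.space"
    return {"app": base, "api": f"{base}/api"}
```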
---
## 🎯 GPU Benefits
### What You Get:
- βœ… **NVIDIA T4 GPU** (16GB VRAM)
- βœ… **CUDA 12.2** enabled
- βœ… **PyTorch** pre-installed
- βœ… **Sentence Transformers** for embeddings
- βœ… **~10x faster** AI inference than CPU (workload-dependent)
- βœ… **FREE** on Hugging Face Community
### What Runs on GPU:
1. **Vector Embeddings** - Sentence transformers
2. **Knowledge Graph Embeddings** - Node2Vec, GraphSAGE
3. **LLM Inference** - Gemini/local models
4. **Semantic Search** - FAISS/pgvector with GPU
5. **Entity Recognition** - NER models
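As an illustration of the first item, a minimal embedding sketch (assumes `sentence-transformers` and PyTorch are installed; the model name is only an example) picks the GPU when CUDA is available and encodes in batches:

```python
def batches(items, size=32):
    """Split a list into fixed-size chunks for batched encoding."""
    return [items[i:i + size] for i in range(0, len(items), size)]

def embed(texts, model_name="sentence-transformers/all-MiniLM-L6-v2"):
    """Encode texts on the GPU when available, CPU otherwise."""
    import torch
    from sentence_transformers import SentenceTransformer
    device = "cuda" if torch.cuda.is_available() else "cpu"
    model = SentenceTransformer(model_name, device=device)
    return model.encode(texts, batch_size=32)
```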
---
## πŸ”§ Configuration
### Environment Variables in HF Space:
Go to Space β†’ Settings β†’ Variables:
```bash
NODE_ENV=production
PORT=7860
USE_GPU=true
GEMINI_API_KEY=<your-key>
NEO4J_URI=<neo4j-uri>
DATABASE_URL=<postgres-url>
```
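On the application side these variables can be read from the process environment; a sketch of a `USE_GPU` toggle (the variable name follows the block above; the helper itself is illustrative):

```python
import os

def use_gpu() -> bool:
    """True when USE_GPU is set to a truthy value (1/true/yes, case-insensitive)."""
    return os.environ.get("USE_GPU", "false").strip().lower() in {"1", "true", "yes"}
```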
### GPU Settings in Space:
Edit `README.md` in your Space:
```yaml
---
title: WidgeTDC Neural Platform
sdk: docker
hardware: t4-small # Options: cpu-basic, t4-small, t4-medium, a10g-small
---
```
**Hardware Options:**
- `cpu-basic` - Free, no GPU
- `t4-small` - Free GPU, NVIDIA T4, 16GB
- `t4-medium` - Paid, 2x T4
- `a10g-small` - Paid, NVIDIA A10G, 24GB
---
## πŸ§ͺ Test GPU Deployment
### 1. Check GPU Availability:
```bash
curl https://YOUR_USERNAME-widgetdc.hf.space/health
```
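A scripted version of this check, using only the standard library (hypothetical: it assumes `/health` returns JSON with `status` and `gpu` fields, which your actual payload may differ from):

```python
import json
import urllib.request

def summarize_health(payload: dict) -> str:
    """Render a one-line status from a /health JSON payload."""
    status = payload.get("status", "unknown")
    gpu = payload.get("gpu", "unknown")
    return f"status={status} gpu={gpu}"

def check(url: str) -> str:
    """Fetch the health endpoint and summarize the response."""
    with urllib.request.urlopen(url, timeout=10) as resp:
        return summarize_health(json.load(resp))
```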
### 2. Test Embedding Generation:
```bash
curl -X POST https://YOUR_USERNAME-widgetdc.hf.space/api/srag/query \
-H "Content-Type: application/json" \
-d '{"query": "What is AI?", "limit": 5}'
```
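The same request from Python, using only the standard library (endpoint and field names taken from the `curl` call above):

```python
import json
import urllib.request

def srag_payload(query: str, limit: int = 5) -> dict:
    """Build the request body for /api/srag/query."""
    return {"query": query, "limit": limit}

def query_srag(base_url: str, query: str, limit: int = 5) -> dict:
    """POST a query to /api/srag/query and return the decoded JSON response."""
    req = urllib.request.Request(
        f"{base_url}/api/srag/query",
        data=json.dumps(srag_payload(query, limit)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)
```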
### 3. Monitor GPU Usage:
Check HF Space logs for:
```
βœ… GPU Available: NVIDIA T4
βœ… CUDA Version: 12.2
βœ… PyTorch GPU: True
```
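Log lines like these could come from a small startup check; a sketch (the banner helper is illustrative, and the `torch` import is guarded so the script degrades gracefully on CPU-only machines):

```python
def gpu_banner(available: bool, device: str = "", cuda: str = "") -> list:
    """Format startup log lines describing GPU availability."""
    if not available:
        return ["⚠️ GPU not available, falling back to CPU"]
    return [
        f"βœ… GPU Available: {device}",
        f"βœ… CUDA Version: {cuda}",
        "βœ… PyTorch GPU: True",
    ]

if __name__ == "__main__":
    try:
        import torch
        ok = torch.cuda.is_available()
        name = torch.cuda.get_device_name(0) if ok else ""
        cuda = (torch.version.cuda or "") if ok else ""
    except ImportError:
        ok, name, cuda = False, "", ""
    for line in gpu_banner(ok, name, cuda):
        print(line)
```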
---
## πŸ”„ Update Deployment
To update your deployed app:
1. Make changes locally
2. Commit and push:
```bash
git add .
git commit -m "feat: your changes"
git push origin main
```
3. GitHub Actions auto-deploys to HF Spaces
4. Watch logs in Actions tab
---
## πŸ› Troubleshooting
### Issue: Build Fails
**Solution**: Check GitHub Actions logs for errors
### Issue: GPU Not Detected
**Solution**: Verify `hardware: t4-small` in Space README.md
### Issue: Out of Memory
**Solution**:
- Reduce batch size in embeddings
- Increase the Node.js heap with `--max-old-space-size=4096` (e.g. via `NODE_OPTIONS`)
- Upgrade to `t4-medium`
### Issue: Slow Startup
**Solution**:
- Normal! GPU containers take 2-3 minutes to boot
- Check "Logs" tab for progress
---
## πŸ“ˆ Alternative GPU Platforms
If you need more GPU power:
### **Modal Labs** (Serverless GPU)
- A100 GPUs (40GB/80GB)
- Pay per second
- Easy Python/Node.js deployment
### **Railway** (GPU Add-on)
- NVIDIA A10G (24GB)
- $10-50/month
- Better for production
### **Runpod** (Cheap GPU)
- A40/A100 available
- $0.39/hr for A40
- Full Docker support
---
## βœ… Success Checklist
- [ ] Hugging Face account created
- [ ] Space created with GPU hardware
- [ ] GitHub secrets added (HF_TOKEN, HF_SPACE_NAME)
- [ ] Workflow file committed
- [ ] First deployment triggered
- [ ] App accessible at HF Space URL
- [ ] GPU detected in logs
- [ ] API endpoints responding
---
## πŸŽ‰ You're Done!
Your WidgeTDC platform now runs on **FREE GPU** infrastructure! πŸš€
**Next Steps:**
- Monitor performance in HF Spaces
- Add more AI models
- Scale to paid tier if needed
- Enjoy 10x faster AI inference!