# πŸš€ GPU Deployment Guide - Hugging Face Spaces
## Overview
Deploy WidgeTDC backend to Hugging Face Spaces with **FREE GPU** (NVIDIA T4 16GB).
---
## πŸ“‹ Prerequisites
1. **Hugging Face Account**
- Sign up at: https://huggingface.co/join
- Free tier includes GPU access!
2. **GitHub Repository Secrets**
- Go to: `Settings` β†’ `Secrets and variables` β†’ `Actions`
   - Add the secrets listed in Step 3 below
---
## πŸ” Step 1: Get Hugging Face Token
1. Go to: https://huggingface.co/settings/tokens
2. Click **"New token"**
3. Name: `GitHub Actions Deploy`
4. Type: **Write** access
5. Copy the token
---
## πŸ—οΈ Step 2: Create Hugging Face Space
1. Go to: https://huggingface.co/new-space
2. Fill in:
- **Owner**: Your username
- **Space name**: `widgetdc` (or your choice)
- **License**: Apache 2.0
- **SDK**: Docker
- **Hardware**: **T4 small (GPU)**
- **Visibility**: Private (or Public)
3. Click **"Create Space"**
---
## πŸ”‘ Step 3: Add GitHub Secrets
Go to your GitHub repo β†’ `Settings` β†’ `Secrets and variables` β†’ `Actions`:
### Add Secret 1: `HF_TOKEN`
```
Value: <paste your Hugging Face token from Step 1>
```
### Add Secret 2: `HF_SPACE_NAME`
```
Value: YOUR_USERNAME/widgetdc
Example: clauskraft/widgetdc
```
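The value must follow the `owner/space` format shown above. A quick local sanity check (a hypothetical helper, not part of the repo) can catch typos before the workflow runs:

```python
import re

def valid_space_name(name: str) -> bool:
    """Return True if name matches the owner/space format (e.g. clauskraft/widgetdc)."""
    return re.fullmatch(r"[\w.-]+/[\w.-]+", name) is not None
```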
### Optional Secrets for Production:
```
GEMINI_API_KEY=<your Gemini API key>
NEO4J_URI=<your Neo4j connection string>
NEO4J_USER=neo4j
NEO4J_PASSWORD=<your password>
POSTGRES_HOST=<your postgres host>
DATABASE_URL=<your postgres connection string>
```
---
## πŸš€ Step 4: Deploy!
### Automatic Deploy (on every push to main):
```bash
git push origin main
```
### Manual Deploy:
1. Go to GitHub β†’ Actions tab
2. Select **"Deploy to Hugging Face (GPU)"**
3. Click **"Run workflow"**
4. Select branch: `main`
5. Click **"Run workflow"**
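The workflow behind this button is not shown here; a minimal sketch of what `.github/workflows/deploy-hf.yml` could look like, assuming the `HF_TOKEN` and `HF_SPACE_NAME` secrets from Step 3 (file name and step details are illustrative, not the repo's actual workflow):

```yaml
name: Deploy to Hugging Face (GPU)
on:
  push:
    branches: [main]
  workflow_dispatch:
jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0   # full history, required for pushing to the Space repo
      - name: Push to HF Space
        env:
          HF_TOKEN: ${{ secrets.HF_TOKEN }}
          HF_SPACE_NAME: ${{ secrets.HF_SPACE_NAME }}
        run: |
          git push --force \
            "https://user:${HF_TOKEN}@huggingface.co/spaces/${HF_SPACE_NAME}" \
            HEAD:main
```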
---
## πŸ“Š Step 5: Monitor Deployment
1. **Check GitHub Actions**:
- https://github.com/YOUR_USERNAME/WidgeTDC/actions
2. **Check Hugging Face Logs**:
- Go to your Space: https://huggingface.co/spaces/YOUR_USERNAME/widgetdc
- Click **"Logs"** tab
- Watch real-time build progress
3. **Access Your App**:
- URL: `https://YOUR_USERNAME-widgetdc.hf.space`
- API: `https://YOUR_USERNAME-widgetdc.hf.space/api`
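The subdomain pattern above (`YOUR_USERNAME-widgetdc.hf.space`) can be derived from the Space name; a small helper (hypothetical; lower-casing is an assumption based on the pattern shown) keeps scripts consistent with it:

```python
def space_urls(owner: str, space: str) -> dict:
    """Build app and API URLs from the owner-space subdomain pattern shown above."""
    base = f"https://{owner.lower()}-{space.lower()}.hf.space"
    return {"app": base, "api": f"{base}/api"}
```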
---
## 🎯 GPU Benefits
### What You Get:
- βœ… **NVIDIA T4 GPU** (16GB VRAM)
- βœ… **CUDA 12.2** enabled
- βœ… **PyTorch** pre-installed
- βœ… **Sentence Transformers** for embeddings
- βœ… **~10x faster** AI inference than CPU (workload-dependent)
- βœ… **FREE** on Hugging Face Community
### What Runs on GPU:
1. **Vector Embeddings** - Sentence transformers
2. **Knowledge Graph Embeddings** - Node2Vec, GraphSAGE
3. **LLM Inference** - Gemini/local models
4. **Semantic Search** - FAISS/pgvector with GPU
5. **Entity Recognition** - NER models
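As an illustration of the first item, a minimal embedding sketch (assumes `sentence-transformers` and PyTorch are installed; the model name is only an example) picks the GPU when CUDA is available and encodes in batches:

```python
def batches(items, size=32):
    """Split a list into fixed-size chunks for batched encoding."""
    return [items[i:i + size] for i in range(0, len(items), size)]

def embed(texts, model_name="sentence-transformers/all-MiniLM-L6-v2"):
    """Encode texts on the GPU when available, CPU otherwise."""
    import torch
    from sentence_transformers import SentenceTransformer
    device = "cuda" if torch.cuda.is_available() else "cpu"
    model = SentenceTransformer(model_name, device=device)
    return model.encode(texts, batch_size=32)
```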
---
## πŸ”§ Configuration
### Environment Variables in HF Space:
Go to Space β†’ Settings β†’ Variables:
```bash
NODE_ENV=production
PORT=7860
USE_GPU=true
GEMINI_API_KEY=<your-key>
NEO4J_URI=<neo4j-uri>
DATABASE_URL=<postgres-url>
```
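On the application side these variables can be read from the process environment; a sketch of a `USE_GPU` toggle (the variable name follows the block above; the helper itself is illustrative):

```python
import os

def use_gpu() -> bool:
    """True when USE_GPU is set to a truthy value (1/true/yes, case-insensitive)."""
    return os.environ.get("USE_GPU", "false").strip().lower() in {"1", "true", "yes"}
```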
### GPU Settings in Space:
Edit `README.md` in your Space:
```yaml
---
title: WidgeTDC Neural Platform
sdk: docker
hardware: t4-small # Options: cpu-basic, t4-small, t4-medium, a10g-small
---
```
**Hardware Options:**
- `cpu-basic` - Free, no GPU
- `t4-small` - Free GPU, NVIDIA T4, 16GB
- `t4-medium` - Paid, 2x T4
- `a10g-small` - Paid, NVIDIA A10G, 24GB
---
## πŸ§ͺ Test GPU Deployment
### 1. Check GPU Availability:
```bash
curl https://YOUR_USERNAME-widgetdc.hf.space/health
```
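A scripted version of this check, using only the standard library (hypothetical: it assumes `/health` returns JSON with `status` and `gpu` fields, which your actual payload may differ from):

```python
import json
import urllib.request

def summarize_health(payload: dict) -> str:
    """Render a one-line status from a /health JSON payload."""
    status = payload.get("status", "unknown")
    gpu = payload.get("gpu", "unknown")
    return f"status={status} gpu={gpu}"

def check(url: str) -> str:
    """Fetch the health endpoint and summarize the response."""
    with urllib.request.urlopen(url, timeout=10) as resp:
        return summarize_health(json.load(resp))
```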
### 2. Test Embedding Generation:
```bash
curl -X POST https://YOUR_USERNAME-widgetdc.hf.space/api/srag/query \
-H "Content-Type: application/json" \
-d '{"query": "What is AI?", "limit": 5}'
```
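The same request from Python, using only the standard library (endpoint and field names taken from the `curl` call above):

```python
import json
import urllib.request

def srag_payload(query: str, limit: int = 5) -> dict:
    """Build the request body for /api/srag/query."""
    return {"query": query, "limit": limit}

def query_srag(base_url: str, query: str, limit: int = 5) -> dict:
    """POST a query to /api/srag/query and return the decoded JSON response."""
    req = urllib.request.Request(
        f"{base_url}/api/srag/query",
        data=json.dumps(srag_payload(query, limit)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)
```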
### 3. Monitor GPU Usage:
Check HF Space logs for:
```
βœ… GPU Available: NVIDIA T4
βœ… CUDA Version: 12.2
βœ… PyTorch GPU: True
```
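Log lines like these could come from a small startup check; a sketch (the banner helper is illustrative, and the `torch` import is guarded so the script degrades gracefully on CPU-only machines):

```python
def gpu_banner(available: bool, device: str = "", cuda: str = "") -> list:
    """Format startup log lines describing GPU availability."""
    if not available:
        return ["⚠️ GPU not available, falling back to CPU"]
    return [
        f"βœ… GPU Available: {device}",
        f"βœ… CUDA Version: {cuda}",
        "βœ… PyTorch GPU: True",
    ]

if __name__ == "__main__":
    try:
        import torch
        ok = torch.cuda.is_available()
        name = torch.cuda.get_device_name(0) if ok else ""
        cuda = (torch.version.cuda or "") if ok else ""
    except ImportError:
        ok, name, cuda = False, "", ""
    for line in gpu_banner(ok, name, cuda):
        print(line)
```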
---
## πŸ”„ Update Deployment
To update your deployed app:
1. Make changes locally
2. Commit and push:
```bash
git add .
git commit -m "feat: your changes"
git push origin main
```
3. GitHub Actions auto-deploys to HF Spaces
4. Watch logs in Actions tab
---
## πŸ› Troubleshooting
### Issue: Build Fails
**Solution**: Check GitHub Actions logs for errors
### Issue: GPU Not Detected
**Solution**: Verify `hardware: t4-small` in Space README.md
### Issue: Out of Memory
**Solution**:
- Reduce batch size in embeddings
- Increase the Node.js heap with `--max-old-space-size=4096` (e.g. via `NODE_OPTIONS`)
- Upgrade to `t4-medium`
### Issue: Slow Startup
**Solution**:
- Normal! GPU containers take 2-3 minutes to boot
- Check "Logs" tab for progress
---
## πŸ“ˆ Alternative GPU Platforms
If you need more GPU power:
### **Modal Labs** (Serverless GPU)
- A100 GPUs (40GB/80GB)
- Pay per second
- Easy Python/Node.js deployment
### **Railway** (GPU Add-on)
- NVIDIA A10G (24GB)
- $10-50/month
- Better for production
### **Runpod** (Cheap GPU)
- A40/A100 available
- $0.39/hr for A40
- Full Docker support
---
## βœ… Success Checklist
- [ ] Hugging Face account created
- [ ] Space created with GPU hardware
- [ ] GitHub secrets added (HF_TOKEN, HF_SPACE_NAME)
- [ ] Workflow file committed
- [ ] First deployment triggered
- [ ] App accessible at HF Space URL
- [ ] GPU detected in logs
- [ ] API endpoints responding
---
## πŸŽ‰ You're Done!
Your WidgeTDC platform now runs on **FREE GPU** infrastructure! πŸš€
**Next Steps:**
- Monitor performance in HF Spaces
- Add more AI models
- Scale to paid tier if needed
- Enjoy 10x faster AI inference!