# GPU Deployment Guide - Hugging Face Spaces
## Overview
Deploy WidgeTDC backend to Hugging Face Spaces with **FREE GPU** (NVIDIA T4 16GB).
---
## Prerequisites
1. **Hugging Face Account**
- Sign up at: https://huggingface.co/join
- Free tier includes GPU access!
2. **GitHub Repository Secrets**
- Go to: `Settings` → `Secrets and variables` → `Actions`
- Add the following secrets
---
## Step 1: Get Hugging Face Token
1. Go to: https://huggingface.co/settings/tokens
2. Click **"New token"**
3. Name: `GitHub Actions Deploy`
4. Type: **Write** access
5. Copy the token
---
## Step 2: Create Hugging Face Space
1. Go to: https://huggingface.co/new-space
2. Fill in:
- **Owner**: Your username
- **Space name**: `widgetdc` (or your choice)
- **License**: Apache 2.0
- **SDK**: Docker
- **Hardware**: **T4 small (GPU)**
- **Visibility**: Private (or Public)
3. Click **"Create Space"**
---
## Step 3: Add GitHub Secrets
Go to your GitHub repo → Settings → Secrets and variables → Actions:
### Add Secret 1: `HF_TOKEN`
```
Value: <paste your Hugging Face token from Step 1>
```
### Add Secret 2: `HF_SPACE_NAME`
```
Value: YOUR_USERNAME/widgetdc
Example: clauskraft/widgetdc
```
### Optional Secrets for Production:
```
GEMINI_API_KEY=<your Gemini API key>
NEO4J_URI=<your Neo4j connection string>
NEO4J_USER=neo4j
NEO4J_PASSWORD=<your password>
POSTGRES_HOST=<your postgres host>
DATABASE_URL=<your postgres connection string>
```
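Before saving, you can sanity-check the `HF_SPACE_NAME` value locally. The shell variable and regex below are illustrative only, using the example value from this guide:

```shell
# Check that the value follows the OWNER/SPACE pattern expected by the deploy.
hf_space_name="clauskraft/widgetdc"
if printf '%s' "$hf_space_name" | grep -Eq '^[A-Za-z0-9_.-]+/[A-Za-z0-9_.-]+$'; then
  echo "HF_SPACE_NAME looks valid"
else
  echo "HF_SPACE_NAME must be in OWNER/SPACE form" >&2
fi
```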
---
## Step 4: Deploy!
### Automatic Deploy (on every push to main):
```bash
git push origin main
```
### Manual Deploy:
1. Go to GitHub β Actions tab
2. Select **"Deploy to Hugging Face (GPU)"**
3. Click **"Run workflow"**
4. Select branch: `main`
5. Click **"Run workflow"**
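Both triggers assume a workflow file along these lines. This is a sketch, not the repository's actual file; the push-URL auth shown is the common `user:token` pattern for Hugging Face git remotes:

```yaml
name: Deploy to Hugging Face (GPU)
on:
  push:
    branches: [main]
  workflow_dispatch:

jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0   # full history, so the push to the Space is clean
      - name: Push to Space
        env:
          HF_TOKEN: ${{ secrets.HF_TOKEN }}
          HF_SPACE_NAME: ${{ secrets.HF_SPACE_NAME }}
        run: |
          git push --force \
            "https://user:${HF_TOKEN}@huggingface.co/spaces/${HF_SPACE_NAME}" \
            HEAD:main
```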
---
## Step 5: Monitor Deployment
1. **Check GitHub Actions**:
- https://github.com/YOUR_USERNAME/WidgeTDC/actions
2. **Check Hugging Face Logs**:
- Go to your Space: https://huggingface.co/spaces/YOUR_USERNAME/widgetdc
- Click **"Logs"** tab
- Watch real-time build progress
3. **Access Your App**:
- URL: `https://YOUR_USERNAME-widgetdc.hf.space`
- API: `https://YOUR_USERNAME-widgetdc.hf.space/api`
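The subdomain is simply the Space id with the slash replaced by a dash. A small shell sketch, using the example owner from Step 3:

```shell
# Build the public Space URLs from the "<owner>/<space>" id.
space_id="clauskraft/widgetdc"
subdomain=$(printf '%s' "$space_id" | tr '/' '-')
app_url="https://${subdomain}.hf.space"
api_url="${app_url}/api"
echo "$app_url"   # https://clauskraft-widgetdc.hf.space
```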
---
## GPU Benefits
### What You Get:
- **NVIDIA T4 GPU** (16GB VRAM)
- **CUDA 12.2** enabled
- **PyTorch** pre-installed
- **Sentence Transformers** for embeddings
- **10x faster** AI inference
- **FREE** on Hugging Face Community
### What Runs on GPU:
1. **Vector Embeddings** - Sentence transformers
2. **Knowledge Graph Embeddings** - Node2Vec, GraphSAGE
3. **LLM Inference** - Gemini/local models
4. **Semantic Search** - FAISS/pgvector with GPU
5. **Entity Recognition** - NER models
---
## Configuration
### Environment Variables in HF Space:
Go to Space → Settings → Variables:
```bash
NODE_ENV=production
PORT=7860
USE_GPU=true
GEMINI_API_KEY=<your-key>
NEO4J_URI=<neo4j-uri>
DATABASE_URL=<postgres-url>
```
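How the server consumes `USE_GPU` is up to the entrypoint. A hypothetical sketch of such a branch; the actual WidgeTDC startup code may differ:

```shell
# Hypothetical entrypoint logic: pick a mode from the USE_GPU variable.
USE_GPU=true   # in the Space this would come from the environment variables
if [ "$USE_GPU" = "true" ]; then
  mode="GPU acceleration"
else
  mode="CPU mode"
fi
echo "starting with $mode"   # starting with GPU acceleration
```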
### GPU Settings in Space:
Edit `README.md` in your Space:
```yaml
---
title: WidgeTDC Neural Platform
sdk: docker
hardware: t4-small # Options: cpu-basic, t4-small, t4-medium, a10g-small
---
```
**Hardware Options:**
- `cpu-basic` - Free, no GPU
- `t4-small` - Free GPU, NVIDIA T4, 16GB
- `t4-medium` - Paid, 2x T4
- `a10g-small` - Paid, NVIDIA A10G, 24GB
---
## Test GPU Deployment
### 1. Check GPU Availability:
```bash
curl https://YOUR_USERNAME-widgetdc.hf.space/health
```
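During the 2-3 minute GPU boot the endpoint may refuse connections, so a small retry loop helps. `wait_for` is a hypothetical helper; the `/health` route is the one from this guide:

```shell
# Poll a command until it succeeds, up to a fixed number of attempts.
wait_for() {
  tries=$1; shift
  i=0
  while [ "$i" -lt "$tries" ]; do
    if "$@"; then return 0; fi
    i=$((i + 1))
    sleep 2
  done
  return 1
}

# Real usage once deployed (uncomment and fill in your username):
# wait_for 30 curl -sf https://YOUR_USERNAME-widgetdc.hf.space/health
wait_for 3 true && echo "service is up"
```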
### 2. Test Embedding Generation:
```bash
curl -X POST https://YOUR_USERNAME-widgetdc.hf.space/api/srag/query \
-H "Content-Type: application/json" \
-d '{"query": "What is AI?", "limit": 5}'
```
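The exact JSON shape returned by `/api/srag/query` is an assumption here; this just shows pulling a score field out of a captured response without extra tooling:

```shell
# Extract "score" values from a saved response body with grep.
response='{"results":[{"text":"AI is machine intelligence","score":0.92}]}'
echo "$response" | grep -o '"score":[0-9.]*'   # "score":0.92
```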
### 3. Monitor GPU Usage:
Check HF Space logs for:
```
GPU Available: NVIDIA T4
CUDA Version: 12.2
PyTorch GPU: True
```
---
## Update Deployment
To update your deployed app:
1. Make changes locally
2. Commit and push:
```bash
git add .
git commit -m "feat: your changes"
git push origin main
```
3. GitHub Actions auto-deploys to HF Spaces
4. Watch logs in Actions tab
---
## Troubleshooting
### Issue: Build Fails
**Solution**: Check GitHub Actions logs for errors
### Issue: GPU Not Detected
**Solution**: Verify `hardware: t4-small` in Space README.md
### Issue: Out of Memory
**Solution**:
- Reduce batch size in embeddings
- Use `--max-old-space-size=4096` flag
- Upgrade to `t4-medium`
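For the `--max-old-space-size` fix, the flag is usually passed to Node.js via `NODE_OPTIONS` (value in MB), set as a Space variable or in the container entrypoint:

```shell
# Raise the Node.js heap ceiling to ~4 GB for the server process.
export NODE_OPTIONS="--max-old-space-size=4096"
echo "$NODE_OPTIONS"   # --max-old-space-size=4096
```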
### Issue: Slow Startup
**Solution**:
- Normal! GPU containers take 2-3 minutes to boot
- Check "Logs" tab for progress
---
## Alternative GPU Platforms
If you need more GPU power:
### **Modal Labs** (Serverless GPU)
- A100 GPUs (40GB/80GB)
- Pay per second
- Easy Python/Node.js deployment
### **Railway** (GPU Add-on)
- NVIDIA A10G (24GB)
- $10-50/month
- Better for production
### **Runpod** (Cheap GPU)
- A40/A100 available
- $0.39/hr for A40
- Full Docker support
---
## Success Checklist
- [ ] Hugging Face account created
- [ ] Space created with GPU hardware
- [ ] GitHub secrets added (HF_TOKEN, HF_SPACE_NAME)
- [ ] Workflow file committed
- [ ] First deployment triggered
- [ ] App accessible at HF Space URL
- [ ] GPU detected in logs
- [ ] API endpoints responding
---
## You're Done!
Your WidgeTDC platform now runs on **FREE GPU** infrastructure!
**Next Steps:**
- Monitor performance in HF Spaces
- Add more AI models
- Scale to paid tier if needed
- Enjoy 10x faster AI inference!