# GPU Deployment Guide - Hugging Face Spaces
## Overview
Deploy WidgeTDC backend to Hugging Face Spaces with **FREE GPU** (NVIDIA T4 16GB).
---
## Prerequisites
1. **Hugging Face Account**
- Sign up at: https://huggingface.co/join
- Free tier includes GPU access!
2. **GitHub Repository Secrets**
- Go to: `Settings` → `Secrets and variables` → `Actions`
- Add the following secrets
---
## Step 1: Get Hugging Face Token
1. Go to: https://huggingface.co/settings/tokens
2. Click **"New token"**
3. Name: `GitHub Actions Deploy`
4. Type: **Write** access
5. Copy the token
---
## Step 2: Create Hugging Face Space
1. Go to: https://huggingface.co/new-space
2. Fill in:
- **Owner**: Your username
- **Space name**: `widgetdc` (or your choice)
- **License**: Apache 2.0
- **SDK**: Docker
- **Hardware**: **T4 small (GPU)**
- **Visibility**: Private (or Public)
3. Click **"Create Space"**
---
## Step 3: Add GitHub Secrets
Go to your GitHub repo → Settings → Secrets and variables → Actions:
### Add Secret 1: `HF_TOKEN`
```
Value: <paste your Hugging Face token from Step 1>
```
### Add Secret 2: `HF_SPACE_NAME`
```
Value: YOUR_USERNAME/widgetdc
Example: clauskraft/widgetdc
```
### Optional Secrets for Production:
```
GEMINI_API_KEY=<your Gemini API key>
NEO4J_URI=<your Neo4j connection string>
NEO4J_USER=neo4j
NEO4J_PASSWORD=<your password>
POSTGRES_HOST=<your postgres host>
DATABASE_URL=<your postgres connection string>
```
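Before saving, you can sanity-check the `HF_SPACE_NAME` value locally. The shell variable and regex below are illustrative only, using the example value from this guide:

```shell
# Check that the value follows the OWNER/SPACE pattern expected by the deploy.
hf_space_name="clauskraft/widgetdc"
if printf '%s' "$hf_space_name" | grep -Eq '^[A-Za-z0-9_.-]+/[A-Za-z0-9_.-]+$'; then
  echo "HF_SPACE_NAME looks valid"
else
  echo "HF_SPACE_NAME must be in OWNER/SPACE form" >&2
fi
```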
---
## Step 4: Deploy!
### Automatic Deploy (on every push to main):
```bash
git push origin main
```
### Manual Deploy:
1. Go to GitHub β Actions tab
2. Select **"Deploy to Hugging Face (GPU)"**
3. Click **"Run workflow"**
4. Select branch: `main`
5. Click **"Run workflow"**
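Both triggers assume a workflow file along these lines. This is a sketch, not the repository's actual file; the push-URL auth shown is the common `user:token` pattern for Hugging Face git remotes:

```yaml
name: Deploy to Hugging Face (GPU)
on:
  push:
    branches: [main]
  workflow_dispatch:

jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0   # full history, so the push to the Space is clean
      - name: Push to Space
        env:
          HF_TOKEN: ${{ secrets.HF_TOKEN }}
          HF_SPACE_NAME: ${{ secrets.HF_SPACE_NAME }}
        run: |
          git push --force \
            "https://user:${HF_TOKEN}@huggingface.co/spaces/${HF_SPACE_NAME}" \
            HEAD:main
```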
---
## Step 5: Monitor Deployment
1. **Check GitHub Actions**:
- https://github.com/YOUR_USERNAME/WidgeTDC/actions
2. **Check Hugging Face Logs**:
- Go to your Space: https://huggingface.co/spaces/YOUR_USERNAME/widgetdc
- Click **"Logs"** tab
- Watch real-time build progress
3. **Access Your App**:
- URL: `https://YOUR_USERNAME-widgetdc.hf.space`
- API: `https://YOUR_USERNAME-widgetdc.hf.space/api`
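The subdomain is simply the Space id with the slash replaced by a dash. A small shell sketch, using the example owner from Step 3:

```shell
# Build the public Space URLs from the "<owner>/<space>" id.
space_id="clauskraft/widgetdc"
subdomain=$(printf '%s' "$space_id" | tr '/' '-')
app_url="https://${subdomain}.hf.space"
api_url="${app_url}/api"
echo "$app_url"   # https://clauskraft-widgetdc.hf.space
```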
---
## GPU Benefits
### What You Get:
- **NVIDIA T4 GPU** (16GB VRAM)
- **CUDA 12.2** enabled
- **PyTorch** pre-installed
- **Sentence Transformers** for embeddings
- **10x faster** AI inference
- **FREE** on Hugging Face Community
### What Runs on GPU:
1. **Vector Embeddings** - Sentence transformers
2. **Knowledge Graph Embeddings** - Node2Vec, GraphSAGE
3. **LLM Inference** - Gemini/local models
4. **Semantic Search** - FAISS/pgvector with GPU
5. **Entity Recognition** - NER models
---
## Configuration
### Environment Variables in HF Space:
Go to Space → Settings → Variables:
```bash
NODE_ENV=production
PORT=7860
USE_GPU=true
GEMINI_API_KEY=<your-key>
NEO4J_URI=<neo4j-uri>
DATABASE_URL=<postgres-url>
```
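How the server consumes `USE_GPU` is up to the entrypoint. A hypothetical sketch of such a branch; the actual WidgeTDC startup code may differ:

```shell
# Hypothetical entrypoint logic: pick a mode from the USE_GPU variable.
USE_GPU=true   # in the Space this would come from the environment variables
if [ "$USE_GPU" = "true" ]; then
  mode="GPU acceleration"
else
  mode="CPU mode"
fi
echo "starting with $mode"   # starting with GPU acceleration
```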
### GPU Settings in Space:
Edit `README.md` in your Space:
```yaml
---
title: WidgeTDC Neural Platform
sdk: docker
hardware: t4-small # Options: cpu-basic, t4-small, t4-medium, a10g-small
---
```
**Hardware Options:**
- `cpu-basic` - Free, no GPU
- `t4-small` - Free GPU, NVIDIA T4, 16GB
- `t4-medium` - Paid, 2x T4
- `a10g-small` - Paid, NVIDIA A10G, 24GB
---
## Test GPU Deployment
### 1. Check GPU Availability:
```bash
curl https://YOUR_USERNAME-widgetdc.hf.space/health
```
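During the 2-3 minute GPU boot the endpoint may refuse connections, so a small retry loop helps. `wait_for` is a hypothetical helper; the `/health` route is the one from this guide:

```shell
# Poll a command until it succeeds, up to a fixed number of attempts.
wait_for() {
  tries=$1; shift
  i=0
  while [ "$i" -lt "$tries" ]; do
    if "$@"; then return 0; fi
    i=$((i + 1))
    sleep 2
  done
  return 1
}

# Real usage once deployed (uncomment and fill in your username):
# wait_for 30 curl -sf https://YOUR_USERNAME-widgetdc.hf.space/health
wait_for 3 true && echo "service is up"
```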
### 2. Test Embedding Generation:
```bash
curl -X POST https://YOUR_USERNAME-widgetdc.hf.space/api/srag/query \
-H "Content-Type: application/json" \
-d '{"query": "What is AI?", "limit": 5}'
```
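The exact JSON shape returned by `/api/srag/query` is an assumption here; this just shows pulling a score field out of a captured response without extra tooling:

```shell
# Extract "score" values from a saved response body with grep.
response='{"results":[{"text":"AI is machine intelligence","score":0.92}]}'
echo "$response" | grep -o '"score":[0-9.]*'   # "score":0.92
```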
### 3. Monitor GPU Usage:
Check HF Space logs for:
```
GPU Available: NVIDIA T4
CUDA Version: 12.2
PyTorch GPU: True
```
---
## Update Deployment
To update your deployed app:
1. Make changes locally
2. Commit and push:
```bash
git add .
git commit -m "feat: your changes"
git push origin main
```
3. GitHub Actions auto-deploys to HF Spaces
4. Watch logs in Actions tab
---
## Troubleshooting
### Issue: Build Fails
**Solution**: Check GitHub Actions logs for errors
### Issue: GPU Not Detected
**Solution**: Verify `hardware: t4-small` in Space README.md
### Issue: Out of Memory
**Solution**:
- Reduce batch size in embeddings
- Use `--max-old-space-size=4096` flag
- Upgrade to `t4-medium`
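For the `--max-old-space-size` fix, the flag is usually passed to Node.js via `NODE_OPTIONS` (value in MB), set as a Space variable or in the container entrypoint:

```shell
# Raise the Node.js heap ceiling to ~4 GB for the server process.
export NODE_OPTIONS="--max-old-space-size=4096"
echo "$NODE_OPTIONS"   # --max-old-space-size=4096
```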
### Issue: Slow Startup
**Solution**:
- Normal! GPU containers take 2-3 minutes to boot
- Check "Logs" tab for progress
---
## Alternative GPU Platforms
If you need more GPU power:
### **Modal Labs** (Serverless GPU)
- A100 GPUs (40GB/80GB)
- Pay per second
- Easy Python/Node.js deployment
### **Railway** (GPU Add-on)
- NVIDIA A10G (24GB)
- $10-50/month
- Better for production
### **Runpod** (Cheap GPU)
- A40/A100 available
- $0.39/hr for A40
- Full Docker support
---
## Success Checklist
- [ ] Hugging Face account created
- [ ] Space created with GPU hardware
- [ ] GitHub secrets added (HF_TOKEN, HF_SPACE_NAME)
- [ ] Workflow file committed
- [ ] First deployment triggered
- [ ] App accessible at HF Space URL
- [ ] GPU detected in logs
- [ ] API endpoints responding
---
## You're Done!
Your WidgeTDC platform now runs on **FREE GPU** infrastructure!
**Next Steps:**
- Monitor performance in HF Spaces
- Add more AI models
- Scale to paid tier if needed
- Enjoy 10x faster AI inference!