# 🚀 GPU Deployment Guide - Hugging Face Spaces

## Overview

Deploy the WidgeTDC backend to Hugging Face Spaces with a **free GPU** (NVIDIA T4, 16 GB).

---

## 📋 Prerequisites

1. **Hugging Face account**
   - Sign up at: https://huggingface.co/join
   - GPU hardware is available free via a community GPU grant; otherwise it is billed hourly
2. **GitHub repository secrets**
   - Go to: `Settings` → `Secrets and variables` → `Actions`
   - Add the secrets listed in Step 3

---

## 🔐 Step 1: Get a Hugging Face Token

1. Go to: https://huggingface.co/settings/tokens
2. Click **"New token"**
3. Name: `GitHub Actions Deploy`
4. Type: **Write** access
5. Copy the token

---

## 🏗️ Step 2: Create a Hugging Face Space

1. Go to: https://huggingface.co/new-space
2. Fill in:
   - **Owner**: your username
   - **Space name**: `widgetdc` (or your choice)
   - **License**: Apache 2.0
   - **SDK**: Docker
   - **Hardware**: **T4 small (GPU)**
   - **Visibility**: Private (or Public)
3. Click **"Create Space"**

---

## 🔑 Step 3: Add GitHub Secrets

Go to your GitHub repo → Settings → Secrets and variables → Actions:

### Secret 1: `HF_TOKEN`

```
Value: the token you created in Step 1
```

### Secret 2: `HF_SPACE_NAME`

```
Value: YOUR_USERNAME/widgetdc
Example: clauskraft/widgetdc
```

### Optional secrets for production:

```
GEMINI_API_KEY=
NEO4J_URI=
NEO4J_USER=neo4j
NEO4J_PASSWORD=
POSTGRES_HOST=
DATABASE_URL=
```

---

## 🚀 Step 4: Deploy!

### Automatic deploy (on every push to main):

```bash
git push origin main
```

### Manual deploy:

1. Go to GitHub → Actions tab
2. Select **"Deploy to Hugging Face (GPU)"**
3. Click **"Run workflow"**
4. Select branch: `main`
5. Click **"Run workflow"**

---

## 📊 Step 5: Monitor Deployment

1. **Check GitHub Actions**:
   - https://github.com/YOUR_USERNAME/WidgeTDC/actions
2. **Check Hugging Face logs**:
   - Go to your Space: https://huggingface.co/spaces/YOUR_USERNAME/widgetdc
   - Click the **"Logs"** tab
   - Watch real-time build progress
3. **Access your app**:
   - URL: `https://YOUR_USERNAME-widgetdc.hf.space`
   - API: `https://YOUR_USERNAME-widgetdc.hf.space/api`

---

## 🎯 GPU Benefits

### What you get:

- ✅ **NVIDIA T4 GPU** (16 GB VRAM)
- ✅ **CUDA 12.2** enabled
- ✅ **PyTorch** pre-installed
- ✅ **Sentence Transformers** for embeddings
- ✅ Up to **10x faster** AI inference than CPU
- ✅ **Free** with a Hugging Face community GPU grant

### What runs on the GPU:

1. **Vector embeddings** - sentence transformers
2. **Knowledge graph embeddings** - Node2Vec, GraphSAGE
3. **LLM inference** - Gemini/local models
4. **Semantic search** - FAISS/pgvector with GPU
5. **Entity recognition** - NER models

---

## 🔧 Configuration

### Environment variables in the HF Space:

Go to Space → Settings → Variables:

```bash
NODE_ENV=production
PORT=7860
USE_GPU=true
GEMINI_API_KEY=
NEO4J_URI=
DATABASE_URL=
```

### GPU settings for the Space:

Hardware is selected under Space → Settings → Hardware. You can also record a suggestion in the `README.md` front matter of your Space:

```yaml
---
title: WidgeTDC Neural Platform
sdk: docker
suggested_hardware: t4-small  # Options: cpu-basic, t4-small, t4-medium, a10g-small
---
```

**Hardware options:**

- `cpu-basic` - free, no GPU
- `t4-small` - NVIDIA T4, 16 GB (free only via community GPU grant)
- `t4-medium` - paid, T4 with more vCPU and RAM
- `a10g-small` - paid, NVIDIA A10G, 24 GB

---

## 🧪 Test the GPU Deployment

### 1. Check GPU availability:

```bash
curl https://YOUR_USERNAME-widgetdc.hf.space/health
```

### 2. Test embedding generation:

```bash
curl -X POST https://YOUR_USERNAME-widgetdc.hf.space/api/srag/query \
  -H "Content-Type: application/json" \
  -d '{"query": "What is AI?", "limit": 5}'
```

### 3. Monitor GPU usage:

Check the HF Space logs for:

```
✅ GPU Available: NVIDIA T4
✅ CUDA Version: 12.2
✅ PyTorch GPU: True
```

---

## 🔄 Update Deployment

To update your deployed app:

1. Make changes locally
2. Commit and push:
   ```bash
   git add .
   git commit -m "feat: your changes"
   git push origin main
   ```
3. GitHub Actions auto-deploys to HF Spaces
4. Watch the logs in the Actions tab

---

## 🐛 Troubleshooting

### Issue: Build fails

**Solution**: Check the GitHub Actions logs for errors.

### Issue: GPU not detected

**Solution**: Verify the Space hardware is set to **T4 small** (Space → Settings → Hardware).

### Issue: Out of memory

**Solution**:
- Reduce the batch size for embeddings
- Use Node's `--max-old-space-size=4096` flag
- Upgrade to `t4-medium`

### Issue: Slow startup

**Solution**:
- Normal! GPU containers take 2-3 minutes to boot
- Check the "Logs" tab for progress

---

## 📈 Alternative GPU Platforms

If you need more GPU power:

### **Modal Labs** (serverless GPU)
- A100 GPUs (40 GB/80 GB)
- Pay per second
- Easy Python/Node.js deployment

### **Railway** (GPU add-on)
- NVIDIA A10G (24 GB)
- $10-50/month
- Better for production

### **RunPod** (cheap GPU)
- A40/A100 available
- $0.39/hr for an A40
- Full Docker support

---

## ✅ Success Checklist

- [ ] Hugging Face account created
- [ ] Space created with GPU hardware
- [ ] GitHub secrets added (`HF_TOKEN`, `HF_SPACE_NAME`)
- [ ] Workflow file committed
- [ ] First deployment triggered
- [ ] App accessible at the HF Space URL
- [ ] GPU detected in the logs
- [ ] API endpoints responding

---

## 🎉 You're Done!

Your WidgeTDC platform now runs on GPU infrastructure! 🚀

**Next steps:**
- Monitor performance in HF Spaces
- Add more AI models
- Scale to a paid tier if needed
- Enjoy much faster AI inference!
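
---

## ⚙️ Appendix: Deploy Workflow Sketch

The success checklist mentions a committed workflow file ("Deploy to Hugging Face (GPU)"), but the file itself isn't reproduced in this guide. Below is a minimal sketch of what such a workflow might look like. It assumes the standard git-push deployment model for Docker Spaces and the `HF_TOKEN` / `HF_SPACE_NAME` secrets from Step 3; the file name and exact push command are assumptions, so adapt them to your repo.

```yaml
# .github/workflows/deploy-hf.yml  (hypothetical sketch, not the committed file)
name: Deploy to Hugging Face (GPU)

on:
  push:
    branches: [main]
  workflow_dispatch:        # enables the manual "Run workflow" button

jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0    # full history; Spaces reject shallow pushes
      - name: Push to the Space's git remote
        env:
          HF_TOKEN: ${{ secrets.HF_TOKEN }}
          HF_SPACE_NAME: ${{ secrets.HF_SPACE_NAME }}
        run: |
          # "${HF_SPACE_NAME%%/*}" extracts the Space owner as the git username
          git push --force \
            "https://${HF_SPACE_NAME%%/*}:${HF_TOKEN}@huggingface.co/spaces/${HF_SPACE_NAME}" \
            HEAD:main
```

The `workflow_dispatch` trigger is what makes the manual deploy path in Step 4 possible; the push trigger covers the automatic path.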
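
---

## 🩺 Appendix: GPU Health Check Sketch

Step 5 and the test section above rely on a `/health` endpoint that reports GPU status, but its implementation isn't shown in this guide. The sketch below is a hypothetical payload builder (the function name and response fields are illustrative, not the real WidgeTDC API). It imports `torch` lazily, so the same code runs on `cpu-basic` hardware or a laptop without PyTorch installed.

```python
# Hypothetical /health payload for a GPU-backed Space (illustrative only).
# torch is imported lazily so the code also runs where it isn't installed.

def gpu_health() -> dict:
    info = {"status": "ok", "gpu": False, "device": None, "cuda": None}
    try:
        import torch  # pre-installed in the GPU image; optional elsewhere
        if torch.cuda.is_available():
            info["gpu"] = True
            info["device"] = torch.cuda.get_device_name(0)  # e.g. "Tesla T4"
            info["cuda"] = torch.version.cuda               # e.g. "12.2"
    except ImportError:
        pass  # no torch: report a CPU-only deployment
    return info

if __name__ == "__main__":
    print(gpu_health())
```

On a T4 Space this logs the GPU name and CUDA version, matching the "GPU Available" lines in the Space logs; on `cpu-basic` it degrades gracefully to `gpu: false`.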