metadata
title: RAG Onboarding Backend
emoji: 🚀
colorFrom: blue
colorTo: green
sdk: docker
pinned: false
license: mit
RAG Onboarding Backend - HuggingFace Space
Production-ready RAG (Retrieval-Augmented Generation) backend for corporate employee onboarding, deployed on HuggingFace Spaces.
🌟 Features
- FastAPI REST API - High-performance async API
- HuggingFace Models - Open-source embedding and reranking models (the LLM is Google Gemini, accessed via API)
- Vector Search - Qdrant for similarity search
- Caching - Redis for performance optimization
- Monitoring - Prometheus metrics
- Clean Architecture - Production-grade code structure
🚀 Quick Start
API Endpoints
- GET / - Service info
- GET /health - Health check
- POST /api/v1/query - RAG query processing
- GET /api/v1/metrics - Prometheus metrics
- GET /docs - Interactive API documentation
Example Query
curl -X POST "https://YOUR-SPACE-NAME.hf.space/api/v1/query" \
-H "Content-Type: application/json" \
-d '{
"query_text": "What is the onboarding process?",
"department": "HR",
"top_k": 5
}'
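The same request can be sent from Python. This is a minimal sketch using only the standard library: the helper names `build_query_request` and `send_query` are hypothetical, the Space URL is the placeholder from the curl example, and the response is assumed to be JSON since its exact shape is not documented here.

```python
import json
import urllib.request

# Placeholder taken from the curl example; replace with your Space's URL.
BASE_URL = "https://YOUR-SPACE-NAME.hf.space"


def build_query_request(query_text, department, top_k=5, base_url=BASE_URL):
    """Build (but do not send) the POST request for /api/v1/query."""
    payload = {
        "query_text": query_text,
        "department": department,
        "top_k": top_k,
    }
    return urllib.request.Request(
        url=f"{base_url}/api/v1/query",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_query(request, timeout=30.0):
    """Send the request and decode the body, assuming the API returns JSON."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Usage (performs a real network call, so it is left commented out):
# req = build_query_request("What is the onboarding process?", "HR")
# print(send_query(req))
```

Building the request separately from sending it makes the payload easy to inspect or test without network access.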
🔧 Configuration
Set the following secrets in your HuggingFace Space settings:
- GEMINI_API_KEY - Your Google Gemini API key
- DATABASE_URL - PostgreSQL connection string (use an external DB such as Supabase or Neon)
- REDIS_URL - Redis connection string (use Upstash Redis)
- QDRANT_URL - Qdrant vector DB URL (use Qdrant Cloud)
- QDRANT_API_KEY - Qdrant API key (if using cloud)
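Space secrets are exposed to the container as environment variables, so the application can read them at startup. The following is a sketch of one way to do that, not the backend's actual settings code: the `Settings` class and `load_settings` function are hypothetical, and treating the first four secrets as required (with `QDRANT_API_KEY` optional) is an assumption based on the notes above.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

# Assumption: these four must be set; QDRANT_API_KEY is only needed for cloud.
REQUIRED_SECRETS = ("GEMINI_API_KEY", "DATABASE_URL", "REDIS_URL", "QDRANT_URL")


@dataclass(frozen=True)
class Settings:
    gemini_api_key: str
    database_url: str
    redis_url: str
    qdrant_url: str
    qdrant_api_key: Optional[str] = None


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Read secrets from the environment and fail fast if any are missing."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_SECRETS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing required secrets: " + ", ".join(missing))
    return Settings(
        gemini_api_key=env["GEMINI_API_KEY"],
        database_url=env["DATABASE_URL"],
        redis_url=env["REDIS_URL"],
        qdrant_url=env["QDRANT_URL"],
        # An unset or empty key is normalized to None.
        qdrant_api_key=env.get("QDRANT_API_KEY") or None,
    )
```

Failing at startup with the list of missing names tends to be easier to debug in the Space logs than a connection error on the first request.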
📊 Models Used
- LLM: Google Gemini 2.0 Flash (via API)
- Embeddings: sentence-transformers/all-MiniLM-L6-v2
- Reranking: cross-encoder/ms-marco-MiniLM-L-12-v2 (optional)
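To illustrate how an embedding model and an optional reranker typically fit together, here is a small in-memory sketch. It is an assumption about the general two-stage pattern, not the backend's code: in the deployed service the first-stage search is handled by Qdrant, and the function `retrieve_and_rerank` and its `embed` / `score_pairs` parameters are hypothetical names.

```python
import math
from typing import Callable, List, Optional, Sequence, Tuple

Vector = Sequence[float]
EmbedFn = Callable[[List[str]], Sequence[Vector]]
ScoreFn = Callable[[List[Tuple[str, str]]], Sequence[float]]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve_and_rerank(
    query: str,
    docs: List[str],
    embed: EmbedFn,
    score_pairs: Optional[ScoreFn] = None,
    top_k: int = 5,
) -> List[str]:
    """Stage 1: rank docs by embedding similarity and keep top_k.
    Stage 2 (optional): reorder those candidates with a pairwise scorer."""
    if not docs:
        return []
    vectors = embed([query] + list(docs))
    query_vec, doc_vecs = vectors[0], vectors[1:]
    ranked = sorted(
        zip(docs, doc_vecs),
        key=lambda pair: cosine(query_vec, pair[1]),
        reverse=True,
    )
    candidates = [doc for doc, _ in ranked[:top_k]]
    if score_pairs is None:
        return candidates
    scores = score_pairs([(query, doc) for doc in candidates])
    reranked = sorted(zip(candidates, scores), key=lambda p: p[1], reverse=True)
    return [doc for doc, _ in reranked]
```

With the `sentence-transformers` package installed, `embed` could be `SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2").encode` and `score_pairs` could be `CrossEncoder("cross-encoder/ms-marco-MiniLM-L-12-v2").predict`; passing the models in as callables keeps the sketch runnable without downloading them.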
🏗️ Architecture
┌─────────────┐
│   Client    │
└──────┬──────┘
       │
       ▼
┌─────────────┐
│   FastAPI   │
└──────┬──────┘
       │
       ├─────► PostgreSQL (Documents)
       ├─────► Redis (Cache)
       ├─────► Qdrant (Vectors)
       └─────► Google Gemini (LLM)
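The diagram shows which services the API talks to but not the order of calls. One plausible request flow is cache-aside: check the cache, then search vectors, load documents, generate an answer, and store the result. The sketch below assumes that ordering; every name in it (`cache_key`, `answer_query`, and the injected callables) is hypothetical, and a plain dict stands in for Redis.

```python
import hashlib
import json
from typing import Callable, List, MutableMapping


def cache_key(query_text: str, department: str, top_k: int) -> str:
    """Derive a stable cache key from the request fields."""
    raw = json.dumps([query_text, department, top_k])
    return "rag:" + hashlib.sha256(raw.encode("utf-8")).hexdigest()


def answer_query(
    query_text: str,
    department: str,
    top_k: int,
    cache: MutableMapping[str, str],                       # stand-in for Redis
    search_vectors: Callable[[str, str, int], List[int]],  # stand-in for Qdrant
    load_documents: Callable[[List[int]], List[str]],      # stand-in for PostgreSQL
    generate: Callable[[str, List[str]], str],             # stand-in for Gemini
) -> str:
    key = cache_key(query_text, department, top_k)
    if key in cache:
        # Cache hit: skip retrieval and generation entirely.
        return cache[key]
    doc_ids = search_vectors(query_text, department, top_k)
    passages = load_documents(doc_ids)
    answer = generate(query_text, passages)
    cache[key] = answer
    return answer
```

Injecting each backend as a callable keeps the flow testable with simple fakes; a real deployment would also want a cache expiry, which this sketch omits.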
📝 License
MIT License - See LICENSE file for details
🤝 Contributing
Contributions welcome! Please open an issue or submit a PR.
📧 Support
For questions or issues, please open a GitHub issue.