---
title: RAG Onboarding Backend
emoji: 🚀
colorFrom: blue
colorTo: green
sdk: docker
pinned: false
license: mit
---

# RAG Onboarding Backend - HuggingFace Space

Production-ready RAG (Retrieval-Augmented Generation) backend for corporate employee onboarding, deployed on HuggingFace Spaces.

## 🌟 Features

- **FastAPI REST API** - High-performance async API
- **HuggingFace Models** - Open-source LLMs and embeddings
- **Vector Search** - Qdrant for similarity search
- **Caching** - Redis for performance optimization
- **Monitoring** - Prometheus metrics
- **Clean Architecture** - Production-grade code structure

## 🚀 Quick Start

### API Endpoints

- `GET /` - Service info
- `GET /health` - Health check
- `POST /api/v1/query` - RAG query processing
- `GET /api/v1/metrics` - Prometheus metrics
- `GET /docs` - Interactive API documentation

### Example Query

```bash
curl -X POST "https://YOUR-SPACE-NAME.hf.space/api/v1/query" \
  -H "Content-Type: application/json" \
  -d '{
    "query_text": "What is the onboarding process?",
    "department": "HR",
    "top_k": 5
  }'
```

## 🔧 Configuration

Set the following secrets in your HuggingFace Space settings:

- `GEMINI_API_KEY` - Your Google Gemini API key
- `DATABASE_URL` - PostgreSQL connection string (use external DB like Supabase/Neon)
- `REDIS_URL` - Redis connection string (use Upstash Redis)
- `QDRANT_URL` - Qdrant vector DB URL (use Qdrant Cloud)
- `QDRANT_API_KEY` - Qdrant API key (if using cloud)

## 📊 Models Used

- **LLM**: Google Gemini 2.0 Flash (via API)
- **Embeddings**: `sentence-transformers/all-MiniLM-L6-v2`
- **Reranking**: `cross-encoder/ms-marco-MiniLM-L-12-v2` (optional)

## 🏗️ Architecture

```
┌─────────────┐
│   Client    │
└──────┬──────┘
       │
       ▼
┌─────────────┐
│   FastAPI   │
└──────┬──────┘
       │
       ├─────► PostgreSQL (Documents)
       ├─────► Redis (Cache)
       ├─────► Qdrant (Vectors)
       └─────► Google Gemini (LLM)
```

## 📝 License

MIT License - See LICENSE file for details

## 🤝 Contributing

Contributions welcome! Please open an issue or submit a PR.

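
## 🐍 Python Client Sketch

The curl request under Example Query can also be sent from Python. The following is a minimal sketch using only the standard library; it is not code from this repository. `build_query_request` and `run_query` are hypothetical helper names, the base URL is the same placeholder used in the curl example, and the response is assumed to be a JSON body, since the response schema is not documented here.

```python
import json
import urllib.request

# Placeholder base URL, same as in the curl example; replace with your Space.
BASE_URL = "https://YOUR-SPACE-NAME.hf.space"


def build_query_request(query_text, department, top_k=5, base_url=BASE_URL):
    """Build (but do not send) a POST request for /api/v1/query."""
    # Field names mirror the JSON body shown in the curl example.
    payload = {"query_text": query_text, "department": department, "top_k": top_k}
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/api/v1/query",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_query(query_text, department, top_k=5, base_url=BASE_URL, timeout=30):
    """Send the query and decode the body, assuming the API answers with JSON."""
    request = build_query_request(query_text, department, top_k, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `run_query("What is the onboarding process?", "HR")` against a running Space would perform the same request as the curl command. Building the request separately from sending it keeps the payload easy to inspect without network access.
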
## 📧 Support

For questions or issues, please open a GitHub issue.
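
## ✅ Checking Secrets at Startup

Since the backend depends on the secrets listed under Configuration, a check at startup can surface a missing value before the first request fails. The sketch below assumes the secrets reach the container as environment variables; `missing_secrets` is a hypothetical helper and not part of this repository. `QDRANT_API_KEY` is treated as optional here because the Configuration section marks it as needed only when using Qdrant Cloud.

```python
import os

# Secret names taken from the Configuration section of this README.
REQUIRED_SECRETS = ("GEMINI_API_KEY", "DATABASE_URL", "REDIS_URL", "QDRANT_URL")
# Needed only for cloud-hosted Qdrant, so it is not enforced in this sketch.
OPTIONAL_SECRETS = ("QDRANT_API_KEY",)


def missing_secrets(env=None):
    """Return the required secret names that are unset or blank in env."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_SECRETS if not env.get(name, "").strip()]
```

One possible use is to call `missing_secrets()` when the application starts and raise an error listing the returned names if the list is not empty.
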