yuvrajsingh6
fix(meta): correct huggingface spaces sdk version format
81a5ce2

A newer version of the Streamlit SDK is available: 1.54.0

Upgrade
metadata
title: Enterprise RAG System
emoji: πŸš€
colorFrom: blue
colorTo: indigo
sdk: streamlit
sdk_version: 1.32.2
app_file: app.py
pinned: false

πŸ” Enterprise RAG System

A production-ready Retrieval Augmented Generation system featuring Hybrid Search, Reranking, and Hallucination Prevention.

Live Demo

🌟 Key Differentiators

Unlike basic RAG tutorials, this system handles real-world edge cases:

  1. Hybrid Search (BM25 + Semantic): accurately retrieves both specific keywords (IDs, names) and conceptual matches.
  2. Safety First: Implements Confidence Gatingβ€”the system explicitly refuses to answer if retrieved context is insufficient, preventing hallucinations.
  3. Zero-Latency Deployment: Uses a custom Build-Time Artifact Injection pipeline to bake index files into the Docker container, eliminating startup delays.

πŸ› οΈ Architecture

graph LR
    User[User Query] --> A[Hybrid Retriever]
    A -->|Keywords| B(BM25 Index)
    A -->|Semantics| C(Pinecone/FAISS)
    B & C --> D[Rank Fusion (RRF)]
    D --> E[Cross-Encoder Reranker]
    E --> F{Confidence Check}
    F -->|Low Score| G[Fallback Response]
    F -->|High Score| H[LLM Generation]

πŸš€ Quick Start

Local Development

# 1. Install Dependencies
pip install -r requirements.txt

# 2. Generate Index
python src/ingestion/ingest.py

# 3. Run App
streamlit run app.py

Deployment Strategy

We treat Data and Code separately for scalability:

  • Code: GitHub (app.py, src/)
  • Artifacts: Hugging Face Datasets (data/index/)

The Dockerfile automatically fetches the latest index during build, ensuring the deployed container is always ready-to-serve.

πŸ§ͺ Tech Stack

  • LlamaIndex / Custom Pipeline: Hybrid Retrieval Logic
  • Pinecone: Serverless Vector Database
  • Sentence-Transformers: Embeddings & Reranking
  • Streamlit: Conversational UI
  • Docker: Containerized Deployment