---
title: Enterprise RAG System
emoji: π
colorFrom: blue
colorTo: indigo
sdk: streamlit
sdk_version: 1.32.2
app_file: app.py
pinned: false
---
# Enterprise RAG System
> **A production-ready Retrieval-Augmented Generation system featuring Hybrid Search, Reranking, and Hallucination Prevention.**
[Live demo on Hugging Face Spaces](https://huggingface.co/spaces/yuvis/Enterprise-RAG-System)
## Key Differentiators
Unlike basic RAG tutorials, this system handles real-world edge cases:
1. **Hybrid Search (BM25 + Semantic)**: Retrieves both exact keyword matches (IDs, names) and conceptual matches, so neither kind of query is left to a single retriever.
2. **Safety First**: Implements **Confidence Gating**: the system explicitly refuses to answer when the retrieved context is insufficient, which reduces the risk of hallucinated answers.
3. **Zero-Latency Deployment**: Uses a custom **Build-Time Artifact Injection** pipeline to bake index files into the Docker container, so the app does not have to build or download the index at startup.
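
Confidence gating as described in item 2 can be sketched in a few lines. This is a minimal illustration, not the project's actual code: the function names, the `FALLBACK` message, the `0.5` threshold, and the `generate` callback are all hypothetical, and it assumes the reranker returns higher scores for more relevant passages.

```python
from typing import Callable, List, Tuple

# Hypothetical refusal message; the real wording is not given in this README.
FALLBACK = "I don't have enough information in the indexed documents to answer that."


def select_context(
    scored: List[Tuple[str, float]], threshold: float = 0.5, top_k: int = 3
) -> List[str]:
    """Keep passages whose score clears the threshold, best first."""
    kept = [(text, score) for text, score in scored if score >= threshold]
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return [text for text, _ in kept[:top_k]]


def answer(
    query: str,
    scored: List[Tuple[str, float]],
    generate: Callable[[str, List[str]], str],
    threshold: float = 0.5,
) -> str:
    """Call the LLM only when at least one passage passes the gate."""
    context = select_context(scored, threshold)
    if not context:
        # Nothing relevant enough was retrieved: refuse instead of guessing.
        return FALLBACK
    return generate(query, context)
```

The threshold depends on the reranker's score scale and is usually tuned on a held-out set of answerable and unanswerable questions.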
## Architecture
```mermaid
graph LR
User[User Query] --> A[Hybrid Retriever]
A -->|Keywords| B(BM25 Index)
A -->|Semantics| C(Pinecone/FAISS)
B & C --> D["Rank Fusion (RRF)"]
D --> E[Cross-Encoder Reranker]
E --> F{Confidence Check}
F -->|Low Score| G[Fallback Response]
F -->|High Score| H[LLM Generation]
```
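
The Rank Fusion (RRF) step in the diagram merges the BM25 and vector result lists into one ranking. Below is a minimal sketch of Reciprocal Rank Fusion, assuming each retriever returns document IDs ordered best first. The function name and the document IDs are illustrative, and `k = 60` is the commonly used default rather than a value taken from this project.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def reciprocal_rank_fusion(rankings: Sequence[Sequence[str]], k: int = 60) -> List[str]:
    """Fuse ranked lists: each document scores sum(1 / (k + rank)) over the lists."""
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by ID so the output is deterministic.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


bm25_hits = ["doc_7", "doc_2", "doc_9"]   # keyword ranking (made-up IDs)
dense_hits = ["doc_2", "doc_4", "doc_7"]  # semantic ranking (made-up IDs)
fused = reciprocal_rank_fusion([bm25_hits, dense_hits])
print(fused)  # ['doc_2', 'doc_7', 'doc_4', 'doc_9']
```

Because RRF uses only rank positions, it needs no normalization between BM25 scores and embedding similarities, which live on different scales.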
## Quick Start
### Local Development
```bash
# 1. Install Dependencies
pip install -r requirements.txt
# 2. Generate Index
python src/ingestion/ingest.py
# 3. Run App
streamlit run app.py
```
### Deployment Strategy
Data and code are stored separately so that each can be versioned and updated independently:
- **Code**: GitHub (`app.py`, `src/`)
- **Artifacts**: Hugging Face Datasets (`data/index/`)
The `Dockerfile` fetches the latest index during the image build, so the deployed container starts with the index already on disk and is ready to serve.
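
The build-time fetch could look roughly like the script below, run from a `RUN` step in the `Dockerfile`. It is a sketch under several assumptions: the artifacts live in a Hugging Face dataset repository whose root holds the index files, the file names in `REQUIRED_FILES` are placeholders, and `fetch_index` with its parameters is a hypothetical helper. `snapshot_download` is the real `huggingface_hub` function for downloading a whole repository.

```python
from pathlib import Path
from typing import List, Sequence

# Placeholder file names; the real index layout is not documented in this README.
REQUIRED_FILES = ("bm25.pkl", "vectors.faiss")


def missing_index_files(index_dir: str, required: Sequence[str] = REQUIRED_FILES) -> List[str]:
    """Return the required index files that are absent from index_dir."""
    root = Path(index_dir)
    return [name for name in required if not (root / name).is_file()]


def fetch_index(repo_id: str, index_dir: str = "data/index") -> None:
    """Download the dataset repo into index_dir and fail the build if files are missing."""
    # Imported lazily so the validation helper works without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    snapshot_download(repo_id=repo_id, repo_type="dataset", local_dir=index_dir)
    missing = missing_index_files(index_dir)
    if missing:
        # Raising here stops the Docker build instead of shipping a broken image.
        raise FileNotFoundError(f"Index files missing after download: {missing}")
```

Failing the build on a missing file keeps an incomplete index from reaching production, where it would otherwise surface only as a runtime error.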
## Tech Stack
- **LlamaIndex / Custom Pipeline**: Hybrid Retrieval Logic
- **Pinecone**: Serverless Vector Database
- **Sentence-Transformers**: Embeddings & Reranking
- **Streamlit**: Conversational UI
- **Docker**: Containerized Deployment