Baktabek commited on
Commit
409c17a
·
verified ·
1 Parent(s): 0766f5c

Upload folder using huggingface_hub

Browse files
Files changed (49) hide show
  1. .gitignore +8 -0
  2. Dockerfile +41 -0
  3. README.md +90 -10
  4. alembic.ini +40 -0
  5. alembic/env.py +66 -0
  6. alembic/script.py.mako +1 -0
  7. alembic/versions/001_initial_schema.py +95 -0
  8. app.py +23 -0
  9. app/__init__.py +5 -0
  10. app/application/__init__.py +1 -0
  11. app/application/dto/__init__.py +79 -0
  12. app/application/services/__init__.py +5 -0
  13. app/application/services/chunking_service.py +97 -0
  14. app/application/use_cases/__init__.py +1 -0
  15. app/application/use_cases/document_indexing.py +129 -0
  16. app/application/use_cases/query_processing.py +136 -0
  17. app/core/__init__.py +1 -0
  18. app/core/config.py +121 -0
  19. app/core/logging.py +75 -0
  20. app/core/metrics.py +98 -0
  21. app/domain/__init__.py +1 -0
  22. app/domain/entities/__init__.py +15 -0
  23. app/domain/entities/document.py +87 -0
  24. app/domain/entities/query.py +113 -0
  25. app/domain/interfaces/__init__.py +20 -0
  26. app/domain/interfaces/cache.py +36 -0
  27. app/domain/interfaces/llm.py +72 -0
  28. app/domain/interfaces/repository.py +60 -0
  29. app/domain/interfaces/retriever.py +75 -0
  30. app/infrastructure/__init__.py +1 -0
  31. app/infrastructure/cache/__init__.py +1 -0
  32. app/infrastructure/cache/redis_cache.py +84 -0
  33. app/infrastructure/database/__init__.py +1 -0
  34. app/infrastructure/database/models.py +82 -0
  35. app/infrastructure/external/__init__.py +1 -0
  36. app/infrastructure/external/embedder.py +31 -0
  37. app/infrastructure/external/gemini_llm.py +87 -0
  38. app/infrastructure/external/prompt_builder.py +54 -0
  39. app/infrastructure/external/qdrant_retriever.py +124 -0
  40. app/infrastructure/external/simple_reranker.py +20 -0
  41. app/infrastructure/repositories/__init__.py +1 -0
  42. app/infrastructure/repositories/postgres_repository.py +178 -0
  43. app/main.py +84 -0
  44. app/presentation/__init__.py +1 -0
  45. app/presentation/api/__init__.py +1 -0
  46. app/presentation/api/v1/__init__.py +1 -0
  47. app/presentation/api/v1/endpoints.py +168 -0
  48. app/presentation/api/v1/schemas.py +82 -0
  49. requirements.txt +19 -0
.gitignore ADDED
@@ -0,0 +1,8 @@
 
 
 
 
 
 
 
 
 
1
# Python bytecode and caches
__pycache__/
*.py[cod]
# Local environment files (may contain secrets — never commit)
.env
.env.local
# Log output
*.log
# Test and coverage artifacts
.pytest_cache/
.coverage
htmlcov/
Dockerfile ADDED
@@ -0,0 +1,41 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
# HuggingFace Space Dockerfile
FROM python:3.11-slim

WORKDIR /app

# Install system dependencies
# (curl is required by the HEALTHCHECK below; build-essential for wheels)
RUN apt-get update && apt-get install -y \
    build-essential \
    curl \
    git \
    && rm -rf /var/lib/apt/lists/*

# Copy requirements first so dependency install is cached across code changes
COPY requirements.txt .

# Install Python dependencies
RUN pip install --no-cache-dir -r requirements.txt

# Copy application code
COPY app ./app
COPY app.py .
COPY alembic ./alembic
COPY alembic.ini .

# Create non-root user (uid 1000 is the HF Spaces convention)
RUN useradd -m -u 1000 user && chown -R user:user /app
USER user

# HuggingFace Spaces uses port 7860
EXPOSE 7860

# Set environment
ENV PYTHONUNBUFFERED=1
ENV PORT=7860

# Health check against the FastAPI /health endpoint
HEALTHCHECK --interval=30s --timeout=10s --start-period=5s --retries=3 \
    CMD curl -f http://localhost:7860/health || exit 1

# Run application
CMD ["python", "app.py"]
README.md CHANGED
@@ -1,10 +1,90 @@
1
- ---
2
- title: Rag Onboarding Backend
3
- emoji: 🌍
4
- colorFrom: purple
5
- colorTo: gray
6
- sdk: docker
7
- pinned: false
8
- ---
9
-
10
- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ ---
2
+ title: RAG Onboarding Backend
3
+ emoji: 🚀
4
+ colorFrom: blue
5
+ colorTo: green
6
+ sdk: docker
7
+ pinned: false
8
+ license: mit
9
+ ---
10
+
11
+ # RAG Onboarding Backend - HuggingFace Space
12
+
13
+ Production-ready RAG (Retrieval-Augmented Generation) backend for corporate employee onboarding, deployed on HuggingFace Spaces.
14
+
15
+ ## 🌟 Features
16
+
17
+ - **FastAPI REST API** - High-performance async API
18
+ - **HuggingFace Models** - Open-source LLMs and embeddings
19
+ - **Vector Search** - Qdrant for similarity search
20
+ - **Caching** - Redis for performance optimization
21
+ - **Monitoring** - Prometheus metrics
22
+ - **Clean Architecture** - Production-grade code structure
23
+
24
+ ## 🚀 Quick Start
25
+
26
+ ### API Endpoints
27
+
28
+ - `GET /` - Service info
29
+ - `GET /health` - Health check
30
+ - `POST /api/v1/query` - RAG query processing
31
+ - `GET /api/v1/metrics` - Prometheus metrics
32
+ - `GET /docs` - Interactive API documentation
33
+
34
+ ### Example Query
35
+
36
+ ```bash
37
+ curl -X POST "https://YOUR-SPACE-NAME.hf.space/api/v1/query" \
38
+ -H "Content-Type: application/json" \
39
+ -d '{
40
+ "query_text": "What is the onboarding process?",
41
+ "department": "HR",
42
+ "top_k": 5
43
+ }'
44
+ ```
45
+
46
+ ## 🔧 Configuration
47
+
48
+ Set the following secrets in your HuggingFace Space settings:
49
+
50
+ - `GEMINI_API_KEY` - Your Google Gemini API key
51
+ - `DATABASE_URL` - PostgreSQL connection string (use external DB like Supabase/Neon)
52
+ - `REDIS_URL` - Redis connection string (use Upstash Redis)
53
+ - `QDRANT_URL` - Qdrant vector DB URL (use Qdrant Cloud)
54
+ - `QDRANT_API_KEY` - Qdrant API key (if using cloud)
55
+
56
+ ## 📊 Models Used
57
+
58
+ - **LLM**: Google Gemini 2.0 Flash (via API)
59
+ - **Embeddings**: `sentence-transformers/all-MiniLM-L6-v2`
60
+ - **Reranking**: `cross-encoder/ms-marco-MiniLM-L-12-v2` (optional)
61
+
62
+ ## 🏗️ Architecture
63
+
64
+ ```
65
┌─────────────┐
│   Client    │
└──────┬──────┘
       │
       ▼
┌─────────────┐
│   FastAPI   │
└──────┬──────┘
       │
       ├─────► PostgreSQL (Documents)
       ├─────► Redis (Cache)
       ├─────► Qdrant (Vectors)
       └─────► Google Gemini (LLM)
78
+ ```
79
+
80
+ ## 📝 License
81
+
82
+ MIT License - See LICENSE file for details
83
+
84
+ ## 🤝 Contributing
85
+
86
+ Contributions welcome! Please open an issue or submit a PR.
87
+
88
+ ## 📧 Support
89
+
90
+ For questions or issues, please open a GitHub issue.
alembic.ini ADDED
@@ -0,0 +1,40 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
[alembic]
# Location of migration scripts, relative to this file
script_location = alembic
prepend_sys_path = .
version_path_separator = os

# NOTE(review): local-dev default with hard-coded credentials; confirm that
# deployed environments override this (e.g. from DATABASE_URL) before relying
# on it — env.py reads this section directly.
sqlalchemy.url = postgresql+asyncpg://postgres:postgres@localhost:5432/rag_onboarding

# --- Logging configuration (standard logging.fileConfig format) ---

[loggers]
keys = root,sqlalchemy,alembic

[handlers]
keys = console

[formatters]
keys = generic

[logger_root]
level = WARN
handlers = console
qualname =

[logger_sqlalchemy]
level = WARN
handlers =
qualname = sqlalchemy.engine

[logger_alembic]
level = INFO
handlers =
qualname = alembic

[handler_console]
class = StreamHandler
args = (sys.stderr,)
level = NOTSET
formatter = generic

[formatter_generic]
format = %(levelname)-5.5s [%(name)s] %(message)s
datefmt = %H:%M:%S
alembic/env.py ADDED
@@ -0,0 +1,66 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
"""Alembic environment configuration.

Supports both offline mode (emits SQL without a database connection) and
online mode (runs migrations through an async SQLAlchemy engine built from
the ``[alembic]`` section of alembic.ini).
"""
import asyncio
from logging.config import fileConfig

from alembic import context
from sqlalchemy import pool
from sqlalchemy.engine import Connection
from sqlalchemy.ext.asyncio import async_engine_from_config

from app.infrastructure.database.models import Base

# Alembic Config object (proxies values read from alembic.ini)
config = context.config

# Interpret the config file for Python logging
if config.config_file_name is not None:
    fileConfig(config.config_file_name)

# Metadata for autogenerate support (compares models against the live schema)
target_metadata = Base.metadata


def run_migrations_offline() -> None:
    """Run migrations in 'offline' mode: render SQL using the configured URL."""
    url = config.get_main_option("sqlalchemy.url")
    context.configure(
        url=url,
        target_metadata=target_metadata,
        literal_binds=True,
        dialect_opts={"paramstyle": "named"},
    )

    with context.begin_transaction():
        context.run_migrations()


def do_run_migrations(connection: Connection) -> None:
    # Executed inside the sync bridge provided by run_sync() below.
    context.configure(connection=connection, target_metadata=target_metadata)

    with context.begin_transaction():
        context.run_migrations()


async def run_async_migrations() -> None:
    """Run migrations in 'online' mode via an async engine."""
    connectable = async_engine_from_config(
        config.get_section(config.config_ini_section, {}),
        prefix="sqlalchemy.",
        poolclass=pool.NullPool,  # one-shot run: no connection pooling needed
    )

    async with connectable.connect() as connection:
        # Alembic's migration machinery is sync; bridge into it from async.
        await connection.run_sync(do_run_migrations)

    await connectable.dispose()


def run_migrations_online() -> None:
    """Entry point for 'online' mode (drives the async runner)."""
    asyncio.run(run_async_migrations())


if context.is_offline_mode():
    run_migrations_offline()
else:
    run_migrations_online()
alembic/script.py.mako ADDED
@@ -0,0 +1 @@
 
 
1
"""${message}

Revision ID: ${up_revision}
Revises: ${down_revision | comma,n}
Create Date: ${create_date}

"""
from alembic import op
import sqlalchemy as sa
${imports if imports else ""}

# revision identifiers, used by Alembic.
revision = ${repr(up_revision)}
down_revision = ${repr(down_revision)}
branch_labels = ${repr(branch_labels)}
depends_on = ${repr(depends_on)}


def upgrade() -> None:
    ${upgrades if upgrades else "pass"}


def downgrade() -> None:
    ${downgrades if downgrades else "pass"}
alembic/versions/001_initial_schema.py ADDED
@@ -0,0 +1,95 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
"""
Database Migration - Initial Schema

Create tables for documents, chunks, and queries.
"""
from alembic import op
import sqlalchemy as sa
from sqlalchemy.dialects import postgresql

# revision identifiers
revision = '001'
down_revision = None
branch_labels = None
depends_on = None


def upgrade() -> None:
    """Create the documents, document_chunks, and queries tables."""
    # Create documents table
    op.create_table(
        'documents',
        sa.Column('id', postgresql.UUID(as_uuid=True), primary_key=True),
        sa.Column('title', sa.String(500), nullable=False),
        sa.Column('filename', sa.String(255), nullable=False),
        sa.Column('file_type', sa.String(50), nullable=False),
        sa.Column('file_size', sa.BigInteger(), nullable=False),
        sa.Column('storage_path', sa.String(1000), nullable=False),
        sa.Column('department', sa.String(100), nullable=False),
        sa.Column('status', sa.String(50), nullable=False, server_default='pending'),
        sa.Column('upload_session_id', sa.String(100), nullable=True),
        sa.Column('uploaded_at', sa.DateTime(timezone=True), nullable=False, server_default=sa.text('now()')),
        sa.Column('indexed_at', sa.DateTime(timezone=True), nullable=True),
        sa.Column('metadata', postgresql.JSONB(), nullable=False, server_default='{}'),
        sa.Column('created_at', sa.DateTime(timezone=True), nullable=False, server_default=sa.text('now()')),
        sa.Column('updated_at', sa.DateTime(timezone=True), nullable=False, server_default=sa.text('now()')),
    )

    # Create indexes for documents (single-column plus the common
    # department+status and time-ordered access paths)
    op.create_index('ix_documents_title', 'documents', ['title'])
    op.create_index('ix_documents_file_type', 'documents', ['file_type'])
    op.create_index('ix_documents_department', 'documents', ['department'])
    op.create_index('ix_documents_status', 'documents', ['status'])
    op.create_index('ix_documents_department_status', 'documents', ['department', 'status'])
    op.create_index('ix_documents_created_at', 'documents', ['created_at'])

    # Create document_chunks table
    # NOTE(review): document_id carries no ForeignKey to documents.id, so
    # referential integrity (and cascade delete) is not enforced by the DB —
    # confirm whether the repository layer guarantees it.
    op.create_table(
        'document_chunks',
        sa.Column('id', postgresql.UUID(as_uuid=True), primary_key=True),
        sa.Column('document_id', postgresql.UUID(as_uuid=True), nullable=False),
        sa.Column('chunk_index', sa.Integer(), nullable=False),
        sa.Column('content', sa.Text(), nullable=False),
        sa.Column('token_count', sa.Integer(), nullable=False),
        sa.Column('vector_id', sa.String(100), nullable=True),
        sa.Column('metadata', postgresql.JSONB(), nullable=False, server_default='{}'),
        sa.Column('created_at', sa.DateTime(timezone=True), nullable=False, server_default=sa.text('now()')),
    )

    # Create indexes for chunks
    op.create_index('ix_chunks_document_id', 'document_chunks', ['document_id'])
    op.create_index('ix_chunks_vector_id', 'document_chunks', ['vector_id'])
    op.create_index('ix_chunks_document_id_index', 'document_chunks', ['document_id', 'chunk_index'])

    # Create queries table (audit log of RAG queries and their answers)
    op.create_table(
        'queries',
        sa.Column('id', postgresql.UUID(as_uuid=True), primary_key=True),
        sa.Column('query_text', sa.Text(), nullable=False),
        sa.Column('department', sa.String(100), nullable=False),
        sa.Column('user_id', sa.String(100), nullable=True),
        sa.Column('session_id', sa.String(100), nullable=True),
        sa.Column('status', sa.String(50), nullable=False, server_default='pending'),
        sa.Column('answer', sa.Text(), nullable=True),
        sa.Column('sources', postgresql.JSONB(), nullable=False, server_default='[]'),
        sa.Column('confidence', sa.Integer(), nullable=False, server_default='0'),
        sa.Column('duration_ms', sa.Integer(), nullable=False, server_default='0'),
        sa.Column('tokens_used', sa.Integer(), nullable=False, server_default='0'),
        sa.Column('model', sa.String(100), nullable=True),
        sa.Column('created_at', sa.DateTime(timezone=True), nullable=False, server_default=sa.text('now()')),
        sa.Column('completed_at', sa.DateTime(timezone=True), nullable=True),
    )

    # Create indexes for queries
    op.create_index('ix_queries_department', 'queries', ['department'])
    op.create_index('ix_queries_user_id', 'queries', ['user_id'])
    op.create_index('ix_queries_session_id', 'queries', ['session_id'])
    op.create_index('ix_queries_status', 'queries', ['status'])
    op.create_index('ix_queries_created_at', 'queries', ['created_at'])
    op.create_index('ix_queries_department_created', 'queries', ['department', 'created_at'])
    op.create_index('ix_queries_user_created', 'queries', ['user_id', 'created_at'])


def downgrade() -> None:
    """Drop all tables created by upgrade() (dependents first)."""
    op.drop_table('queries')
    op.drop_table('document_chunks')
    op.drop_table('documents')
app.py ADDED
@@ -0,0 +1,23 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
"""
HuggingFace Space entry point.

Exposes the FastAPI application and, when executed directly, serves it
with uvicorn on the port HuggingFace Spaces expects (7860 by default).
"""
import os
import sys

# Ensure the repository root is importable so `app.main` resolves.
sys.path.insert(0, os.path.dirname(__file__))

from app.main import app

if __name__ == "__main__":
    import uvicorn

    # HF Spaces injects PORT; fall back to the Spaces default.
    serve_port = int(os.getenv("PORT", 7860))
    uvicorn.run(app, host="0.0.0.0", port=serve_port, log_level="info")
app/__init__.py ADDED
@@ -0,0 +1,5 @@
 
 
 
 
 
 
1
+ """
2
+ RAG Onboarding Backend - Production-ready RAG system for employee onboarding
3
+ """
4
+
5
+ __version__ = "1.0.0"
app/application/__init__.py ADDED
@@ -0,0 +1 @@
 
 
1
+ """Application layer"""
app/application/dto/__init__.py ADDED
@@ -0,0 +1,79 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
"""
Application Layer - DTOs (Data Transfer Objects)

Plain dataclasses that carry data between the presentation layer and the
application use cases. Optional dict fields are normalized to ``{}`` in
``__post_init__`` so consumers can always treat them as dicts.
"""
from dataclasses import dataclass
from typing import List, Optional
from uuid import UUID


@dataclass
class QueryDTO:
    """Query data transfer object.

    Carries the question text, target department, optional user/session
    identifiers, and generation parameters for the RAG pipeline.
    """

    query_text: str
    department: str
    user_id: Optional[str] = None
    session_id: Optional[str] = None
    top_k: int = 10
    temperature: float = 0.7
    max_tokens: int = 2048
    # Optional retrieval filters; normalized to {} below.
    filters: Optional[dict] = None

    def __post_init__(self) -> None:
        if self.filters is None:
            self.filters = {}


@dataclass
class SourceDTO:
    """Source citation DTO: one retrieved chunk backing an answer."""

    title: str
    content: str
    relevance_score: float
    document_id: str
    chunk_index: int
    metadata: dict


@dataclass
class QueryResponseDTO:
    """Query response DTO: the generated answer plus its provenance/metrics."""

    query_id: str
    answer: str
    sources: List[SourceDTO]
    confidence: float
    processing_time_ms: int
    tokens_used: int
    model: str


@dataclass
class DocumentUploadDTO:
    """Document upload DTO: raw file bytes plus routing metadata."""

    filename: str
    content: bytes
    department: str
    metadata: Optional[dict] = None

    def __post_init__(self) -> None:
        if self.metadata is None:
            self.metadata = {}


@dataclass
class DocumentDTO:
    """Document DTO: the API-facing view of a stored document."""

    id: str
    title: str
    filename: str
    file_type: str
    file_size: int
    department: str
    status: str
    uploaded_at: str  # ISO-8601 timestamp string
    indexed_at: Optional[str] = None
    metadata: Optional[dict] = None

    def __post_init__(self) -> None:
        # FIX: previously DocumentDTO alone skipped this normalization,
        # leaving metadata as None instead of {} like the other DTOs.
        if self.metadata is None:
            self.metadata = {}
app/application/services/__init__.py ADDED
@@ -0,0 +1,5 @@
 
 
 
 
 
 
1
+ """Application services"""
2
+
3
+ from app.application.services.chunking_service import ChunkingService
4
+
5
+ __all__ = ["ChunkingService"]
app/application/services/chunking_service.py ADDED
@@ -0,0 +1,97 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ """
2
+ Application Layer - Chunking Service
3
+
4
+ Handles intelligent document chunking.
5
+ """
6
+ import re
7
+ from typing import List
8
+ from uuid import UUID
9
+
10
+ from app.domain.entities import DocumentChunk
11
+
12
+
13
class ChunkingService:
    """Service for chunking documents intelligently.

    Splits text on paragraph boundaries and packs paragraphs into chunks of
    roughly ``chunk_size`` tokens, carrying about ``chunk_overlap`` tokens of
    trailing context into each new chunk. Token counts are approximated as
    len(text) // 4.
    """

    def __init__(
        self,
        chunk_size: int = 800,
        chunk_overlap: int = 100,
        min_chunk_size: int = 100,
    ):
        # All sizes are approximate token counts (see _count_tokens).
        self.chunk_size = chunk_size
        self.chunk_overlap = chunk_overlap
        self.min_chunk_size = min_chunk_size

    async def chunk_text(
        self, text: str, document_id: UUID, metadata: Optional[dict] = None
    ) -> List[DocumentChunk]:
        """Chunk *text* into DocumentChunk entities for *document_id*.

        Chunks shorter than ``min_chunk_size`` tokens are dropped.
        """
        if metadata is None:
            metadata = {}

        # 1. Split by paragraphs
        paragraphs = self._split_paragraphs(text)

        # 2. Combine paragraphs into chunks
        chunks = []
        current_chunk = []
        current_size = 0

        for para in paragraphs:
            para_tokens = self._count_tokens(para)

            if current_size + para_tokens > self.chunk_size and current_chunk:
                # Flush current chunk
                chunk_text = "\n\n".join(current_chunk)
                chunks.append(chunk_text)

                # Start new chunk, seeded with overlap from the previous one
                overlap_text = self._get_overlap(current_chunk)
                current_chunk = [overlap_text, para] if overlap_text else [para]
                current_size = self._count_tokens("\n\n".join(current_chunk))
            else:
                current_chunk.append(para)
                current_size += para_tokens

        # Flush remaining
        if current_chunk:
            chunks.append("\n\n".join(current_chunk))

        # 3. Create DocumentChunk entities
        return [
            DocumentChunk(
                document_id=document_id,
                chunk_index=idx,
                content=chunk,
                token_count=self._count_tokens(chunk),
                # FIX: copy per chunk — previously all chunks shared one
                # metadata dict, so mutating one mutated them all.
                metadata=dict(metadata),
            )
            for idx, chunk in enumerate(chunks)
            if self._count_tokens(chunk) >= self.min_chunk_size
        ]

    def _split_paragraphs(self, text: str) -> List[str]:
        """Split text into non-empty, stripped paragraphs on blank lines."""
        paragraphs = re.split(r"\n\s*\n", text)
        return [p.strip() for p in paragraphs if p.strip()]

    def _count_tokens(self, text: str) -> int:
        """Approximate token count (1 token ≈ 4 chars)."""
        return len(text) // 4

    def _get_overlap(self, chunks: List[str]) -> str:
        """Return trailing words of the last chunk to seed the next chunk."""
        if not chunks:
            return ""

        last_chunk = chunks[-1]
        tokens = last_chunk.split()
        # Rough words-from-tokens estimate; deliberately approximate.
        overlap_tokens = int(self.chunk_overlap * 0.25)

        if len(tokens) <= overlap_tokens:
            return last_chunk

        return " ".join(tokens[-overlap_tokens:])
app/application/use_cases/__init__.py ADDED
@@ -0,0 +1 @@
 
 
1
+ """Use cases"""
app/application/use_cases/document_indexing.py ADDED
@@ -0,0 +1,129 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ """
2
+ Application Layer - Document Indexing Use Case
3
+
4
+ Handles document upload and indexing into the knowledge base.
5
+ """
6
+ import hashlib
7
+ from pathlib import Path
8
+ from typing import List
9
+ from uuid import uuid4
10
+
11
+ from app.application.dto import DocumentDTO, DocumentUploadDTO
12
+ from app.domain.entities import Document, DocumentChunk, DocumentStatus, DocumentType
13
+ from app.domain.interfaces import IChunkRepository, IDocumentRepository, IEmbedder
14
+
15
+
16
class DocumentIndexingUseCase:
    """Use case for indexing documents into the knowledge base.

    Pipeline: persist document -> mark processing -> extract text ->
    chunk -> embed -> persist chunks -> mark indexed (or failed on error).
    """

    def __init__(
        self,
        document_repository: IDocumentRepository,
        chunk_repository: IChunkRepository,
        embedder: IEmbedder,
        chunking_service: "ChunkingService",
    ):
        self.document_repository = document_repository
        self.chunk_repository = chunk_repository
        self.embedder = embedder
        self.chunking_service = chunking_service

    async def execute(self, upload_dto: DocumentUploadDTO) -> DocumentDTO:
        """Run the full indexing pipeline for one uploaded document."""

        # 1. Detect file type from the filename extension
        file_type = self._detect_file_type(upload_dto.filename)

        # 2. Create document entity
        document = Document(
            title=self._extract_title(upload_dto.filename),
            filename=upload_dto.filename,
            file_type=file_type,
            file_size=len(upload_dto.content),
            storage_path=self._generate_storage_path(upload_dto.filename),
            department=upload_dto.department,
            metadata=upload_dto.metadata,
        )

        # 3. Save document to repository
        saved_document = await self.document_repository.create(document)

        # 4. Mark as processing so the state change is visible immediately
        saved_document.mark_as_processing()
        await self.document_repository.update(saved_document)

        try:
            # 5. Extract text content
            text_content = await self._extract_text(upload_dto.content, file_type)

            # 6. Chunk the document
            chunks_data = await self.chunking_service.chunk_text(
                text=text_content, document_id=saved_document.id, metadata=upload_dto.metadata
            )

            # 7. Generate embeddings
            texts = [chunk.content for chunk in chunks_data]
            embeddings = await self.embedder.embed_texts(texts)
            # NOTE(review): the embeddings are computed but not upserted
            # here — the comment below defers vector storage to the
            # infrastructure layer; confirm something actually consumes them.

            # 8. Store chunks with embeddings
            # (Vector storage will be handled in infrastructure layer)
            chunks = await self.chunk_repository.create_bulk(chunks_data)

            # 9. Mark document as indexed
            saved_document.mark_as_indexed()
            await self.document_repository.update(saved_document)

            # 10. Return DTO
            return self._to_dto(saved_document)

        except Exception:
            # Mark as failed before propagating, so the document never
            # stays stuck in the "processing" state.
            saved_document.mark_as_failed()
            await self.document_repository.update(saved_document)
            raise

    def _detect_file_type(self, filename: str) -> DocumentType:
        """Map a filename extension to a DocumentType (default: TXT)."""
        suffix = Path(filename).suffix.lower()
        type_map = {
            ".pdf": DocumentType.PDF,
            ".docx": DocumentType.DOCX,
            ".txt": DocumentType.TXT,
            ".md": DocumentType.MD,
            ".html": DocumentType.HTML,
        }
        return type_map.get(suffix, DocumentType.TXT)

    def _extract_title(self, filename: str) -> str:
        """Derive a human-readable title from the filename stem."""
        return Path(filename).stem.replace("_", " ").replace("-", " ").title()

    def _generate_storage_path(self, filename: str) -> str:
        """Generate a unique storage path for *filename*.

        FIX: the filename parameter was previously ignored (placeholder
        text in the hash input and path). A fresh UUID is hashed together
        with the filename so identical filenames never collide; the 2-char
        prefix fans files out across subdirectories.
        """
        file_hash = hashlib.md5(f"{uuid4()}{filename}".encode()).hexdigest()
        return f"documents/{file_hash[:2]}/{file_hash}/{filename}"

    async def _extract_text(self, content: bytes, file_type: DocumentType) -> str:
        """Extract text from raw document bytes.

        Simplified placeholder — production should use proper parsers
        (PyPDF2, python-docx, BeautifulSoup, etc.).
        """
        if file_type == DocumentType.TXT or file_type == DocumentType.MD:
            return content.decode("utf-8")
        else:
            # Best-effort decode for binary formats until real extraction exists.
            return content.decode("utf-8", errors="ignore")

    def _to_dto(self, document: Document) -> DocumentDTO:
        """Convert a Document entity to its API-facing DTO."""
        return DocumentDTO(
            id=str(document.id),
            title=document.title,
            filename=document.filename,
            file_type=document.file_type.value,
            file_size=document.file_size,
            department=document.department,
            status=document.status.value,
            uploaded_at=document.uploaded_at.isoformat(),
            indexed_at=document.indexed_at.isoformat() if document.indexed_at else None,
            metadata=document.metadata,
        )
app/application/use_cases/query_processing.py ADDED
@@ -0,0 +1,136 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ """
2
+ Application Layer - Query Processing Use Case
3
+
4
+ Orchestrates the RAG pipeline for answering user queries.
5
+ """
6
+ import time
7
+ from typing import List
8
+
9
+ from app.application.dto import QueryDTO, QueryResponseDTO, SourceDTO
10
+ from app.domain.entities import Query, QueryRequest, Source
11
+ from app.domain.interfaces import ILLM, ICache, IPromptBuilder, IReranker, IRetriever
12
+
13
+
14
class QueryProcessingUseCase:
    """Use case for processing user queries through the RAG pipeline.

    Pipeline: cache lookup -> hybrid retrieval -> reranking -> prompt
    building -> LLM generation -> response assembly -> cache store.
    """

    def __init__(
        self,
        retriever: IRetriever,
        reranker: IReranker,
        llm: ILLM,
        prompt_builder: IPromptBuilder,
        cache: ICache,
    ):
        self.retriever = retriever
        self.reranker = reranker
        self.llm = llm
        self.prompt_builder = prompt_builder
        self.cache = cache

    async def execute(self, query_dto: QueryDTO) -> QueryResponseDTO:
        """Execute the query processing pipeline and return the answer DTO."""
        import hashlib  # stdlib; local import keeps module-level deps unchanged

        start_time = time.time()

        # 1. Create the domain-level query request
        query_request = QueryRequest(
            query_text=query_dto.query_text,
            department=query_dto.department,
            user_id=query_dto.user_id,
            session_id=query_dto.session_id,
            top_k=query_dto.top_k,
            temperature=query_dto.temperature,
            max_tokens=query_dto.max_tokens,
            filters=query_dto.filters,
        )

        # 2. Check semantic cache.
        # FIX: built-in hash() is salted per process (PYTHONHASHSEED), so
        # its value differs across workers and restarts, making cached
        # entries unreachable. Use a deterministic digest instead.
        digest = hashlib.sha256(query_dto.query_text.encode("utf-8")).hexdigest()
        cache_key = f"query:{digest}:{query_dto.department}"
        cached_response = await self.cache.get(cache_key)
        if cached_response:
            return cached_response

        # 3. Retrieve relevant documents, always scoped to the department
        filters = {"department": query_dto.department}
        if query_dto.filters:
            filters.update(query_dto.filters)

        retrieval_results = await self.retriever.hybrid_search(
            query=query_dto.query_text,
            top_k=100,  # wide initial retrieval; the reranker narrows it below
            alpha=0.5,  # equal weighting of dense and sparse scores
            filters=filters,
        )

        # 4. Rerank and keep only the caller-requested top_k
        reranked_results = await self.reranker.rerank(
            query=query_dto.query_text, results=retrieval_results, top_k=query_dto.top_k
        )

        # 5. Build context passages for the prompt
        context = [result.content for result in reranked_results]

        # 6. Build prompt with a department-specific system message
        messages = self.prompt_builder.build_rag_prompt(
            query=query_dto.query_text,
            context=context,
            system_prompt=self._get_system_prompt(query_dto.department),
        )

        # 7. Generate answer
        llm_response = await self.llm.generate(
            messages=messages,
            temperature=query_dto.temperature,
            max_tokens=query_dto.max_tokens,
        )

        # 8. Create source citations (content truncated for response size)
        sources = [
            SourceDTO(
                title=f"Document {result.document_id}",
                content=result.content[:500],  # Truncate for response
                relevance_score=result.score,
                document_id=result.document_id,
                chunk_index=result.chunk_index,
                metadata=result.metadata,
            )
            for result in reranked_results
        ]

        # 9. Calculate metrics
        processing_time_ms = int((time.time() - start_time) * 1000)

        # 10. Build response
        response = QueryResponseDTO(
            query_id=str(query_request.id) if hasattr(query_request, "id") else "temp",
            answer=llm_response.content,
            sources=sources,
            confidence=self._calculate_confidence(reranked_results),
            processing_time_ms=processing_time_ms,
            tokens_used=llm_response.tokens_used,
            model=llm_response.model,
        )

        # 11. Cache response for one hour.
        # NOTE(review): the DTO object is stored as-is; assumes the ICache
        # implementation handles serialization — confirm against redis_cache.
        await self.cache.set(cache_key, response, ttl=3600)

        return response

    def _get_system_prompt(self, department: str) -> str:
        """Return the department-specific system prompt (General fallback)."""
        prompts = {
            "HR": "You are a helpful HR assistant for employee onboarding. Provide clear, accurate information about HR policies, benefits, and procedures.",
            "IT": "You are an IT support assistant for new employees. Help with technical setup, access, and IT policies.",
            "Legal": "You are a legal compliance assistant. Provide information about legal policies, regulations, and compliance requirements.",
            "Finance": "You are a finance assistant. Help with expense policies, financial procedures, and budget information.",
            "General": "You are a helpful corporate onboarding assistant. Provide accurate information to help new employees integrate successfully.",
        }
        return prompts.get(department, prompts["General"])

    def _calculate_confidence(self, results: List) -> float:
        """Confidence = mean relevance score of the top 3 results (0.0 if none)."""
        if not results:
            return 0.0
        top_scores = [r.score for r in results[:3]]
        return sum(top_scores) / len(top_scores) if top_scores else 0.0
app/core/__init__.py ADDED
@@ -0,0 +1 @@
 
 
1
+ """Core utilities"""
app/core/config.py ADDED
@@ -0,0 +1,121 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ """
2
+ Core - Configuration Management
3
+
4
+ Handles application configuration using Pydantic Settings.
5
+ """
6
+ from functools import lru_cache
7
+ from typing import List
8
+
9
+ from pydantic import Field
10
+ from pydantic_settings import BaseSettings, SettingsConfigDict
11
+
12
+
13
class Settings(BaseSettings):
    """Application settings.

    Values are loaded (case-insensitively) from environment variables and a
    local ``.env`` file; every field below is a development default.
    """

    model_config = SettingsConfigDict(env_file=".env", case_sensitive=False)

    # Application
    app_name: str = Field(default="RAG Onboarding Backend")
    app_version: str = Field(default="1.0.0")
    environment: str = Field(default="development")
    debug: bool = Field(default=True)
    log_level: str = Field(default="INFO")

    # Server
    host: str = Field(default="0.0.0.0")
    port: int = Field(default=8000)
    workers: int = Field(default=4)

    # Database (async driver; see alembic.ini for the migration URL)
    database_url: str = Field(
        default="postgresql+asyncpg://postgres:postgres@localhost:5432/rag_onboarding"
    )
    database_pool_size: int = Field(default=20)
    database_max_overflow: int = Field(default=0)

    # Redis
    redis_url: str = Field(default="redis://localhost:6379/0")
    redis_cache_ttl: int = Field(default=3600)

    # Qdrant (vector size must match embedding_dimension below)
    qdrant_url: str = Field(default="http://localhost:6333")
    qdrant_api_key: str = Field(default="")
    qdrant_collection_name: str = Field(default="onboarding_documents")
    qdrant_vector_size: int = Field(default=384)

    # RabbitMQ / Celery
    rabbitmq_url: str = Field(default="amqp://guest:guest@localhost:5672/")
    celery_broker_url: str = Field(default="redis://localhost:6379/1")
    celery_result_backend: str = Field(default="redis://localhost:6379/2")

    # Gemini (primary LLM)
    gemini_api_key: str = Field(default="")
    gemini_model: str = Field(default="gemini-2.0-flash")
    gemini_temperature: float = Field(default=0.7)
    gemini_max_tokens: int = Field(default=2048)

    # OpenAI (fallback)
    openai_api_key: str = Field(default="")
    openai_model: str = Field(default="gpt-4-turbo-preview")

    # Embeddings
    embedding_model: str = Field(default="sentence-transformers/all-MiniLM-L6-v2")
    embedding_dimension: int = Field(default=384)
    embedding_batch_size: int = Field(default=32)

    # RAG Configuration
    rag_initial_k: int = Field(default=100)
    rag_final_k: int = Field(default=10)
    rag_min_score: float = Field(default=0.7)
    rag_search_type: str = Field(default="hybrid")
    rag_hybrid_alpha: float = Field(default=0.5)
    rag_max_context_tokens: int = Field(default=4000)

    # Reranking
    rerank_model: str = Field(default="cross-encoder/ms-marco-MiniLM-L-12-v2")
    use_reranking: bool = Field(default=True)

    # Caching (TTLs in seconds)
    enable_semantic_cache: bool = Field(default=True)
    cache_embedding_ttl: int = Field(default=86400)
    cache_retrieval_ttl: int = Field(default=3600)
    cache_generation_ttl: int = Field(default=1800)

    # Circuit Breaker
    circuit_breaker_failure_threshold: int = Field(default=5)
    circuit_breaker_recovery_timeout: int = Field(default=60)

    # Retry Policy
    retry_max_attempts: int = Field(default=3)
    retry_wait_exponential_multiplier: int = Field(default=1)
    retry_wait_exponential_max: int = Field(default=10)

    # Rate Limiting
    rate_limit_enabled: bool = Field(default=True)
    rate_limit_per_minute: int = Field(default=60)
    rate_limit_per_hour: int = Field(default=1000)

    # Monitoring
    prometheus_port: int = Field(default=9090)
    enable_tracing: bool = Field(default=True)
    jaeger_agent_host: str = Field(default="localhost")
    jaeger_agent_port: int = Field(default=6831)
    trace_sample_rate: float = Field(default=0.1)

    # CORS
    cors_origins: List[str] = Field(
        default=["http://localhost:3000", "http://localhost:8000"]
    )
    cors_allow_credentials: bool = Field(default=True)

    # Security
    # NOTE(review): insecure placeholder default — must be overridden via
    # environment in any deployed environment.
    secret_key: str = Field(default="your-secret-key-change-in-production")
    algorithm: str = Field(default="HS256")
    access_token_expire_minutes: int = Field(default=30)


@lru_cache
def get_settings() -> Settings:
    """Return the process-wide Settings singleton (cached after first call)."""
    return Settings()
app/core/logging.py ADDED
@@ -0,0 +1,75 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ """
2
+ Core - Structured Logging
3
+
4
+ JSON structured logging with correlation IDs.
5
+ """
6
+ import logging
7
+ import sys
8
+ from contextvars import ContextVar
9
+ from datetime import datetime
10
+ from typing import Any, Dict
11
+
12
+ import structlog
13
+
14
+ # Correlation ID context variable
15
+ correlation_id_var: ContextVar[str] = ContextVar("correlation_id", default="")
16
+
17
+
18
def get_correlation_id() -> str:
    """Return the correlation ID bound to the current context ("" if unset)."""
    return correlation_id_var.get()
21
+
22
+
23
def set_correlation_id(correlation_id: str) -> None:
    """Bind *correlation_id* to the current execution context.

    Uses a ContextVar, so the value is isolated per task/request.
    """
    correlation_id_var.set(correlation_id)
26
+
27
+
28
def add_correlation_id(logger: Any, method_name: str, event_dict: Dict) -> Dict:
    """Structlog processor: attach the context-local correlation ID to the event.

    The ID is "" when no request context has set one.
    """
    event_dict["correlation_id"] = get_correlation_id()
    return event_dict
32
+
33
+
34
def add_service_info(logger: Any, method_name: str, event_dict: Dict) -> Dict:
    """Structlog processor: stamp every log event with static service metadata.

    Mutates and returns *event_dict*, per the structlog processor contract.
    """
    event_dict.update(service="rag-onboarding-backend", version="1.0.0")
    return event_dict
39
+
40
+
41
def setup_logging(log_level: str = "INFO") -> None:
    """Configure structlog + stdlib logging for JSON output on stdout.

    Every event is enriched with a timestamp, logger name, level,
    correlation ID and static service info, then rendered as one JSON
    object per line. Call once at application startup.
    """

    # Configure structlog; processor order matters — enrichment first,
    # JSONRenderer last (it consumes the event dict).
    structlog.configure(
        processors=[
            structlog.contextvars.merge_contextvars,
            structlog.stdlib.filter_by_level,
            structlog.processors.TimeStamper(fmt="iso"),
            structlog.stdlib.add_logger_name,
            structlog.stdlib.add_log_level,
            structlog.processors.StackInfoRenderer(),
            add_correlation_id,  # custom: request-scoped correlation ID
            add_service_info,    # custom: static service name/version
            structlog.processors.format_exc_info,
            structlog.processors.UnicodeDecoder(),
            structlog.processors.JSONRenderer(),
        ],
        wrapper_class=structlog.stdlib.BoundLogger,
        context_class=dict,
        logger_factory=structlog.stdlib.LoggerFactory(),
        cache_logger_on_first_use=True,
    )

    # Route plain stdlib logging through the same stream/level; structlog
    # renders the message, so the stdlib format is just "%(message)s".
    logging.basicConfig(
        format="%(message)s",
        stream=sys.stdout,
        level=getattr(logging, log_level.upper()),
    )
71
+
72
+
73
def get_logger(name: str) -> structlog.stdlib.BoundLogger:
    """Return a structlog logger configured by setup_logging().

    Typical usage: ``logger = get_logger(__name__)`` at module level.
    """
    return structlog.get_logger(name)
app/core/metrics.py ADDED
@@ -0,0 +1,98 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ """
2
+ Core - Prometheus Metrics
3
+
4
+ Application metrics for monitoring.
5
+ """
6
+ from prometheus_client import Counter, Gauge, Histogram
7
+
8
# Request metrics — labeled per endpoint so dashboards can break down traffic.
http_requests_total = Counter(
    "http_requests_total",
    "Total HTTP requests",
    ["method", "endpoint", "status"],
)

http_request_duration_seconds = Histogram(
    "http_request_duration_seconds",
    "HTTP request duration in seconds",
    ["method", "endpoint"],
    buckets=[0.01, 0.05, 0.1, 0.5, 1.0, 2.0, 5.0, 10.0],
)

# RAG Pipeline metrics — one histogram per pipeline phase.
rag_retrieval_duration_seconds = Histogram(
    "rag_retrieval_duration_seconds",
    "RAG retrieval phase duration",
    ["strategy"],
    buckets=[0.01, 0.05, 0.1, 0.2, 0.5, 1.0, 2.0],
)

rag_reranking_duration_seconds = Histogram(
    "rag_reranking_duration_seconds",
    "RAG reranking phase duration",
    buckets=[0.01, 0.05, 0.1, 0.2, 0.5, 1.0],
)

# Generation is the slowest phase, hence the wider buckets.
llm_generation_duration_seconds = Histogram(
    "llm_generation_duration_seconds",
    "LLM generation duration",
    ["model"],
    buckets=[0.5, 1.0, 2.0, 5.0, 10.0, 20.0],
)

llm_tokens_used_total = Counter(
    "llm_tokens_used_total",
    "Total LLM tokens used",
    ["model", "type"],  # type: prompt, completion
)

# Cache metrics — hit rate per cache tier.
cache_hits_total = Counter(
    "cache_hits_total",
    "Total cache hits",
    ["cache_type"],  # embedding, retrieval, generation
)

cache_misses_total = Counter(
    "cache_misses_total",
    "Total cache misses",
    ["cache_type"],
)

# Business metrics
queries_total = Counter(
    "queries_total",
    "Total queries processed",
    ["department", "status"],
)

documents_indexed_total = Counter(
    "documents_indexed_total",
    "Total documents indexed",
    ["department", "file_type"],
)

# Confidence is expected in [0, 1]; buckets cover the full range evenly.
query_confidence_score = Histogram(
    "query_confidence_score",
    "Query confidence scores",
    ["department"],
    buckets=[0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0],
)

# System metrics (gauges: current values, not monotonic counters)
active_requests = Gauge(
    "active_requests",
    "Number of active requests",
)

database_connections_active = Gauge(
    "database_connections_active",
    "Active database connections",
)

# Error metrics
errors_total = Counter(
    "errors_total",
    "Total errors",
    ["error_type", "component"],
)
app/domain/__init__.py ADDED
@@ -0,0 +1 @@
 
 
1
+ """Domain entities"""
app/domain/entities/__init__.py ADDED
@@ -0,0 +1,15 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ """Domain entities"""
2
+
3
+ from app.domain.entities.document import Document, DocumentChunk, DocumentStatus, DocumentType
4
+ from app.domain.entities.query import Query, QueryRequest, QueryStatus, Source
5
+
6
+ __all__ = [
7
+ "Document",
8
+ "DocumentChunk",
9
+ "DocumentStatus",
10
+ "DocumentType",
11
+ "Query",
12
+ "QueryRequest",
13
+ "QueryStatus",
14
+ "Source",
15
+ ]
app/domain/entities/document.py ADDED
@@ -0,0 +1,87 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ """
2
+ Domain Layer - Document Entity
3
+
4
+ Represents a document in the knowledge base.
5
+ """
6
+ from dataclasses import dataclass, field
7
+ from datetime import datetime
8
+ from enum import Enum
9
+ from typing import Optional
10
+ from uuid import UUID, uuid4
11
+
12
+
13
class DocumentStatus(str, Enum):
    """Document processing status.

    Lifecycle: PENDING -> PROCESSING -> INDEXED (success) or FAILED.
    Inherits str so values serialize directly to JSON/DB columns.
    """

    PENDING = "pending"
    PROCESSING = "processing"
    INDEXED = "indexed"
    FAILED = "failed"
20
+
21
+
22
class DocumentType(str, Enum):
    """Supported document types (values match lowercase file extensions)."""

    PDF = "pdf"
    DOCX = "docx"
    TXT = "txt"
    MD = "md"
    HTML = "html"
30
+
31
+
32
@dataclass
class Document:
    """Document entity - core business object.

    Tracks a file through the indexing lifecycle:
    PENDING -> PROCESSING -> INDEXED (or FAILED).
    """

    title: str
    filename: str
    file_type: DocumentType
    file_size: int
    storage_path: str
    department: str
    id: UUID = field(default_factory=uuid4)
    status: DocumentStatus = DocumentStatus.PENDING
    upload_session_id: Optional[str] = None
    uploaded_at: datetime = field(default_factory=datetime.utcnow)
    indexed_at: Optional[datetime] = None
    metadata: dict = field(default_factory=dict)
    created_at: datetime = field(default_factory=datetime.utcnow)
    updated_at: datetime = field(default_factory=datetime.utcnow)

    def _transition(self, new_status: DocumentStatus) -> None:
        """Apply a status change and bump the update timestamp."""
        self.status = new_status
        self.updated_at = datetime.utcnow()

    def mark_as_processing(self) -> None:
        """Mark document as being processed."""
        self._transition(DocumentStatus.PROCESSING)

    def mark_as_indexed(self) -> None:
        """Mark document as successfully indexed (records indexing time)."""
        self.indexed_at = datetime.utcnow()
        self._transition(DocumentStatus.INDEXED)

    def mark_as_failed(self) -> None:
        """Mark document processing as failed."""
        self._transition(DocumentStatus.FAILED)

    def is_indexed(self) -> bool:
        """True once the document has been successfully indexed."""
        return self.status == DocumentStatus.INDEXED
70
+
71
+
72
@dataclass
class DocumentChunk:
    """Document chunk - piece of a document used for vector search."""

    document_id: UUID
    chunk_index: int
    content: str
    token_count: int
    id: UUID = field(default_factory=uuid4)
    vector_id: Optional[str] = None
    metadata: dict = field(default_factory=dict)
    created_at: datetime = field(default_factory=datetime.utcnow)

    def set_vector_id(self, vector_id: str) -> None:
        """Record the ID of this chunk's point in the vector store."""
        self.vector_id = vector_id
app/domain/entities/query.py ADDED
@@ -0,0 +1,113 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ """
2
+ Domain Layer - Query Entity
3
+
4
+ Represents a user query and its result.
5
+ """
6
+ from dataclasses import dataclass, field
7
+ from datetime import datetime
8
+ from enum import Enum
9
+ from typing import List, Optional
10
+ from uuid import UUID, uuid4
11
+
12
+
13
class QueryStatus(str, Enum):
    """Query processing status.

    Lifecycle: PENDING -> PROCESSING -> COMPLETED or FAILED.
    Inherits str so values serialize directly to JSON/DB columns.
    """

    PENDING = "pending"
    PROCESSING = "processing"
    COMPLETED = "completed"
    FAILED = "failed"
20
+
21
+
22
@dataclass
class Source:
    """Retrieved source/citation backing a query answer."""

    title: str
    content: str
    relevance_score: float
    document_id: UUID
    chunk_index: int
    metadata: dict = field(default_factory=dict)

    def to_dict(self) -> dict:
        """Serialize to a JSON-friendly dict (UUID rendered as str)."""
        payload = dict(
            title=self.title,
            content=self.content,
            relevance_score=self.relevance_score,
            document_id=str(self.document_id),
            chunk_index=self.chunk_index,
            metadata=self.metadata,
        )
        return payload
43
+
44
+
45
@dataclass
class Query:
    """Query entity - represents one user question and its lifecycle.

    Created PENDING, moved to PROCESSING, and finally COMPLETED (with
    answer, sources and stats) or FAILED.
    """

    query_text: str
    department: str
    user_id: Optional[str] = None
    session_id: Optional[str] = None
    id: UUID = field(default_factory=uuid4)
    status: QueryStatus = QueryStatus.PENDING
    answer: Optional[str] = None  # populated only on completion
    sources: List[Source] = field(default_factory=list)
    confidence: float = 0.0  # answer confidence score
    duration_ms: int = 0  # end-to-end processing time
    tokens_used: int = 0
    model: Optional[str] = None  # LLM model that produced the answer
    created_at: datetime = field(default_factory=datetime.utcnow)
    completed_at: Optional[datetime] = None

    def mark_as_processing(self) -> None:
        """Mark query as being processed."""
        self.status = QueryStatus.PROCESSING

    def mark_as_completed(
        self,
        answer: str,
        sources: List[Source],
        confidence: float,
        duration_ms: int,
        tokens_used: int,
        model: str,
    ) -> None:
        """Mark query as completed, recording the answer and its stats."""
        self.status = QueryStatus.COMPLETED
        self.answer = answer
        self.sources = sources
        self.confidence = confidence
        self.duration_ms = duration_ms
        self.tokens_used = tokens_used
        self.model = model
        self.completed_at = datetime.utcnow()

    def mark_as_failed(self) -> None:
        """Mark query as failed (completed_at still records when it ended)."""
        self.status = QueryStatus.FAILED
        self.completed_at = datetime.utcnow()
91
+
92
+
93
@dataclass
class QueryRequest:
    """Query request from user - value object.

    Validated on construction: raises ValueError for empty/blank text or
    out-of-range tuning parameters.
    """

    query_text: str
    department: str
    user_id: Optional[str] = None
    session_id: Optional[str] = None
    top_k: int = 10
    temperature: float = 0.7
    max_tokens: int = 2048
    filters: dict = field(default_factory=dict)

    def __post_init__(self) -> None:
        """Validate the request; raise ValueError on any bad field."""
        if not self.query_text or not self.query_text.strip():
            raise ValueError("query_text cannot be empty")
        if not 1 <= self.top_k <= 50:
            raise ValueError("top_k must be between 1 and 50")
        if not 0 <= self.temperature <= 1:
            raise ValueError("temperature must be between 0 and 1")
app/domain/interfaces/__init__.py ADDED
@@ -0,0 +1,20 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ """Domain interfaces"""
2
+
3
+ from app.domain.interfaces.cache import ICache
4
+ from app.domain.interfaces.llm import ILLM, IPromptBuilder, LLMMessage, LLMResponse
5
+ from app.domain.interfaces.repository import IChunkRepository, IDocumentRepository
6
+ from app.domain.interfaces.retriever import IEmbedder, IReranker, IRetriever, RetrievalResult
7
+
8
+ __all__ = [
9
+ "ICache",
10
+ "IChunkRepository",
11
+ "IDocumentRepository",
12
+ "IEmbedder",
13
+ "ILLM",
14
+ "IPromptBuilder",
15
+ "IReranker",
16
+ "IRetriever",
17
+ "LLMMessage",
18
+ "LLMResponse",
19
+ "RetrievalResult",
20
+ ]
app/domain/interfaces/cache.py ADDED
@@ -0,0 +1,36 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ """
2
+ Domain Layer - Cache Interface
3
+
4
+ Defines contract for caching implementations.
5
+ """
6
+ from abc import ABC, abstractmethod
7
+ from typing import Any, Optional
8
+
9
+
10
class ICache(ABC):
    """Interface (port) for cache implementations.

    All operations are async; implementations own serialization and
    connection management.
    """

    @abstractmethod
    async def get(self, key: str) -> Optional[Any]:
        """Return the cached value for *key*, or None on a miss."""
        pass

    @abstractmethod
    async def set(self, key: str, value: Any, ttl: Optional[int] = None) -> bool:
        """Store *value* under *key*; *ttl* is seconds (None = no expiry)."""
        pass

    @abstractmethod
    async def delete(self, key: str) -> bool:
        """Delete *key*; True if a key was actually removed."""
        pass

    @abstractmethod
    async def exists(self, key: str) -> bool:
        """Check whether *key* is present."""
        pass

    @abstractmethod
    async def clear(self, pattern: str = "*") -> int:
        """Delete all keys matching *pattern*; return the number removed."""
        pass
app/domain/interfaces/llm.py ADDED
@@ -0,0 +1,72 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ """
2
+ Domain Layer - LLM Interface
3
+
4
+ Defines contract for LLM implementations.
5
+ """
6
+ from abc import ABC, abstractmethod
7
+ from dataclasses import dataclass
8
+ from typing import AsyncIterator, List, Optional
9
+
10
+
11
@dataclass
class LLMMessage:
    """One chat message in an LLM conversation."""

    role: str  # one of: system, user, assistant
    content: str
17
+
18
+
19
@dataclass
class LLMResponse:
    """Result of one LLM generation call."""

    content: str  # generated text
    model: str  # model identifier that produced the response
    tokens_used: int  # prompt + completion tokens (may be approximate)
    finish_reason: str  # why generation stopped (e.g. "stop")
27
+
28
+
29
class ILLM(ABC):
    """Interface (port) for LLM implementations."""

    @abstractmethod
    async def generate(
        self,
        messages: List[LLMMessage],
        temperature: float = 0.7,
        max_tokens: int = 2048,
        stream: bool = False,
    ) -> LLMResponse:
        """Generate a complete response for *messages*."""
        pass

    @abstractmethod
    async def generate_stream(
        self,
        messages: List[LLMMessage],
        temperature: float = 0.7,
        max_tokens: int = 2048,
    ) -> AsyncIterator[str]:
        """Yield the response incrementally as text chunks."""
        pass

    @abstractmethod
    def get_model_name(self) -> str:
        """Return the underlying model identifier."""
        pass
57
+
58
+
59
class IPromptBuilder(ABC):
    """Interface for assembling LLM prompts from RAG inputs."""

    @abstractmethod
    def build_rag_prompt(
        self, query: str, context: List[str], system_prompt: Optional[str] = None
    ) -> List[LLMMessage]:
        """Build the RAG prompt from the user *query* and retrieved *context*."""
        pass

    @abstractmethod
    def build_query_expansion_prompt(self, query: str) -> List[LLMMessage]:
        """Build a prompt asking the LLM for alternative phrasings of *query*."""
        pass
app/domain/interfaces/repository.py ADDED
@@ -0,0 +1,60 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ """
2
+ Domain Layer - Interfaces (Ports)
3
+
4
+ Defines contracts for infrastructure implementations.
5
+ """
6
+ from abc import ABC, abstractmethod
7
+ from typing import List, Optional
8
+ from uuid import UUID
9
+
10
+ from app.domain.entities import Document, DocumentChunk
11
+
12
+
13
class IDocumentRepository(ABC):
    """Repository interface for document persistence."""

    @abstractmethod
    async def create(self, document: Document) -> Document:
        """Persist a new document and return the stored entity."""
        pass

    @abstractmethod
    async def get_by_id(self, document_id: UUID) -> Optional[Document]:
        """Fetch a document by ID; None if it does not exist."""
        pass

    @abstractmethod
    async def update(self, document: Document) -> Document:
        """Persist changes to an existing document."""
        pass

    @abstractmethod
    async def list_by_department(
        self, department: str, skip: int = 0, limit: int = 100
    ) -> List[Document]:
        """List a department's documents with skip/limit pagination."""
        pass

    @abstractmethod
    async def delete(self, document_id: UUID) -> bool:
        """Delete a document; True if one was removed."""
        pass
42
+
43
+
44
class IChunkRepository(ABC):
    """Repository interface for document chunks."""

    @abstractmethod
    async def create_bulk(self, chunks: List[DocumentChunk]) -> List[DocumentChunk]:
        """Persist many chunks in one operation and return them."""
        pass

    @abstractmethod
    async def get_by_document_id(self, document_id: UUID) -> List[DocumentChunk]:
        """Return every chunk belonging to one document."""
        pass

    @abstractmethod
    async def delete_by_document_id(self, document_id: UUID) -> int:
        """Delete all of a document's chunks; return the number removed."""
        pass
app/domain/interfaces/retriever.py ADDED
@@ -0,0 +1,75 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ """
2
+ Domain Layer - Retriever Interface
3
+
4
+ Defines contract for document retrieval implementations.
5
+ """
6
+ from abc import ABC, abstractmethod
7
+ from dataclasses import dataclass
8
+ from typing import List, Optional
9
+
10
+
11
@dataclass
class RetrievalResult:
    """Single retrieval hit from the vector store."""

    content: str  # chunk text
    score: float  # similarity/relevance score from the retriever
    document_id: str  # owning document (string form)
    chunk_index: int  # position of the chunk within its document
    metadata: dict  # arbitrary payload carried by the stored point
20
+
21
+
22
class IRetriever(ABC):
    """Interface (port) for document retrieval."""

    @abstractmethod
    async def search(
        self,
        query: str,
        top_k: int = 10,
        filters: Optional[dict] = None,
        min_score: float = 0.0,
    ) -> List[RetrievalResult]:
        """Semantic search; drop hits scoring below *min_score*."""
        pass

    @abstractmethod
    async def hybrid_search(
        self,
        query: str,
        top_k: int = 10,
        alpha: float = 0.5,
        filters: Optional[dict] = None,
    ) -> List[RetrievalResult]:
        """Hybrid (semantic + keyword) search; *alpha* weights the blend."""
        pass
46
+
47
+
48
class IReranker(ABC):
    """Interface for reranking first-pass retrieval results."""

    @abstractmethod
    async def rerank(
        self, query: str, results: List[RetrievalResult], top_k: int = 10
    ) -> List[RetrievalResult]:
        """Reorder *results* by relevance to *query*; keep the best *top_k*."""
        pass
57
+
58
+
59
class IEmbedder(ABC):
    """Interface for text embedding."""

    @abstractmethod
    async def embed_text(self, text: str) -> List[float]:
        """Embed a single text; returns a vector of get_dimension() floats."""
        pass

    @abstractmethod
    async def embed_texts(self, texts: List[str]) -> List[List[float]]:
        """Embed many texts at once (batching is the implementation's job)."""
        pass

    @abstractmethod
    def get_dimension(self) -> int:
        """Return the embedding vector width."""
        pass
app/infrastructure/__init__.py ADDED
@@ -0,0 +1 @@
 
 
1
+ """Infrastructure layer"""
app/infrastructure/cache/__init__.py ADDED
@@ -0,0 +1 @@
 
 
1
+ """Cache implementations"""
app/infrastructure/cache/redis_cache.py ADDED
@@ -0,0 +1,84 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ """
2
+ Infrastructure - Redis Cache Implementation
3
+ """
4
+ import json
5
+ from typing import Any, Optional
6
+
7
+ import redis.asyncio as redis
8
+
9
+ from app.domain.interfaces import ICache
10
+
11
+
12
class RedisCache(ICache):
    """Redis cache implementation (lazy connection, JSON value handling)."""

    def __init__(self, redis_url: str):
        self.redis_url = redis_url
        self._client: Optional[redis.Redis] = None

    async def _get_client(self) -> redis.Redis:
        """Lazily create and memoize the Redis client."""
        if self._client is None:
            self._client = await redis.from_url(
                self.redis_url, encoding="utf-8", decode_responses=True
            )
        return self._client

    async def get(self, key: str) -> Optional[Any]:
        """Get a value; JSON-decoded when possible, raw string otherwise."""
        client = await self._get_client()
        raw = await client.get(key)
        if raw is None:
            return None
        try:
            return json.loads(raw)
        except json.JSONDecodeError:
            # Value was stored as a plain string — return it as-is.
            return raw

    async def set(self, key: str, value: Any, ttl: Optional[int] = None) -> bool:
        """Store a value, optionally with a TTL in seconds."""
        client = await self._get_client()
        # Containers go through JSON; everything else is stringified.
        serialized = json.dumps(value) if isinstance(value, (dict, list)) else str(value)
        if ttl:
            await client.setex(key, ttl, serialized)
        else:
            await client.set(key, serialized)
        return True

    async def delete(self, key: str) -> bool:
        """Delete a key; True if one was removed."""
        client = await self._get_client()
        removed = await client.delete(key)
        return removed > 0

    async def exists(self, key: str) -> bool:
        """Check whether a key is present."""
        client = await self._get_client()
        count = await client.exists(key)
        return count > 0

    async def clear(self, pattern: str = "*") -> int:
        """Delete every key matching *pattern*; return the number removed."""
        client = await self._get_client()
        matched = [key async for key in client.scan_iter(match=pattern)]
        if not matched:
            return 0
        return await client.delete(*matched)

    async def close(self) -> None:
        """Close the Redis connection if one was opened."""
        if self._client:
            await self._client.close()
app/infrastructure/database/__init__.py ADDED
@@ -0,0 +1 @@
 
 
1
+ """Database infrastructure"""
app/infrastructure/database/models.py ADDED
@@ -0,0 +1,82 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ """
2
+ Infrastructure - Database Models (SQLAlchemy)
3
+ """
4
+ import uuid
5
+ from datetime import datetime
6
+
7
+ from sqlalchemy import Column, DateTime, Integer, String, Text, BigInteger, Index
8
+ from sqlalchemy.dialects.postgresql import JSONB, UUID
9
+ from sqlalchemy.ext.declarative import declarative_base
10
+
11
+ Base = declarative_base()
12
+
13
+
14
class DocumentModel(Base):
    """Document table model.

    FIX: ``metadata`` is a reserved attribute name in SQLAlchemy's
    declarative API — mapping a column as ``metadata = Column(...)``
    raises InvalidRequestError at import time. The JSONB column keeps
    its physical name "metadata" but is exposed as ``doc_metadata``.
    Mutable column defaults use callables so rows never share one dict.
    """

    __tablename__ = "documents"

    id = Column(UUID(as_uuid=True), primary_key=True, default=uuid.uuid4)
    title = Column(String(500), nullable=False, index=True)
    filename = Column(String(255), nullable=False)
    file_type = Column(String(50), nullable=False, index=True)
    file_size = Column(BigInteger, nullable=False)
    storage_path = Column(String(1000), nullable=False)
    department = Column(String(100), nullable=False, index=True)
    status = Column(String(50), nullable=False, default="pending", index=True)
    upload_session_id = Column(String(100), nullable=True)
    uploaded_at = Column(DateTime(timezone=True), default=datetime.utcnow, nullable=False)
    indexed_at = Column(DateTime(timezone=True), nullable=True)
    # Attribute renamed (reserved word); physical column name stays "metadata".
    doc_metadata = Column("metadata", JSONB, default=dict, nullable=False)
    created_at = Column(DateTime(timezone=True), default=datetime.utcnow, nullable=False)
    updated_at = Column(
        DateTime(timezone=True), default=datetime.utcnow, onupdate=datetime.utcnow, nullable=False
    )

    __table_args__ = (
        # Composite index for the common "documents of dept X in status Y" query.
        Index("ix_documents_department_status", "department", "status"),
        Index("ix_documents_created_at", "created_at"),
    )
40
+
41
+
42
class DocumentChunkModel(Base):
    """Document chunk table model.

    FIX: ``metadata`` is a reserved attribute name in SQLAlchemy's
    declarative API; the JSONB column is exposed as ``chunk_metadata``
    while keeping the physical column name "metadata". The default uses
    a callable (``dict``) so each row gets its own fresh dict.
    """

    __tablename__ = "document_chunks"

    id = Column(UUID(as_uuid=True), primary_key=True, default=uuid.uuid4)
    document_id = Column(UUID(as_uuid=True), nullable=False, index=True)
    chunk_index = Column(Integer, nullable=False)
    content = Column(Text, nullable=False)
    token_count = Column(Integer, nullable=False)
    vector_id = Column(String(100), nullable=True, index=True)
    chunk_metadata = Column("metadata", JSONB, default=dict, nullable=False)
    created_at = Column(DateTime(timezone=True), default=datetime.utcnow, nullable=False)

    # Composite index for fetching a document's chunks in order.
    __table_args__ = (Index("ix_chunks_document_id_index", "document_id", "chunk_index"),)
57
+
58
+
59
class QueryModel(Base):
    """Query table model.

    ``sources`` defaults to a callable (``list``) so rows never share a
    single mutable list object as their default.
    """

    __tablename__ = "queries"

    id = Column(UUID(as_uuid=True), primary_key=True, default=uuid.uuid4)
    query_text = Column(Text, nullable=False)
    department = Column(String(100), nullable=False, index=True)
    user_id = Column(String(100), nullable=True, index=True)
    session_id = Column(String(100), nullable=True, index=True)
    status = Column(String(50), nullable=False, default="pending", index=True)
    answer = Column(Text, nullable=True)
    sources = Column(JSONB, default=list, nullable=False)
    confidence = Column(Integer, default=0, nullable=False)  # Store as int (0-100)
    duration_ms = Column(Integer, default=0, nullable=False)
    tokens_used = Column(Integer, default=0, nullable=False)
    model = Column(String(100), nullable=True)
    created_at = Column(DateTime(timezone=True), default=datetime.utcnow, nullable=False, index=True)
    completed_at = Column(DateTime(timezone=True), nullable=True)

    __table_args__ = (
        # Composite indexes for per-department and per-user history queries.
        Index("ix_queries_department_created", "department", "created_at"),
        Index("ix_queries_user_created", "user_id", "created_at"),
    )
app/infrastructure/external/__init__.py ADDED
@@ -0,0 +1 @@
 
 
1
+ """External services"""
app/infrastructure/external/embedder.py ADDED
@@ -0,0 +1,31 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ """
2
+ Infrastructure - Sentence Transformers Embedding Service
3
+ """
4
+ from typing import List
5
+
6
+ from sentence_transformers import SentenceTransformer
7
+
8
+ from app.domain.interfaces import IEmbedder
9
+
10
+
11
class SentenceTransformerEmbedder(IEmbedder):
    """Sentence Transformers embedding implementation.

    NOTE(review): ``SentenceTransformer.encode`` is called synchronously
    inside these ``async`` methods, so encoding blocks the event loop;
    consider offloading to a thread pool for large batches.
    """

    def __init__(self, model_name: str = "all-MiniLM-L6-v2"):
        self.model_name = model_name
        # Loads (and on first use downloads) the model — can be slow at startup.
        self.model = SentenceTransformer(model_name)
        # Vector width reported by the loaded model.
        self.dimension = self.model.get_sentence_embedding_dimension()

    async def embed_text(self, text: str) -> List[float]:
        """Generate embedding for single text."""
        embedding = self.model.encode(text, convert_to_numpy=True)
        return embedding.tolist()

    async def embed_texts(self, texts: List[str]) -> List[List[float]]:
        """Generate embeddings for multiple texts in one batched encode call."""
        embeddings = self.model.encode(texts, convert_to_numpy=True, show_progress_bar=False)
        return embeddings.tolist()

    def get_dimension(self) -> int:
        """Get embedding dimension."""
        return self.dimension
app/infrastructure/external/gemini_llm.py ADDED
@@ -0,0 +1,87 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ """
2
+ Infrastructure - Gemini LLM Service
3
+ """
4
+ from typing import AsyncIterator, List
5
+
6
+ import google.generativeai as genai
7
+
8
+ from app.domain.interfaces import ILLM, LLMMessage, LLMResponse
9
+
10
+
11
class GeminiLLM(ILLM):
    """Gemini LLM implementation of the ILLM interface.

    Messages are flattened into a single "Role: content" text prompt
    rather than using Gemini's structured chat history.
    """

    def __init__(self, api_key: str, model_name: str = "gemini-2.0-flash"):
        # configure() sets the API key globally for the google.generativeai module.
        genai.configure(api_key=api_key)
        self.model_name = model_name
        self.model = genai.GenerativeModel(model_name)

    async def generate(
        self,
        messages: List[LLMMessage],
        temperature: float = 0.7,
        max_tokens: int = 2048,
        stream: bool = False,  # accepted for interface parity but unused here; use generate_stream()
    ) -> LLMResponse:
        """Generate a single (non-streaming) response from Gemini."""

        # Convert messages to Gemini format (one flattened text prompt)
        prompt = self._build_prompt(messages)

        # Generate
        response = await self.model.generate_content_async(
            prompt,
            generation_config=genai.types.GenerationConfig(
                temperature=temperature, max_output_tokens=max_tokens
            ),
        )

        # Count tokens (approximate: whitespace word count, not the model tokenizer)
        tokens_used = len(prompt.split()) + len(response.text.split())

        return LLMResponse(
            content=response.text,
            model=self.model_name,
            tokens_used=tokens_used,
            finish_reason="stop",  # hard-coded; the API's actual finish reason is not propagated
        )

    async def generate_stream(
        self,
        messages: List[LLMMessage],
        temperature: float = 0.7,
        max_tokens: int = 2048,
    ) -> AsyncIterator[str]:
        """Generate a streaming response from Gemini, yielding text chunks."""

        prompt = self._build_prompt(messages)

        response = await self.model.generate_content_async(
            prompt,
            generation_config=genai.types.GenerationConfig(
                temperature=temperature, max_output_tokens=max_tokens
            ),
            stream=True,
        )

        # Yield only non-empty chunks as they arrive.
        async for chunk in response:
            if chunk.text:
                yield chunk.text

    def get_model_name(self) -> str:
        """Get model name."""
        return self.model_name

    def _build_prompt(self, messages: List[LLMMessage]) -> str:
        """Flatten chat messages into one "Role: content" text prompt.

        Messages with unrecognized roles are silently dropped.
        """
        parts = []

        for msg in messages:
            if msg.role == "system":
                parts.append(f"System: {msg.content}")
            elif msg.role == "user":
                parts.append(f"User: {msg.content}")
            elif msg.role == "assistant":
                parts.append(f"Assistant: {msg.content}")

        return "\n\n".join(parts)
app/infrastructure/external/prompt_builder.py ADDED
@@ -0,0 +1,54 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ """
2
+ Infrastructure - Prompt Builder Implementation
3
+ """
4
+ from typing import List, Optional
5
+
6
+ from app.domain.interfaces import IPromptBuilder, LLMMessage
7
+
8
+
9
class DefaultPromptBuilder(IPromptBuilder):
    """Default prompt builder implementation."""

    def build_rag_prompt(
        self, query: str, context: List[str], system_prompt: Optional[str] = None
    ) -> List[LLMMessage]:
        """Build RAG prompt with query and context.

        Context snippets are numbered ([Context 1], ...) so the model can
        cite them; a grounding system prompt is used unless one is supplied.
        """

        if system_prompt is None:
            system_prompt = """You are a helpful corporate onboarding assistant.
Answer questions based ONLY on the provided context.
If the answer is not in the context, say "I don't have enough information to answer that question."
Always cite your sources by referencing the relevant context sections."""

        # Build context string: numbered snippets separated by "---" dividers
        context_str = "\n\n---\n\n".join(
            [f"[Context {i+1}]\n{ctx}" for i, ctx in enumerate(context)]
        )

        user_message = f"""Context:
{context_str}

Question: {query}

Please provide a clear, accurate answer based on the context above. Include citations."""

        return [
            LLMMessage(role="system", content=system_prompt),
            LLMMessage(role="user", content=user_message),
        ]

    def build_query_expansion_prompt(self, query: str) -> List[LLMMessage]:
        """Build prompt for query expansion (2-3 paraphrases, one per line)."""

        system_prompt = """You are a query expansion expert.
Generate 2-3 alternative phrasings of the user's question to improve retrieval.
Return only the alternative questions, one per line."""

        user_message = f"""Original question: {query}

Generate alternative phrasings:"""

        return [
            LLMMessage(role="system", content=system_prompt),
            LLMMessage(role="user", content=user_message),
        ]
app/infrastructure/external/qdrant_retriever.py ADDED
@@ -0,0 +1,124 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
"""
Infrastructure - Qdrant Vector Store
"""
from typing import Dict, List, Optional
from uuid import UUID

from qdrant_client import AsyncQdrantClient
from qdrant_client.models import Distance, PointStruct, VectorParams, Filter, FieldCondition, MatchValue

from app.domain.interfaces import IRetriever, RetrievalResult


class QdrantRetriever(IRetriever):
    """Vector-store retriever backed by an async Qdrant client."""

    def __init__(
        self,
        url: str,
        collection_name: str,
        vector_size: int = 384,
        api_key: Optional[str] = None,
    ):
        self.url = url
        self.collection_name = collection_name
        self.vector_size = vector_size
        self.client = AsyncQdrantClient(url=url, api_key=api_key)

    async def initialize_collection(self) -> None:
        """Create the collection (cosine distance) if it does not yet exist."""
        existing = await self.client.get_collections()
        known_names = {collection.name for collection in existing.collections}

        if self.collection_name in known_names:
            return
        await self.client.create_collection(
            collection_name=self.collection_name,
            vectors_config=VectorParams(size=self.vector_size, distance=Distance.COSINE),
        )

    async def search(
        self,
        query: str,
        top_k: int = 10,
        filters: Optional[dict] = None,
        min_score: float = 0.0,
    ) -> List[RetrievalResult]:
        """Text search is unsupported here: the query must be embedded first."""
        # This implementation only works with pre-computed vectors.
        raise NotImplementedError("Use search_by_vector instead")

    async def search_by_vector(
        self,
        query_vector: List[float],
        top_k: int = 10,
        filters: Optional[dict] = None,
        min_score: float = 0.0,
    ) -> List[RetrievalResult]:
        """Similarity search with a pre-computed query embedding."""

        # Translate the flat key/value filter dict into a Qdrant must-filter.
        search_filter = None
        if filters:
            must_clauses = [
                FieldCondition(key=field, match=MatchValue(value=wanted))
                for field, wanted in filters.items()
            ]
            if must_clauses:
                search_filter = Filter(must=must_clauses)

        hits = await self.client.search(
            collection_name=self.collection_name,
            query_vector=query_vector,
            limit=top_k,
            query_filter=search_filter,
            score_threshold=min_score,
        )

        # Map the raw Qdrant hits onto the domain-level result type.
        return [self._hit_to_result(hit) for hit in hits]

    @staticmethod
    def _hit_to_result(hit) -> RetrievalResult:
        """Convert one Qdrant scored point into a RetrievalResult."""
        payload = hit.payload or {}
        return RetrievalResult(
            content=payload.get("content", ""),
            score=hit.score,
            document_id=payload.get("document_id", ""),
            chunk_index=payload.get("chunk_index", 0),
            metadata=payload.get("metadata", {}),
        )

    async def hybrid_search(
        self,
        query: str,
        top_k: int = 10,
        alpha: float = 0.5,
        filters: Optional[dict] = None,
    ) -> List[RetrievalResult]:
        """Hybrid (semantic + keyword) search is not implemented here."""
        # Simplified - in production combine with keyword search.
        raise NotImplementedError("Hybrid search requires integration with keyword search")

    async def upsert_points(
        self,
        points: List[Dict],
    ) -> None:
        """Upsert points into the collection.

        Each dict must carry ``id``, ``vector`` and ``payload`` keys.
        """
        structured = [
            PointStruct(id=item["id"], vector=item["vector"], payload=item["payload"])
            for item in points
        ]

        await self.client.upsert(
            collection_name=self.collection_name,
            points=structured,
        )
app/infrastructure/external/simple_reranker.py ADDED
@@ -0,0 +1,20 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
"""
Infrastructure - Simple Reranker (Placeholder)

In production, use cross-encoder reranker.
"""
from typing import List

from app.domain.interfaces import IReranker, RetrievalResult


class SimpleReranker(IReranker):
    """Score-order reranker: ignores the query and trusts retrieval scores."""

    async def rerank(
        self, query: str, results: List[RetrievalResult], top_k: int = 10
    ) -> List[RetrievalResult]:
        """Return the top_k results ordered by descending retrieval score.

        The query is intentionally unused; a real reranker would score
        query/result pairs with a cross-encoder.
        """
        return sorted(results, key=lambda item: item.score, reverse=True)[:top_k]
app/infrastructure/repositories/__init__.py ADDED
@@ -0,0 +1 @@
 
 
1
+ """Repositories"""
app/infrastructure/repositories/postgres_repository.py ADDED
@@ -0,0 +1,178 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
"""
Infrastructure - PostgreSQL Repository Implementation
"""
from typing import List, Optional
from uuid import UUID

from sqlalchemy import select
from sqlalchemy.ext.asyncio import AsyncSession

from app.domain.entities import Document, DocumentChunk, DocumentStatus, DocumentType
from app.domain.interfaces import IChunkRepository, IDocumentRepository
from app.infrastructure.database.models import DocumentChunkModel, DocumentModel


class PostgresDocumentRepository(IDocumentRepository):
    """Async-SQLAlchemy implementation of the document repository."""

    def __init__(self, session: AsyncSession):
        self.session = session

    async def _fetch(self, document_id: UUID) -> Optional[DocumentModel]:
        """Load the ORM row for *document_id*, or None when absent."""
        lookup = await self.session.execute(
            select(DocumentModel).where(DocumentModel.id == document_id)
        )
        return lookup.scalar_one_or_none()

    async def create(self, document: Document) -> Document:
        """Persist a new document and return the stored entity."""
        row = DocumentModel(
            id=document.id,
            title=document.title,
            filename=document.filename,
            file_type=document.file_type.value,
            file_size=document.file_size,
            storage_path=document.storage_path,
            department=document.department,
            status=document.status.value,
            upload_session_id=document.upload_session_id,
            uploaded_at=document.uploaded_at,
            indexed_at=document.indexed_at,
            metadata=document.metadata,
        )
        self.session.add(row)
        await self.session.commit()
        # Refresh so server-side defaults (created_at/updated_at) are visible.
        await self.session.refresh(row)
        return self._to_entity(row)

    async def get_by_id(self, document_id: UUID) -> Optional[Document]:
        """Get a document by ID, or None when it does not exist."""
        row = await self._fetch(document_id)
        return self._to_entity(row) if row else None

    async def update(self, document: Document) -> Document:
        """Update the mutable fields of an existing document.

        Raises ValueError when no row matches the entity's id.
        """
        row = await self._fetch(document.id)
        if not row:
            raise ValueError(f"Document {document.id} not found")

        row.title = document.title
        row.status = document.status.value
        row.indexed_at = document.indexed_at
        row.metadata = document.metadata
        row.updated_at = document.updated_at

        await self.session.commit()
        await self.session.refresh(row)
        return self._to_entity(row)

    async def list_by_department(
        self, department: str, skip: int = 0, limit: int = 100
    ) -> List[Document]:
        """List a department's documents, newest first, with pagination."""
        lookup = await self.session.execute(
            select(DocumentModel)
            .where(DocumentModel.department == department)
            .offset(skip)
            .limit(limit)
            .order_by(DocumentModel.created_at.desc())
        )
        return [self._to_entity(row) for row in lookup.scalars().all()]

    async def delete(self, document_id: UUID) -> bool:
        """Delete a document; returns False when it was not found."""
        row = await self._fetch(document_id)
        if not row:
            return False

        await self.session.delete(row)
        await self.session.commit()
        return True

    def _to_entity(self, model: DocumentModel) -> Document:
        """Map an ORM row back onto the domain entity."""
        return Document(
            id=model.id,
            title=model.title,
            filename=model.filename,
            file_type=DocumentType(model.file_type),
            file_size=model.file_size,
            storage_path=model.storage_path,
            department=model.department,
            status=DocumentStatus(model.status),
            upload_session_id=model.upload_session_id,
            uploaded_at=model.uploaded_at,
            indexed_at=model.indexed_at,
            metadata=model.metadata,
            created_at=model.created_at,
            updated_at=model.updated_at,
        )
+
class PostgresChunkRepository(IChunkRepository):
    """Async-SQLAlchemy implementation of the chunk repository."""

    def __init__(self, session: AsyncSession):
        self.session = session

    async def create_bulk(self, chunks: List[DocumentChunk]) -> List[DocumentChunk]:
        """Insert all chunks in one flush and return the input entities.

        The caller's entities are returned unchanged: ids are pre-assigned,
        so no refresh round-trip is needed.
        """
        rows = []
        for chunk in chunks:
            rows.append(
                DocumentChunkModel(
                    id=chunk.id,
                    document_id=chunk.document_id,
                    chunk_index=chunk.chunk_index,
                    content=chunk.content,
                    token_count=chunk.token_count,
                    vector_id=chunk.vector_id,
                    metadata=chunk.metadata,
                )
            )

        self.session.add_all(rows)
        await self.session.commit()

        return chunks

    async def get_by_document_id(self, document_id: UUID) -> List[DocumentChunk]:
        """Fetch a document's chunks ordered by their position."""
        lookup = await self.session.execute(
            select(DocumentChunkModel)
            .where(DocumentChunkModel.document_id == document_id)
            .order_by(DocumentChunkModel.chunk_index)
        )
        return [self._to_entity(row) for row in lookup.scalars().all()]

    async def delete_by_document_id(self, document_id: UUID) -> int:
        """Delete all of a document's chunks; returns how many were removed."""
        lookup = await self.session.execute(
            select(DocumentChunkModel).where(DocumentChunkModel.document_id == document_id)
        )
        rows = lookup.scalars().all()

        # Delete through the session (row by row) so ORM events still fire.
        for row in rows:
            await self.session.delete(row)

        await self.session.commit()
        return len(rows)

    def _to_entity(self, model: DocumentChunkModel) -> DocumentChunk:
        """Map an ORM row back onto the domain entity."""
        return DocumentChunk(
            id=model.id,
            document_id=model.document_id,
            chunk_index=model.chunk_index,
            content=model.content,
            token_count=model.token_count,
            vector_id=model.vector_id,
            metadata=model.metadata,
            created_at=model.created_at,
        )
app/main.py ADDED
@@ -0,0 +1,84 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
"""
FastAPI Application - Main Entry Point
"""
from contextlib import asynccontextmanager

from fastapi import FastAPI
from fastapi.middleware.cors import CORSMiddleware
from fastapi.responses import JSONResponse

from app.core.config import get_settings
from app.core.logging import get_logger, setup_logging
from app.presentation.api.v1.endpoints import router as api_router

# Settings and structured logging are configured at import time so the
# module-level FastAPI app below can already use them.
settings = get_settings()
setup_logging(settings.log_level)
logger = get_logger(__name__)


@asynccontextmanager
async def lifespan(app: FastAPI):
    """Application lifespan manager"""
    # Runs once at startup (before `yield`) and once at shutdown (after).
    logger.info("application_startup", version=settings.app_version, env=settings.environment)

    # TODO: Initialize database connection pool
    # TODO: Initialize Qdrant collection
    # TODO: Warm up embedding model

    yield

    # Cleanup
    logger.info("application_shutdown")


app = FastAPI(
    title=settings.app_name,
    version=settings.app_version,
    description="Production-ready RAG backend for corporate employee onboarding",
    lifespan=lifespan,
)

# CORS - origins and credential policy come from settings; methods/headers
# are left wide open.
app.add_middleware(
    CORSMiddleware,
    allow_origins=settings.cors_origins,
    allow_credentials=settings.cors_allow_credentials,
    allow_methods=["*"],
    allow_headers=["*"],
)

# Include routers
app.include_router(api_router)


@app.get("/")
async def root():
    """Root endpoint"""
    # Lightweight service banner; not a dependency health check.
    return {
        "service": settings.app_name,
        "version": settings.app_version,
        "status": "running",
        "environment": settings.environment,
    }


@app.exception_handler(Exception)
async def global_exception_handler(request, exc):
    """Global exception handler"""
    # Catch-all boundary: logs the failure and returns a generic 500.
    # NOTE(review): `detail` echoes str(exc) to the client, which can leak
    # internals - consider suppressing it outside debug environments.
    logger.error("unhandled_exception", error=str(exc), exc_info=True)
    return JSONResponse(
        status_code=500,
        content={"error": "Internal server error", "detail": str(exc)},
    )


if __name__ == "__main__":
    import uvicorn

    # Dev entry point; in production the ASGI server is launched externally.
    uvicorn.run(
        "app.main:app",
        host=settings.host,
        port=settings.port,
        reload=settings.debug,
        log_level=settings.log_level.lower(),
    )
app/presentation/__init__.py ADDED
@@ -0,0 +1 @@
 
 
1
+ """Presentation layer"""
app/presentation/api/__init__.py ADDED
@@ -0,0 +1 @@
 
 
1
+ """API layer"""
app/presentation/api/v1/__init__.py ADDED
@@ -0,0 +1 @@
 
 
1
+ """API v1"""
app/presentation/api/v1/endpoints.py ADDED
@@ -0,0 +1,168 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ """
2
+ Presentation Layer - API Endpoints
3
+ """
4
+ import time
5
+ from datetime import datetime
6
+ from typing import List
7
+ from uuid import uuid4
8
+
9
+ from fastapi import APIRouter, Depends, File, Form, HTTPException, UploadFile, status
10
+ from sqlalchemy.ext.asyncio import AsyncSession
11
+
12
+ from app.application.dto import DocumentUploadDTO, QueryDTO
13
+ from app.application.services import ChunkingService
14
+ from app.application.use_cases.document_indexing import DocumentIndexingUseCase
15
+ from app.application.use_cases.query_processing import QueryProcessingUseCase
16
+ from app.core.config import get_settings
17
+ from app.core.logging import get_logger, set_correlation_id
18
+ from app.core.metrics import (
19
+ active_requests,
20
+ http_request_duration_seconds,
21
+ http_requests_total,
22
+ queries_total,
23
+ )
24
+ from app.infrastructure.cache.redis_cache import RedisCache
25
+ from app.infrastructure.external.embedder import SentenceTransformerEmbedder
26
+ from app.infrastructure.external.gemini_llm import GeminiLLM
27
+ from app.infrastructure.external.prompt_builder import DefaultPromptBuilder
28
+ from app.infrastructure.external.qdrant_retriever import QdrantRetriever
29
+ from app.infrastructure.repositories.postgres_repository import (
30
+ PostgresChunkRepository,
31
+ PostgresDocumentRepository,
32
+ )
33
+ from app.presentation.api.v1.schemas import (
34
+ DocumentResponse,
35
+ HealthResponse,
36
+ QueryRequest,
37
+ QueryResponse,
38
+ SourceSchema,
39
+ )
40
+
41
+ router = APIRouter(prefix="/api/v1", tags=["api"])
42
+ logger = get_logger(__name__)
43
+ settings = get_settings()
44
+
45
+
# Dependency injection (simplified - in production use proper DI container)
async def get_query_use_case() -> QueryProcessingUseCase:
    """Build a QueryProcessingUseCase wired with concrete infrastructure.

    NOTE: every service is constructed per request for simplicity; a real
    deployment should build these once and serve them from a DI container
    or app state.
    """
    retriever = QdrantRetriever(
        url=settings.qdrant_url,
        collection_name=settings.qdrant_collection_name,
        vector_size=settings.qdrant_vector_size,
        api_key=settings.qdrant_api_key if settings.qdrant_api_key else None,
    )
    llm = GeminiLLM(api_key=settings.gemini_api_key, model_name=settings.gemini_model)
    prompt_builder = DefaultPromptBuilder()
    cache = RedisCache(redis_url=settings.redis_url)

    # For now, using a simple reranker (in production use cross-encoder)
    from app.infrastructure.external.simple_reranker import SimpleReranker

    reranker = SimpleReranker()

    # Fix: the original also built a SentenceTransformerEmbedder here, but the
    # use case never receives it - the dead instantiation loaded an embedding
    # model on every request for nothing, so it has been removed.
    return QueryProcessingUseCase(
        retriever=retriever,
        reranker=reranker,
        llm=llm,
        prompt_builder=prompt_builder,
        cache=cache,
    )
73
+
74
+
75
@router.post("/query", response_model=QueryResponse, status_code=status.HTTP_200_OK)
async def process_query(
    request: QueryRequest,
    use_case: QueryProcessingUseCase = Depends(get_query_use_case),
) -> QueryResponse:
    """Process user query through RAG pipeline"""
    started_at = time.time()
    # Tag every log line for this request with a fresh correlation id.
    set_correlation_id(str(uuid4()))

    active_requests.inc()

    try:
        logger.info("processing_query", query=request.query_text, department=request.department)

        # Translate the API schema into the application-layer DTO.
        query_dto = QueryDTO(
            query_text=request.query_text,
            department=request.department,
            user_id=request.user_id,
            session_id=request.session_id,
            top_k=request.top_k,
            temperature=request.temperature,
            max_tokens=request.max_tokens,
            filters=request.filters,
        )

        # Run the RAG pipeline.
        result = await use_case.execute(query_dto)

        # Translate the DTO back into the API response schema.
        cited_sources = [
            SourceSchema(
                title=src.title,
                content=src.content,
                relevance_score=src.relevance_score,
                document_id=src.document_id,
                chunk_index=src.chunk_index,
                metadata=src.metadata,
            )
            for src in result.sources
        ]
        response = QueryResponse(
            query_id=result.query_id,
            answer=result.answer,
            sources=cited_sources,
            confidence=result.confidence,
            processing_time_ms=result.processing_time_ms,
            tokens_used=result.tokens_used,
            model=result.model,
        )

        # Record success metrics.
        elapsed = time.time() - started_at
        http_requests_total.labels(method="POST", endpoint="/api/v1/query", status="200").inc()
        http_request_duration_seconds.labels(method="POST", endpoint="/api/v1/query").observe(
            elapsed
        )
        queries_total.labels(department=request.department, status="success").inc()

        logger.info("query_processed", query_id=response.query_id, duration_ms=int(elapsed * 1000))

        return response

    except Exception as e:
        # Record failure metrics and surface the error as an HTTP 500.
        logger.error("query_processing_error", error=str(e), exc_info=True)
        http_requests_total.labels(method="POST", endpoint="/api/v1/query", status="500").inc()
        queries_total.labels(department=request.department, status="error").inc()
        raise HTTPException(status_code=500, detail=f"Query processing failed: {str(e)}")

    finally:
        active_requests.dec()
146
+
147
+
148
+ @router.get("/health", response_model=HealthResponse)
149
+ async def health_check() -> HealthResponse:
150
+ """Health check endpoint"""
151
+ return HealthResponse(
152
+ status="healthy",
153
+ version=settings.app_version,
154
+ timestamp=datetime.utcnow(),
155
+ services={
156
+ "database": "unknown", # TODO: Add actual health checks
157
+ "redis": "unknown",
158
+ "qdrant": "unknown",
159
+ },
160
+ )
161
+
162
+
163
+ @router.get("/metrics")
164
+ async def metrics():
165
+ """Prometheus metrics endpoint"""
166
+ from prometheus_client import CONTENT_TYPE_LATEST, generate_latest
167
+
168
+ return generate_latest()
app/presentation/api/v1/schemas.py ADDED
@@ -0,0 +1,82 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
"""
Presentation Layer - Pydantic Schemas for API
"""
from datetime import datetime
from typing import List, Optional

from pydantic import BaseModel, Field


class QueryRequest(BaseModel):
    """Query request schema"""

    query_text: str = Field(..., min_length=1, max_length=5000, description="User question")
    department: str = Field(..., description="Department context")
    user_id: Optional[str] = Field(None, description="User identifier")
    session_id: Optional[str] = Field(None, description="Session identifier")
    top_k: int = Field(10, ge=1, le=50, description="Number of results to return")
    temperature: float = Field(0.7, ge=0.0, le=1.0, description="LLM temperature")
    max_tokens: int = Field(2048, ge=100, le=4096, description="Max tokens in response")
    filters: dict = Field(default_factory=dict, description="Additional filters")


class SourceSchema(BaseModel):
    """Source citation schema"""

    title: str
    content: str
    # Retrieval score attached to this chunk.
    relevance_score: float
    document_id: str
    # Position of the chunk within its source document.
    chunk_index: int
    metadata: dict = Field(default_factory=dict)


class QueryResponse(BaseModel):
    """Query response schema"""

    query_id: str
    answer: str
    # Citations backing the answer.
    sources: List[SourceSchema]
    confidence: float
    processing_time_ms: int
    tokens_used: int
    model: str


class DocumentUploadRequest(BaseModel):
    """Document upload request schema"""

    department: str = Field(..., description="Department for the document")
    metadata: dict = Field(default_factory=dict, description="Additional metadata")


class DocumentResponse(BaseModel):
    """Document response schema"""

    id: str
    title: str
    filename: str
    file_type: str
    file_size: int
    department: str
    # Indexing lifecycle state (string form of the domain enum).
    status: str
    uploaded_at: datetime
    # None until indexing has completed.
    indexed_at: Optional[datetime] = None
    metadata: dict = Field(default_factory=dict)


class HealthResponse(BaseModel):
    """Health check response"""

    status: str
    version: str
    timestamp: datetime
    # Per-dependency status map (database, redis, qdrant, ...).
    services: dict = Field(default_factory=dict)


class ErrorResponse(BaseModel):
    """Error response schema"""

    error: str
    detail: Optional[str] = None
    request_id: Optional[str] = None
requirements.txt ADDED
@@ -0,0 +1,19 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ fastapi==0.109.0
2
+ uvicorn[standard]==0.27.0
3
+ pydantic==2.5.3
4
+ pydantic-settings==2.1.0
5
+ sqlalchemy==2.0.25
6
+ asyncpg==0.29.0
7
+ redis[hiredis]==5.0.1
8
+ qdrant-client==1.7.3
9
+ sentence-transformers==2.3.1
10
+ google-generativeai==0.3.2
11
+ alembic==1.13.1
12
+ celery[redis]==5.3.6
13
+ prometheus-client==0.19.0
14
+ structlog==24.1.0
15
+ tenacity==8.2.3
16
+ httpx==0.26.0
17
+ python-multipart==0.0.6
18
+ aiofiles==23.2.1
19
+ psycopg2-binary==2.9.9