Commit 36bfe21 · Parent: 9d096d7
Claude Code - Backend Implementation Specialist (Claude Sonnet 4.5) committed

Add Docker deployment configuration for Hugging Face Spaces


- Add Dockerfile with Python 3.11 slim base
- Configure app to run on port 7860 (HF standard)
- Add .dockerignore for optimized builds
- Update README with HF Space metadata
- Include all FastAPI app files and dependencies
- Add .env.example for configuration reference

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

.dockerignore ADDED
@@ -0,0 +1,51 @@
+# Git
+.git
+.gitignore
+.gitattributes
+
+# Python
+__pycache__
+*.py[cod]
+*$py.class
+*.so
+.Python
+env/
+venv/
+ENV/
+*.egg-info/
+
+# Environment
+.env
+.env.local
+
+# IDE
+.vscode/
+.idea/
+*.swp
+*.swo
+
+# Logs
+*.log
+ingestion.log
+
+# Testing
+.pytest_cache/
+.coverage
+htmlcov/
+
+# Documentation
+README.md
+*.md
+
+# Deployment configs
+Procfile
+railway.json
+vercel.json
+
+# Scripts
+test-backend.sh
+scripts/
+
+# OS
+.DS_Store
+Thumbs.db
.env.example ADDED
@@ -0,0 +1,16 @@
+# Cohere API
+COHERE_API_KEY=your_cohere_api_key_here
+
+# Qdrant Vector Database
+QDRANT_URL=https://your-cluster.qdrant.io
+QDRANT_API_KEY=your_qdrant_api_key_here
+QDRANT_COLLECTION_NAME=physical_ai_textbook
+
+# Neon Postgres
+NEON_DATABASE_URL=postgresql://user:password@host/database
+
+# Frontend URL (for CORS)
+FRONTEND_URL=http://localhost:3000
+
+# Environment
+ENVIRONMENT=development
.gitignore ADDED
@@ -0,0 +1,28 @@
+# Backend files
+__pycache__/
+*.py[cod]
+*$py.class
+*.so
+.Python
+env/
+venv/
+ENV/
+.venv
+
+# Environment
+.env
+.env.local
+
+# IDE
+.vscode/
+.idea/
+*.swp
+*.swo
+
+# Testing
+.pytest_cache/
+.coverage
+htmlcov/
+
+# Logs
+*.log
Dockerfile ADDED
@@ -0,0 +1,20 @@
+# Use Python 3.11 slim image
+FROM python:3.11-slim
+
+# Set working directory
+WORKDIR /app
+
+# Copy requirements first for better layer caching
+COPY requirements.txt .
+
+# Install Python dependencies
+RUN pip install --no-cache-dir -r requirements.txt
+
+# Copy application code
+COPY app ./app
+
+# Expose port 7860 (Hugging Face Spaces standard)
+EXPOSE 7860
+
+# Run the application
+CMD ["uvicorn", "app.main:app", "--host", "0.0.0.0", "--port", "7860"]
README.md CHANGED
@@ -1,10 +1,184 @@
 ---
-title: Rag Chatbot
-emoji: 🏃
-colorFrom: yellow
-colorTo: green
+title: RAG Chatbot
+emoji: 🤖
+colorFrom: blue
+colorTo: purple
 sdk: docker
+app_port: 7860
 pinned: false
 ---
 
-Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
+# Physical AI RAG Backend
+
+FastAPI backend for the Physical AI textbook RAG chatbot.
+
+## Features
+
+- **RAG Pipeline**: Retrieval-Augmented Generation using the Cohere API
+- **Vector Search**: Qdrant for semantic search
+- **Conversation Storage**: Neon Postgres for chat history
+- **Text Selection Context**: Support for querying with selected text
+
+## Tech Stack
+
+- FastAPI (Python 3.11+)
+- Cohere API (embeddings + generation)
+- Qdrant Cloud (vector database)
+- Neon Serverless Postgres (conversation storage)
+
+## Setup
+
+### 1. Install Dependencies
+
+```bash
+cd backend
+pip install -r requirements.txt
+```
+
+### 2. Configure Environment
+
+Copy `.env.example` to `.env` and fill in your credentials:
+
+```bash
+cp .env.example .env
+```
+
+Required environment variables:
+- `COHERE_API_KEY`: Your Cohere API key
+- `QDRANT_URL`: Qdrant cluster URL
+- `QDRANT_API_KEY`: Qdrant API key
+- `NEON_DATABASE_URL`: Neon Postgres connection string
+- `FRONTEND_URL`: Frontend URL for CORS
+
+### 3. Set Up the Database
+
+Run the schema on your Neon database:
+
+```bash
+psql $NEON_DATABASE_URL < app/db/schema.sql
+```
+
+### 4. Ingest Content
+
+Parse the MDX files and upload them to Qdrant:
+
+```bash
+python scripts/ingest_content.py
+```
+
+This will:
+- Parse all 11 chapters from `docs/chapters/`
+- Create ~80-100 semantic chunks
+- Generate embeddings via Cohere
+- Upload them to Qdrant
+
+### 5. Run the Server
+
+```bash
+uvicorn app.main:app --reload --port 8000
+```
+
+The API will be available at `http://localhost:8000`.
+
+## API Endpoints
+
+### Chat
+
+**POST /api/chat/query**
+```json
+{
+  "query": "What is Physical AI?",
+  "conversation_id": "uuid-optional",
+  "filters": { "chapter": 1 }
+}
+```
+
+**POST /api/chat/query-with-context**
+```json
+{
+  "query": "Explain this",
+  "selected_text": "Physical AI systems...",
+  "selection_metadata": {
+    "chapter_title": "Introduction",
+    "url": "/docs/chapters/physical-ai-intro"
+  }
+}
+```
+
+**POST /api/chat/conversations**
+Create a new conversation.
+
+**GET /api/chat/conversations/{id}**
+Get a conversation with all of its messages.
+
+### Health
+
+**GET /api/health**
+Basic health check.
+
+**GET /api/health/detailed**
+Detailed health check with database status.
+
+## Deployment
+
+### Railway (Recommended)
+
+1. Create a Railway project
+2. Connect the GitHub repo
+3. Set environment variables
+4. Deploy command: `uvicorn app.main:app --host 0.0.0.0 --port $PORT`
+
+### Render
+
+1. Create a new Web Service
+2. Connect the GitHub repo
+3. Build command: `pip install -r requirements.txt`
+4. Start command: `uvicorn app.main:app --host 0.0.0.0 --port $PORT`
+
+## Project Structure
+
+```
+backend/
+├── app/
+│   ├── main.py                  # FastAPI app
+│   ├── config.py                # Settings
+│   ├── models/
+│   │   ├── chat.py              # Chat models
+│   │   └── document.py          # Document models
+│   ├── services/
+│   │   ├── embeddings.py        # Cohere embeddings
+│   │   ├── generation.py        # Cohere generation
+│   │   ├── retrieval.py         # Qdrant search
+│   │   └── rag_pipeline.py      # Main RAG logic
+│   ├── db/
+│   │   ├── postgres.py          # Neon client
+│   │   ├── qdrant.py            # Qdrant client
+│   │   └── schema.sql           # Database schema
+│   └── api/
+│       └── routes/
+│           ├── chat.py          # Chat endpoints
+│           └── health.py        # Health endpoints
+├── scripts/
+│   └── ingest_content.py        # Content ingestion
+└── requirements.txt
+```
+
+## Development
+
+Run with auto-reload:
+```bash
+uvicorn app.main:app --reload
+```
+
+View the API docs:
+- Swagger UI: http://localhost:8000/docs
+- ReDoc: http://localhost:8000/redoc
+
+## Cost Estimate
+
+- Cohere: ~$5-10/month (moderate usage)
+- Qdrant Cloud: Free (1 GB tier)
+- Neon Postgres: Free tier
+- Railway: Free (500 hours/month)
+
+**Total: ~$5-10/month**
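The JSON bodies shown under API Endpoints can be built from plain dictionaries; a minimal sketch (not part of this commit, field names taken from the request models, values purely illustrative):

```python
import json

# Compose a body for POST /api/chat/query. conversation_id is optional
# and omitted here; "filters" restricts retrieval to chapter 1.
payload = {
    "query": "What is Physical AI?",
    "filters": {"chapter": 1},
}
body = json.dumps(payload)
print(body)
```

The resulting string can be sent with any HTTP client (e.g. `curl -d "$body" -H "Content-Type: application/json" http://localhost:8000/api/chat/query`, assuming the server from step 5 is running).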
app/__init__.py ADDED
File without changes
app/api/__init__.py ADDED
File without changes
app/api/routes/__init__.py ADDED
File without changes
app/api/routes/chat.py ADDED
@@ -0,0 +1,72 @@
+from fastapi import APIRouter, HTTPException
+from uuid import UUID
+import logging
+import traceback
+from app.models.chat import (
+    ChatQuery,
+    ChatQueryWithContext,
+    ChatResponse,
+    Conversation
+)
+from app.services.rag_pipeline import RAGPipeline
+from app.db.postgres import PostgresDB
+
+router = APIRouter()
+rag_pipeline = RAGPipeline()
+db = PostgresDB()
+logger = logging.getLogger(__name__)
+
+
+@router.post("/query", response_model=ChatResponse)
+async def query_chat(request: ChatQuery):
+    """Process a chat query."""
+    try:
+        response = rag_pipeline.process_query(
+            query=request.query,
+            conversation_id=request.conversation_id,
+            filters=request.filters
+        )
+        return response
+    except Exception as e:
+        logger.error(f"Error processing query: {str(e)}")
+        logger.error(traceback.format_exc())
+        raise HTTPException(status_code=500, detail=f"Error processing query: {str(e)}")
+
+
+@router.post("/query-with-context", response_model=ChatResponse)
+async def query_with_context(request: ChatQueryWithContext):
+    """Process a chat query with selected text context."""
+    try:
+        response = rag_pipeline.process_query(
+            query=request.query,
+            conversation_id=request.conversation_id,
+            selected_text=request.selected_text,
+            filters=request.filters
+        )
+        return response
+    except Exception as e:
+        raise HTTPException(status_code=500, detail=f"Error processing query: {str(e)}")
+
+
+@router.post("/conversations", response_model=dict)
+async def create_conversation():
+    """Create a new conversation."""
+    try:
+        conversation_id = db.create_conversation()
+        return {"conversation_id": conversation_id}
+    except Exception as e:
+        raise HTTPException(status_code=500, detail=f"Error creating conversation: {str(e)}")
+
+
+@router.get("/conversations/{conversation_id}", response_model=Conversation)
+async def get_conversation(conversation_id: UUID):
+    """Get a conversation with all its messages."""
+    try:
+        conversation = db.get_conversation(conversation_id)
+        if not conversation:
+            raise HTTPException(status_code=404, detail="Conversation not found")
+        return conversation
+    except HTTPException:
+        raise
+    except Exception as e:
+        raise HTTPException(status_code=500, detail=f"Error retrieving conversation: {str(e)}")
app/api/routes/health.py ADDED
@@ -0,0 +1,60 @@
+from fastapi import APIRouter, HTTPException
+from app.db.qdrant import QdrantDB
+from app.db.postgres import PostgresDB
+
+router = APIRouter()
+
+
+@router.get("/health")
+async def health_check():
+    """Health check endpoint."""
+    return {
+        "status": "healthy",
+        "service": "Physical AI RAG Backend"
+    }
+
+
+@router.get("/health/detailed")
+async def detailed_health_check():
+    """Detailed health check including database connections."""
+    health_status = {
+        "status": "healthy",
+        "service": "Physical AI RAG Backend",
+        "components": {}
+    }
+
+    # Check Qdrant
+    try:
+        qdrant = QdrantDB()
+        collection_info = qdrant.get_collection_info()
+        health_status["components"]["qdrant"] = {
+            "status": "healthy",
+            "collection": collection_info.dict()
+        }
+    except Exception as e:
+        health_status["status"] = "degraded"
+        health_status["components"]["qdrant"] = {
+            "status": "unhealthy",
+            "error": str(e)
+        }
+
+    # Check Postgres
+    try:
+        db = PostgresDB()
+        with db.get_connection() as conn:
+            with conn.cursor() as cur:
+                cur.execute("SELECT 1")
+        health_status["components"]["postgres"] = {
+            "status": "healthy"
+        }
+    except Exception as e:
+        health_status["status"] = "degraded"
+        health_status["components"]["postgres"] = {
+            "status": "unhealthy",
+            "error": str(e)
+        }
+
+    if health_status["status"] != "healthy":
+        raise HTTPException(status_code=503, detail=health_status)
+
+    return health_status
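The degraded/healthy decision in `detailed_health_check` reduces to a fold over component states; a standalone sketch of just that aggregation rule (illustrative, not part of the commit):

```python
def overall_status(components: dict) -> str:
    """Any unhealthy component downgrades the whole service to 'degraded',
    mirroring the per-component checks in /api/health/detailed."""
    return "healthy" if all(s == "healthy" for s in components.values()) else "degraded"

print(overall_status({"qdrant": "healthy", "postgres": "healthy"}))
print(overall_status({"qdrant": "healthy", "postgres": "unhealthy"}))
```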
app/config.py ADDED
@@ -0,0 +1,35 @@
+from pydantic_settings import BaseSettings
+from functools import lru_cache
+
+
+class Settings(BaseSettings):
+    """Application settings loaded from environment variables."""
+
+    # Cohere API
+    cohere_api_key: str
+    cohere_embed_model: str = "embed-english-v3.0"
+    cohere_generation_model: str = "command-r-08-2024"
+
+    # Qdrant
+    qdrant_url: str
+    qdrant_api_key: str
+    qdrant_collection_name: str = "physical_ai_textbook"
+
+    # Neon Postgres
+    neon_database_url: str
+
+    # Frontend
+    frontend_url: str = "http://localhost:3000"
+
+    # Application
+    environment: str = "development"
+
+    class Config:
+        env_file = ".env"
+        case_sensitive = False
+
+
+@lru_cache()
+def get_settings() -> Settings:
+    """Get cached settings instance."""
+    return Settings()
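Under the hood, `BaseSettings` resolves each field case-insensitively from the environment (or `.env`), falling back to the declared default. A stdlib-only stand-in for that lookup, with a dummy value for the required key (illustrative only):

```python
import os

# "dummy-key-for-illustration" is a placeholder, not a real credential.
os.environ["COHERE_API_KEY"] = "dummy-key-for-illustration"

cohere_api_key = os.environ["COHERE_API_KEY"]  # required: no default, KeyError if unset
embed_model = os.environ.get("COHERE_EMBED_MODEL", "embed-english-v3.0")  # optional, defaulted

print(embed_model)
```

pydantic-settings adds type validation and `.env` parsing on top of this; the lookup order is the same.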
app/db/__init__.py ADDED
File without changes
app/db/postgres.py ADDED
@@ -0,0 +1,124 @@
+import psycopg2
+from psycopg2.extras import RealDictCursor, Json
+from contextlib import contextmanager
+from typing import Optional, List, Dict, Any
+from uuid import UUID, uuid4
+from app.config import get_settings
+
+
+class PostgresDB:
+    """PostgreSQL database client for conversation storage."""
+
+    def __init__(self):
+        self.settings = get_settings()
+        self.connection_string = self.settings.neon_database_url
+
+    @contextmanager
+    def get_connection(self):
+        """Context manager for database connections."""
+        conn = psycopg2.connect(self.connection_string)
+        try:
+            yield conn
+            conn.commit()
+        except Exception:
+            conn.rollback()
+            raise
+        finally:
+            conn.close()
+
+    def create_conversation(self, metadata: Optional[Dict[str, Any]] = None) -> UUID:
+        """Create a new conversation."""
+        conversation_id = uuid4()
+        with self.get_connection() as conn:
+            with conn.cursor() as cur:
+                cur.execute(
+                    """
+                    INSERT INTO conversations (id, metadata)
+                    VALUES (%s, %s)
+                    RETURNING id
+                    """,
+                    (str(conversation_id), Json(metadata or {}))
+                )
+                result = cur.fetchone()[0]
+                return UUID(result) if isinstance(result, str) else result
+
+    def add_message(
+        self,
+        conversation_id: UUID,
+        role: str,
+        content: str,
+        context_used: Optional[List[str]] = None,
+        metadata: Optional[Dict[str, Any]] = None
+    ) -> UUID:
+        """Add a message to a conversation."""
+        message_id = uuid4()
+        with self.get_connection() as conn:
+            with conn.cursor() as cur:
+                cur.execute(
+                    """
+                    INSERT INTO messages (id, conversation_id, role, content, context_used, metadata)
+                    VALUES (%s, %s, %s, %s, %s, %s)
+                    RETURNING id
+                    """,
+                    (
+                        str(message_id),
+                        str(conversation_id),
+                        role,
+                        content,
+                        context_used or [],
+                        Json(metadata or {})
+                    )
+                )
+                result = cur.fetchone()[0]
+                return UUID(result) if isinstance(result, str) else result
+
+    def get_conversation(self, conversation_id: UUID) -> Optional[Dict[str, Any]]:
+        """Get a conversation with all its messages."""
+        with self.get_connection() as conn:
+            with conn.cursor(cursor_factory=RealDictCursor) as cur:
+                # Get conversation
+                cur.execute(
+                    "SELECT * FROM conversations WHERE id = %s",
+                    (str(conversation_id),)
+                )
+                conversation = cur.fetchone()
+
+                if not conversation:
+                    return None
+
+                # Get messages
+                cur.execute(
+                    """
+                    SELECT * FROM messages
+                    WHERE conversation_id = %s
+                    ORDER BY created_at ASC
+                    """,
+                    (str(conversation_id),)
+                )
+                messages = cur.fetchall()
+
+                return {
+                    **dict(conversation),
+                    'messages': [dict(msg) for msg in messages]
+                }
+
+    def get_conversation_history(
+        self,
+        conversation_id: UUID,
+        limit: int = 10
+    ) -> List[Dict[str, Any]]:
+        """Get recent messages from a conversation."""
+        with self.get_connection() as conn:
+            with conn.cursor(cursor_factory=RealDictCursor) as cur:
+                cur.execute(
+                    """
+                    SELECT * FROM messages
+                    WHERE conversation_id = %s
+                    ORDER BY created_at DESC
+                    LIMIT %s
+                    """,
+                    (str(conversation_id), limit)
+                )
+                messages = cur.fetchall()
+                return [dict(msg) for msg in reversed(messages)]
app/db/qdrant.py ADDED
@@ -0,0 +1,96 @@
+from qdrant_client import QdrantClient
+from qdrant_client.models import Distance, VectorParams, PointStruct, Filter, FieldCondition, MatchValue
+from typing import List, Dict, Any, Optional
+from app.config import get_settings
+from app.models.document import SearchResult
+
+
+class QdrantDB:
+    """Qdrant vector database client."""
+
+    def __init__(self):
+        self.settings = get_settings()
+        self.client = QdrantClient(
+            url=self.settings.qdrant_url,
+            api_key=self.settings.qdrant_api_key
+        )
+        self.collection_name = self.settings.qdrant_collection_name
+
+    def create_collection(self, vector_size: int = 1024):
+        """Create the collection if it doesn't exist."""
+        try:
+            self.client.get_collection(self.collection_name)
+            print(f"Collection '{self.collection_name}' already exists")
+        except Exception:
+            self.client.create_collection(
+                collection_name=self.collection_name,
+                vectors_config=VectorParams(
+                    size=vector_size,
+                    distance=Distance.COSINE
+                )
+            )
+            print(f"Created collection '{self.collection_name}'")
+
+    def upsert_chunks(self, chunks: List[Dict[str, Any]], vectors: List[List[float]]):
+        """Insert or update document chunks with their embeddings."""
+        points = [
+            PointStruct(
+                id=chunk['chunk_id'],
+                vector=vector,
+                payload=chunk
+            )
+            for chunk, vector in zip(chunks, vectors)
+        ]
+
+        self.client.upsert(
+            collection_name=self.collection_name,
+            points=points
+        )
+
+    def search(
+        self,
+        query_vector: List[float],
+        limit: int = 5,
+        filters: Optional[Dict[str, Any]] = None
+    ) -> List[SearchResult]:
+        """Search for similar chunks."""
+        # Build filter if provided
+        search_filter = None
+        if filters:
+            conditions = []
+            if 'chapter' in filters:
+                conditions.append(
+                    FieldCondition(
+                        key="chapter_number",
+                        match=MatchValue(value=filters['chapter'])
+                    )
+                )
+            if conditions:
+                search_filter = Filter(must=conditions)
+
+        # Perform search using query_points
+        results = self.client.query_points(
+            collection_name=self.collection_name,
+            query=query_vector,
+            limit=limit,
+            query_filter=search_filter
+        ).points
+
+        # Convert to SearchResult models
+        return [
+            SearchResult(
+                chunk_id=result.payload['chunk_id'],
+                chapter_number=result.payload['chapter_number'],
+                chapter_title=result.payload['chapter_title'],
+                section_title=result.payload['section_title'],
+                content=result.payload['content'],
+                content_type=result.payload['content_type'],
+                url=result.payload['url'],
+                score=result.score
+            )
+            for result in results
+        ]
+
+    def get_collection_info(self):
+        """Get information about the collection (a CollectionInfo model)."""
+        return self.client.get_collection(self.collection_name)
app/db/schema.sql ADDED
@@ -0,0 +1,22 @@
+-- Conversations table
+CREATE TABLE IF NOT EXISTS conversations (
+    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
+    created_at TIMESTAMP WITH TIME ZONE DEFAULT NOW(),
+    metadata JSONB DEFAULT '{}'::jsonb
+);
+
+-- Messages table
+CREATE TABLE IF NOT EXISTS messages (
+    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
+    conversation_id UUID REFERENCES conversations(id) ON DELETE CASCADE,
+    role VARCHAR(20) NOT NULL CHECK (role IN ('user', 'assistant')),
+    content TEXT NOT NULL,
+    context_used TEXT[],
+    created_at TIMESTAMP WITH TIME ZONE DEFAULT NOW(),
+    metadata JSONB DEFAULT '{}'::jsonb
+);
+
+-- Indexes for performance
+CREATE INDEX IF NOT EXISTS idx_messages_conversation_id ON messages(conversation_id);
+CREATE INDEX IF NOT EXISTS idx_messages_created_at ON messages(created_at DESC);
+CREATE INDEX IF NOT EXISTS idx_conversations_created_at ON conversations(created_at DESC);
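The `ON DELETE CASCADE` on `messages.conversation_id` means deleting a conversation also removes its messages. A toy demonstration of that relationship using SQLite in memory (a stand-in for Postgres; the UUID, JSONB, and array columns are simplified to TEXT, so this is a sketch of the foreign-key behavior only):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite needs this for FK enforcement
conn.execute("CREATE TABLE conversations (id TEXT PRIMARY KEY)")
conn.execute("""
    CREATE TABLE messages (
        id TEXT PRIMARY KEY,
        conversation_id TEXT REFERENCES conversations(id) ON DELETE CASCADE,
        role TEXT NOT NULL CHECK (role IN ('user', 'assistant')),
        content TEXT NOT NULL
    )
""")
conn.execute("INSERT INTO conversations VALUES ('c1')")
conn.execute("INSERT INTO messages VALUES ('m1', 'c1', 'user', 'hi')")

# Deleting the conversation cascades to its messages.
conn.execute("DELETE FROM conversations WHERE id = 'c1'")
remaining = conn.execute("SELECT COUNT(*) FROM messages").fetchone()[0]
print(remaining)
```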
app/main.py ADDED
@@ -0,0 +1,37 @@
+from fastapi import FastAPI
+from fastapi.middleware.cors import CORSMiddleware
+from app.config import get_settings
+from app.api.routes import chat, health
+
+# Initialize settings
+settings = get_settings()
+
+# Create FastAPI app
+app = FastAPI(
+    title="Physical AI RAG Backend",
+    description="RAG-powered chatbot backend for Physical AI textbook",
+    version="1.0.0"
+)
+
+# Configure CORS. Wildcard entries like "https://*.vercel.app" are not
+# pattern-matched in allow_origins, so Vercel preview deployments are
+# admitted via allow_origin_regex instead.
+app.add_middleware(
+    CORSMiddleware,
+    allow_origins=[settings.frontend_url, "http://localhost:3000"],
+    allow_origin_regex=r"https://.*\.vercel\.app",
+    allow_credentials=True,
+    allow_methods=["*"],
+    allow_headers=["*"],
+)
+
+# Include routers
+app.include_router(health.router, prefix="/api", tags=["health"])
+app.include_router(chat.router, prefix="/api/chat", tags=["chat"])
+
+
+@app.get("/")
+async def root():
+    """Root endpoint."""
+    return {
+        "message": "Physical AI RAG Backend",
+        "version": "1.0.0",
+        "docs": "/docs"
+    }
app/models/__init__.py ADDED
File without changes
app/models/chat.py ADDED
@@ -0,0 +1,56 @@
+from pydantic import BaseModel, Field
+from typing import Optional, List, Dict, Any
+from datetime import datetime
+from uuid import UUID
+
+
+class ChatQuery(BaseModel):
+    """Request model for chat queries."""
+    query: str = Field(..., min_length=1, max_length=1000)
+    conversation_id: Optional[UUID] = None
+    filters: Optional[Dict[str, Any]] = None
+
+
+class ChatQueryWithContext(BaseModel):
+    """Request model for chat queries with selected text context."""
+    query: str = Field(..., min_length=1, max_length=1000)
+    selected_text: Optional[str] = None
+    selection_metadata: Optional[Dict[str, Any]] = None
+    conversation_id: Optional[UUID] = None
+    filters: Optional[Dict[str, Any]] = None
+
+
+class SourceReference(BaseModel):
+    """Reference to a source document chunk."""
+    chapter_number: int
+    chapter_title: str
+    section_title: str
+    url: str
+    relevance_score: float
+
+
+class ChatResponse(BaseModel):
+    """Response model for chat queries."""
+    answer: str
+    sources: List[SourceReference]
+    conversation_id: UUID
+    message_id: UUID
+
+
+class Message(BaseModel):
+    """Message model."""
+    id: UUID
+    conversation_id: UUID
+    role: str
+    content: str
+    context_used: Optional[List[str]] = None
+    created_at: datetime
+    metadata: Optional[Dict[str, Any]] = None
+
+
+class Conversation(BaseModel):
+    """Conversation model."""
+    id: UUID
+    created_at: datetime
+    metadata: Optional[Dict[str, Any]] = None
+    messages: Optional[List[Message]] = None
app/models/document.py ADDED
@@ -0,0 +1,27 @@
+from pydantic import BaseModel
+from typing import Optional, List
+
+
+class DocumentChunk(BaseModel):
+    """Model for a document chunk to be indexed."""
+    chunk_id: str
+    chapter_number: int
+    chapter_title: str
+    section_title: str
+    content: str
+    content_type: str  # text, code, callout, quiz
+    url: str
+    keywords: Optional[List[str]] = None
+    word_count: int
+
+
+class SearchResult(BaseModel):
+    """Model for a search result from Qdrant."""
+    chunk_id: str
+    chapter_number: int
+    chapter_title: str
+    section_title: str
+    content: str
+    content_type: str
+    url: str
+    score: float
app/services/__init__.py ADDED
File without changes
app/services/embeddings.py ADDED
@@ -0,0 +1,39 @@
+import cohere
+from typing import List
+from app.config import get_settings
+
+
+class EmbeddingService:
+    """Service for generating embeddings using Cohere."""
+
+    def __init__(self):
+        self.settings = get_settings()
+        self.client = cohere.Client(self.settings.cohere_api_key)
+        self.model = self.settings.cohere_embed_model
+
+    def embed_text(self, text: str) -> List[float]:
+        """Generate embedding for a single text."""
+        response = self.client.embed(
+            texts=[text],
+            model=self.model,
+            input_type="search_document"
+        )
+        return response.embeddings[0]
+
+    def embed_texts(self, texts: List[str]) -> List[List[float]]:
+        """Generate embeddings for multiple texts."""
+        response = self.client.embed(
+            texts=texts,
+            model=self.model,
+            input_type="search_document"
+        )
+        return response.embeddings
+
+    def embed_query(self, query: str) -> List[float]:
+        """Generate embedding for a search query."""
+        response = self.client.embed(
+            texts=[query],
+            model=self.model,
+            input_type="search_query"
+        )
+        return response.embeddings[0]
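These embeddings are compared in Qdrant with `Distance.COSINE` (see `app/db/qdrant.py`), i.e. chunks are ranked by the cosine similarity between the query vector and each document vector. A toy stand-alone sketch of the metric itself (not part of the commit):

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 for the same direction,
    0.0 for orthogonal vectors, independent of vector magnitude."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

print(round(cosine_similarity([1.0, 0.0], [1.0, 0.0]), 3))  # same direction
print(round(cosine_similarity([1.0, 0.0], [0.0, 1.0]), 3))  # orthogonal
```

In production the vectors are the 1024-dimensional outputs of `embed-english-v3.0`, and Qdrant computes this server-side.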
app/services/generation.py ADDED
@@ -0,0 +1,79 @@
+import cohere
+from typing import List, Optional
+from app.config import get_settings
+from app.models.document import SearchResult
+
+
+class GenerationService:
+    """Service for generating responses using Cohere."""
+
+    def __init__(self):
+        self.settings = get_settings()
+        self.client = cohere.Client(self.settings.cohere_api_key)
+        self.model = self.settings.cohere_generation_model
+
+    def generate_response(
+        self,
+        query: str,
+        retrieved_chunks: List[SearchResult],
+        selected_text: Optional[str] = None,
+        conversation_history: Optional[List[dict]] = None
+    ) -> str:
+        """Generate a response based on retrieved context."""
+        # Build context from retrieved chunks
+        context_parts = []
+        for i, chunk in enumerate(retrieved_chunks, 1):
+            context_parts.append(
+                f"[Source {i}: {chunk.chapter_title} - {chunk.section_title}]\n{chunk.content}"
+            )
+        context = "\n\n".join(context_parts)
+
+        # Build prompt
+        system_prompt = """You are an AI teaching assistant for the Physical AI and Humanoid Robotics textbook.
+
+CRITICAL RULES - YOU MUST FOLLOW THESE:
+1. ALWAYS provide a direct, complete answer to the user's question
+2. NEVER ask questions back to the user (NO "Could you clarify...", NO "What specifically...", NO "Please specify...")
+3. If the question is vague, make reasonable assumptions and answer based on the most relevant information in the context
+4. If the answer is not in the context, say "I don't have information about that in the textbook" - but still DON'T ask questions
+5. Provide educational, clear, and concise answers
+6. Use technical terms appropriately and explain them when needed
+7. For code-related questions, provide relevant code snippets from the context
+
+Remember: Your job is to ANSWER, not to ask for clarification. Always give the best answer you can based on the available context."""
+
+        user_prompt = f"""Context from the textbook:
+{context}
+"""
+
+        if selected_text:
+            user_prompt += f"\nUser selected this text: \"{selected_text}\"\n"
+
+        user_prompt += f"\nQuestion: {query}\n\nAnswer based on the context above:"
+
+        # Build chat history
+        chat_history = []
+        if conversation_history:
+            for msg in conversation_history[-5:]:  # Last 5 messages for context
+                # Map roles to Cohere's expected format
+                role_mapping = {
+                    "user": "User",
+                    "assistant": "Chatbot"
+                }
+                cohere_role = role_mapping.get(msg['role'], "User")
+                chat_history.append({
+                    "role": cohere_role,
+                    "message": msg['content']
+                })
+
+        # Generate response
+        response = self.client.chat(
+            model=self.model,
+            message=user_prompt,
+            chat_history=chat_history,
+            preamble=system_prompt,
+            temperature=0.3,
+            max_tokens=1000
+        )
+
+        return response.text
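The context block that `generate_response` assembles is just a numbered, labeled concatenation of the retrieved chunks. A self-contained sketch of that formatting step, with plain dicts standing in for `SearchResult` objects and made-up chunk contents (illustrative, not from the textbook):

```python
chunks = [
    {"chapter_title": "Introduction", "section_title": "What is Physical AI?",
     "content": "Physical AI systems act in the physical world."},
    {"chapter_title": "Sensors", "section_title": "LIDAR",
     "content": "LIDAR measures distance with laser pulses."},
]

# Same format string as generate_response: a "[Source i: chapter - section]"
# header followed by the chunk body, chunks separated by blank lines.
context = "\n\n".join(
    f"[Source {i}: {c['chapter_title']} - {c['section_title']}]\n{c['content']}"
    for i, c in enumerate(chunks, 1)
)
print(context.splitlines()[0])
```

The numbered source headers let the model (and a reader of the prompt) see which chapter and section each passage came from.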
app/services/rag_pipeline.py ADDED
@@ -0,0 +1,88 @@
+from typing import Optional, Dict, Any
+from uuid import UUID
+from app.services.retrieval import RetrievalService
+from app.services.generation import GenerationService
+from app.db.postgres import PostgresDB
+from app.models.chat import ChatResponse, SourceReference
+
+
+class RAGPipeline:
+    """Main RAG pipeline orchestrating retrieval and generation."""
+
+    def __init__(self):
+        self.retrieval = RetrievalService()
+        self.generation = GenerationService()
+        self.db = PostgresDB()
+
+    def process_query(
+        self,
+        query: str,
+        conversation_id: Optional[UUID] = None,
+        selected_text: Optional[str] = None,
+        filters: Optional[Dict[str, Any]] = None
+    ) -> ChatResponse:
+        """Process a user query through the RAG pipeline."""
+
+        # Create conversation if needed
+        if not conversation_id:
+            conversation_id = self.db.create_conversation()
+
+        # Get conversation history
+        conversation_history = self.db.get_conversation_history(conversation_id)
+
+        # Retrieve relevant chunks
+        retrieved_chunks = self.retrieval.retrieve(
+            query=query,
+            limit=5,
+            filters=filters
+        )
+
+        # Generate response
+        answer = self.generation.generate_response(
+            query=query,
+            retrieved_chunks=retrieved_chunks,
+            selected_text=selected_text,
+            conversation_history=conversation_history
+        )
+
+        # Store user message
+        self.db.add_message(
+            conversation_id=conversation_id,
+            role="user",
+            content=query,
+            metadata={
+                "selected_text": selected_text,
+                "filters": filters
+            }
+        )
+
+        # Store assistant message
+        context_used = [chunk.chunk_id for chunk in retrieved_chunks]
+        message_id = self.db.add_message(
+            conversation_id=conversation_id,
+            role="assistant",
+            content=answer,
+            context_used=context_used
+        )
+
+        # Build source references with localhost URLs for development
+        sources = [
+            SourceReference(
+                chapter_number=chunk.chapter_number,
+                chapter_title=chunk.chapter_title,
+                section_title=chunk.section_title,
+                url=chunk.url.replace(
+                    "https://physical-ai-textbook.vercel.app",
+                    "http://localhost:3000"
+                ),
+                relevance_score=chunk.score
+            )
+            for chunk in retrieved_chunks
+        ]
+
+        return ChatResponse(
+            answer=answer,
+            sources=sources,
+            conversation_id=conversation_id,
+            message_id=message_id
+        )
app/services/retrieval.py ADDED
@@ -0,0 +1,31 @@
+from typing import List, Optional, Dict, Any
+from app.db.qdrant import QdrantDB
+from app.services.embeddings import EmbeddingService
+from app.models.document import SearchResult
+
+
+class RetrievalService:
+    """Service for retrieving relevant document chunks."""
+
+    def __init__(self):
+        self.qdrant = QdrantDB()
+        self.embeddings = EmbeddingService()
+
+    def retrieve(
+        self,
+        query: str,
+        limit: int = 5,
+        filters: Optional[Dict[str, Any]] = None
+    ) -> List[SearchResult]:
+        """Retrieve relevant chunks for a query."""
+        # Generate query embedding
+        query_vector = self.embeddings.embed_query(query)
+
+        # Search in Qdrant
+        results = self.qdrant.search(
+            query_vector=query_vector,
+            limit=limit,
+            filters=filters
+        )
+
+        return results
requirements.txt ADDED
@@ -0,0 +1,11 @@
+fastapi
+uvicorn
+python-dotenv
+pydantic
+pydantic-settings
+cohere
+qdrant-client
+psycopg2-binary
+sqlalchemy
+python-multipart
+httpx
+ httpx